
Explain what generators are in Python. How do they work, what are they used for, and how do they differ from list comprehensions?


Answer.

Generators are iterators that produce the items of a sequence lazily, one at a time, instead of building the whole collection in memory. You create them either with a function that contains the yield keyword or with a generator expression, (expr for item in iterable). Calling a generator function does not run its body; it returns a generator object, and each next() call (or each loop iteration) resumes the body until the next yield. This is convenient when working with large volumes of data or potentially infinite streams.
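To make the "on the fly" behaviour concrete, here is a minimal sketch of an unbounded stream. The function name naturals is a hypothetical one chosen for illustration; itertools.islice is the standard-library tool for taking a bounded slice of an iterator.

```python
from itertools import islice


def naturals(start=0):
    """Yield start, start + 1, start + 2, ... without ever stopping."""
    n = start
    while True:
        yield n
        n += 1


# Generator function: each value is computed only when requested.
gen = naturals()
first_three = [next(gen), next(gen), next(gen)]  # [0, 1, 2]

# islice takes a bounded number of items from the unbounded stream.
next_five = list(islice(gen, 5))  # [3, 4, 5, 6, 7]

# Generator expression: the same laziness with inline syntax.
squares = (x * x for x in range(4))
squares_list = list(squares)  # [0, 1, 4, 9]
```

The generator keeps its position between requests, which is why the slice continues from 3 rather than starting again at 0.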

Key differences from list comprehensions:

  • List comprehensions ([x for x in range(10)]) create the entire list in memory immediately.
  • Generator expressions ((x for x in range(10))) produce elements one at a time, so the generator itself stays small no matter how long the sequence is.
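The memory difference can be observed roughly with sys.getsizeof. This is only a sketch: exact byte counts depend on the Python implementation and version, and getsizeof reports the container alone, not the objects it refers to.

```python
import sys

n = 1_000_000

as_list = [x for x in range(n)]  # all n elements exist right away
as_gen = (x for x in range(n))   # nothing has been computed yet

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # small, does not depend on n

print(list_size, gen_size)

# Both forms deliver the same values once they are consumed.
total_from_gen = sum(as_gen)
total_from_list = sum(as_list)
```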

When to use generators:

  • If you don't need index-based access to elements.
  • If the data is too large to fit in memory.
  • When organizing streaming data processing (e.g., reading lines from a file).
# Generator function
def counter(n):
    for i in range(n):
        yield i

for number in counter(5):
    print(number)
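For the streaming case mentioned in the list above, generators can be chained so that each line passes through every stage before the next line is read. The sketch below uses io.StringIO as a stand-in for a real file object; read_lines and non_empty are hypothetical helper names.

```python
import io


def read_lines(stream):
    """Yield lines one at a time, without the trailing newline."""
    for line in stream:
        yield line.rstrip("\n")


def non_empty(lines):
    """Pass through only lines that contain non-whitespace characters."""
    for line in lines:
        if line.strip():
            yield line


# io.StringIO stands in for a file opened with open(...).
fake_file = io.StringIO("alpha\n\nbeta\n   \ngamma\n")

# Nothing is read until the pipeline is consumed.
pipeline = non_empty(read_lines(fake_file))
result = list(pipeline)  # ['alpha', 'beta', 'gamma']
```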

Trick question.

"What is the difference between using a function with yield and a regular function that returns a list? Provide an example."

Answer: A regular function computes and returns the list immediately, occupying memory for all its elements. A function with yield returns a generator that produces elements one at a time without loading the entire sequence into memory at once.

def make_list(n):
    return [i for i in range(n)]  # Returns a list immediately, consumes a lot of memory

def make_generator(n):
    for i in range(n):
        yield i  # Will produce one element at a time
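A further difference worth mentioning in an interview is when the function body runs. The sketch below records calls in a list (the calls list is added purely for illustration) to show that a generator function's body does not start until the first value is requested.

```python
calls = []


def make_list(n):
    calls.append("make_list ran")
    return [i for i in range(n)]


def make_generator(n):
    calls.append("make_generator started")
    for i in range(n):
        yield i


items = make_list(3)         # the body runs immediately
gen = make_generator(3)      # the body has NOT run yet
calls_before = list(calls)   # ['make_list ran']

first = next(gen)            # the body now runs up to the first yield
calls_after = list(calls)    # ['make_list ran', 'make_generator started']
rest = list(gen)             # [1, 2]
```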

Examples of real errors caused by overlooking the subtleties of this topic.


Story

In a project that analyzed large logs, a list comprehension was used to extract the lines containing errors:

error_lines = [line for line in open('biglog.txt') if 'ERROR' in line]

The file was larger than 2 GB, and the application crashed with an out-of-memory (OOM) error. A generator expression, consumed one line at a time, should have been used:

error_lines = (line for line in open('biglog.txt') if 'ERROR' in line)
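One caveat: a bare open() inside a generator expression leaves closing the file to the garbage collector. A possible variant, sketched here under the assumption that a generator function is acceptable in that codebase, wraps the loop in a with block. The temporary file only stands in for biglog.txt, and its contents are made up for the demonstration.

```python
import os
import tempfile


def error_lines(path):
    """Yield lines containing 'ERROR'; the file closes when iteration ends."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if "ERROR" in line:
                yield line.rstrip("\n")


# Small temporary file standing in for the large log (made-up contents).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    log_path = tmp.name

# Counting matches never holds more than one line in memory.
error_count = sum(1 for _ in error_lines(log_path))  # 2
found = list(error_lines(log_path))

os.remove(log_path)
```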

Story

An employee wanted to analyze a short list, wrote a function with yield, but forgot that a generator is returned instead of a list:

result = my_generator_function()  # result is a generator, not a list
if len(result) > 5:  # TypeError: object of type 'generator' has no len()
    ...

Correction: wrap the call in list(), for example result = list(my_generator_function()), whenever len() or index access is needed.
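The correction can be sketched as follows. The body of my_generator_function is a hypothetical stand-in, since the story does not show it. The second option is an alternative for long sequences: itertools.islice pulls only as many items as the check requires.

```python
from itertools import islice


def my_generator_function():
    # Hypothetical stand-in for the function in the story.
    for i in range(8):
        yield i


# Option 1: materialize the values, then use len() as with any list.
result = list(my_generator_function())
has_more_than_five = len(result) > 5  # True

# Option 2: take at most 6 items, enough to answer "more than 5?".
head = list(islice(my_generator_function(), 6))
also_more_than_five = len(head) > 5  # True
```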


Story

A developer tried to iterate over the same generator twice:

numbers = (i for i in range(5))
for n in numbers:
    pass  # exhausted the generator
for n in numbers:
    print(n)  # prints nothing

A generator is single-use: once exhausted, it yields nothing more. To iterate again, create a new generator or store the values in a list.
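Two common ways to get repeatable iteration are sketched below. The names numbers and Numbers are hypothetical. The first is a factory function that builds a new generator on every call. The second is a small class whose __iter__ method returns a fresh generator each time a loop starts.

```python
def numbers():
    """Factory: each call returns a fresh generator."""
    return (i for i in range(5))


gen = numbers()
first_pass = list(gen)   # [0, 1, 2, 3, 4]
second_pass = list(gen)  # [] because gen is already exhausted

fresh_pass = list(numbers())  # [0, 1, 2, 3, 4]


class Numbers:
    """Re-iterable object: __iter__ builds a new generator each time."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return (i for i in range(self.n))


reusable = Numbers(5)
loops = [list(reusable), list(reusable)]  # both passes see all values
```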