Generators are iterators in Python that produce the elements of a sequence on demand, so the whole collection never has to be held in memory at once. They are created by functions that use the yield keyword or by generator expressions, written (expr for ... in ...). This makes them a good fit for large volumes of data and for potentially infinite streams.
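As a minimal sketch of the "potentially infinite stream" case, the snippet below defines a generator that never finishes and takes only the first few values from it. The name naturals is hypothetical and chosen for illustration; itertools.islice is part of the standard library.

```python
from itertools import islice


def naturals(start=0):
    """Yield start, start + 1, start + 2, ... without ever finishing."""
    n = start
    while True:
        yield n
        n += 1


# islice stops after five values, so the infinite loop inside naturals()
# is only advanced as far as the consumer asks.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Building the same sequence as a list would be impossible, since the list would never finish being constructed.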
Key differences from list comprehensions:
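List comprehensions ([x for x in range(10)]) create the entire list in memory immediately.
Generator expressions ((x for x in range(10))) produce elements one at a time, consuming much less memory.
When to use generators: when the data is large or potentially unbounded and a single pass over it is enough.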
# Generator function
def counter(n):
    for i in range(n):
        yield i

for number in counter(5):
    print(number)
"What is the difference between using a function with yield and a regular function that returns a list? Provide an example."
Answer:
A regular function computes and returns the list immediately, occupying memory for all its elements. A function with yield returns a generator that produces elements one at a time without loading the entire sequence into memory at once.
def make_list(n):
    return [i for i in range(n)]  # Returns a list immediately, consumes a lot of memory

def make_generator(n):
    for i in range(n):
        yield i  # Will produce one element at a time
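One rough way to observe the difference is to compare the size of the two returned objects. The sketch below repeats the two functions so that it runs on its own. Note that sys.getsizeof measures only the container, not the integers it refers to, so the numbers are an illustration rather than a full memory profile.

```python
import sys


def make_list(n):
    return [i for i in range(n)]


def make_generator(n):
    for i in range(n):
        yield i


n = 100_000
as_list = make_list(n)
as_gen = make_generator(n)

# The list object grows with n; the generator object stays small because
# it stores only its current state, not the values it will produce.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size > gen_size)  # True

# Both yield the same values once consumed.
print(sum(as_gen) == sum(as_list))  # True
```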
Story
In a project for analyzing large logs, list comprehensions were used to extract lines containing errors:
error_lines = [line for line in open('biglog.txt') if 'ERROR' in line]
The file exceeded 2 GB and the application crashed with an out-of-memory (OOM) error. A generator expression should have been used instead, so that lines are read and filtered one at a time:
error_lines = (line for line in open('biglog.txt') if 'ERROR' in line)
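A variant worth considering is a generator function that opens the file inside a with block, so the file is closed once iteration finishes. This is only a sketch: iter_error_lines is a hypothetical helper name, and a small temporary file stands in for biglog.txt so the example can run anywhere.

```python
import os
import tempfile


def iter_error_lines(path):
    """Yield lines containing 'ERROR' one at a time."""
    with open(path, encoding="utf-8") as log:
        for line in log:
            if "ERROR" in line:
                yield line.rstrip("\n")


# Small stand-in for the real log file (assumption for the demo).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    sample_path = tmp.name

error_count = 0
for line in iter_error_lines(sample_path):
    error_count += 1  # only the current line is held in memory
print(error_count)  # 2

os.remove(sample_path)
```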
Story
An employee wanted to analyze a short list, wrote a function with yield, but forgot that a generator is returned instead of a list:
result = my_generator_function()  # result is a generator, not a list
if len(result) > 5:  # TypeError: object of type 'generator' has no len()
    ...
Correction: wrap the result in list().
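A minimal sketch of the correction follows. The body of my_generator_function is not shown in the story, so the one below is a stand-in assumed for illustration.

```python
def my_generator_function():
    # Stand-in body; the original function is not shown in the story.
    for i in range(7):
        yield i


result = list(my_generator_function())  # materialize the generator into a list
if len(result) > 5:
    print("more than five items")

# If only the count matters, the values can be counted without storing them.
count = sum(1 for _ in my_generator_function())
print(count)  # 7
```

Wrapping in list() is reasonable here because the data is short; for large data it would bring back the memory cost that the generator avoided.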
Story
A developer tried to iterate over the same generator more than once:
numbers = (i for i in range(5))
for n in numbers:
    pass  # exhausted the generator
for n in numbers:
    print(n)  # prints nothing
A generator is single-use. You need to create a new one for reuse.
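One common way to get a fresh generator on each pass is to wrap its creation in a function and call that function whenever a new iteration is needed. In the sketch below, numbers is redefined as such a factory function; this is an illustrative choice, not the only option.

```python
def numbers():
    # Each call builds a new generator over the same range.
    return (i for i in range(5))


first_pass = [n for n in numbers()]
second_pass = [n for n in numbers()]
print(first_pass == second_pass)  # True

# A single generator object, by contrast, is empty on the second pass.
gen = numbers()
consumed = list(gen)
leftover = list(gen)
print(leftover)  # []
```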