"On-demand" computations or lazy evaluations became popular with the increase in the volume of data being processed. In Python, such mechanisms were implemented in the standard library through generators and iterators, later via the itertools function and classes capable of yielding one element at a time on request, avoiding the need to store all data in memory at once.
Typical collection structures require loading the entire result into memory. If the volume is large, the program may run out of memory and crash, or run very slowly. It is therefore important to be able to handle data as a stream, for example multi-gigabyte files or the results of API queries.
Lazy evaluation produces elements only as they are needed. In Python, this is supported by generator functions (the yield syntax), generator expressions, the built-in functions map, filter, and zip, and the itertools module. All of these rely on the iterator protocol: an object returns an iterator from __iter__, and each call to __next__ produces the next element or raises StopIteration when none remain.
Example code:
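def huge_sequence():
    # Yields squares one at a time; nothing is stored in a list.
    for i in range(1, 10**9):
        yield i * i

for val in huge_sequence():
    if val > 100:
        break  # stop early; the remaining values are never computed
    print(val)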
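The iterator protocol behind this example can be made visible by calling iter() and next() by hand. The sketch below is illustrative only: the squares() helper is a hypothetical name, and itertools.count, takewhile, and islice are used to show how an unbounded stream can be consumed in bounded pieces.

```python
from itertools import count, islice, takewhile

def squares():
    """Yield 1, 4, 9, ... without ever building a list (hypothetical helper)."""
    for i in count(1):  # count(1) is an endless lazy counter
        yield i * i

it = iter(squares())  # a generator is its own iterator
first = next(it)      # pulls exactly one value
second = next(it)     # pulls the next one

# takewhile stops pulling values as soon as the predicate fails.
small = list(takewhile(lambda v: v <= 100, squares()))

# islice takes a bounded window (positions 3, 4, 5) from the stream.
window = list(islice(squares(), 3, 6))

print(first, second, small, window)
```

Because every step pulls values only on request, the endless squares() stream never has to be materialized.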
Key features:
Do generators in Python always save memory?
Answer: No, only if the data does not actually require intermediate storage between steps. Some constructs, such as list comprehensions, create the entire list at once, while generators yield only on request. If intermediate results are still needed, the savings are lost.
Example:
squares = (x**2 for x in range(10**8))  # lazy: no values are computed yet
result = list(squares)  # materializes all 10**8 values in memory; the savings are lost
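The difference can be measured roughly with sys.getsizeof. This is a minimal sketch, assuming CPython and a much smaller, arbitrarily chosen n so that it runs quickly. Note that getsizeof reports only the container itself, not the integers it refers to, so the real footprint of the list is larger still.

```python
import sys

n = 100_000  # illustrative size, far smaller than 10**8

as_list = [x**2 for x in range(n)]  # all values exist at once
as_gen = (x**2 for x in range(n))   # nothing has been computed yet

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

# Feeding the generator straight into an aggregate keeps the pipeline lazy:
# only one value is alive at any moment.
total = sum(x**2 for x in range(n))

print(list_bytes, gen_bytes, total)
```

When only an aggregate such as a sum is needed, passing the generator expression directly to the aggregate avoids the intermediate list entirely.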
Is it true that map and filter always return lists?
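No. In Python 3, map and filter return lazy iterators rather than lists. They are not generators in the strict sense, but they behave the same way: each element is computed only when requested, which saves memory and allows data to be processed "on the fly."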
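A short sketch can confirm this laziness by recording when the mapped function actually runs. The double() helper and the calls list are hypothetical names introduced only for the demonstration.

```python
calls = []  # records every argument that double() actually receives

def double(x):
    calls.append(x)
    return x * 2

lazy = map(double, [1, 2, 3])
is_list = isinstance(lazy, list)  # map does not return a list
before = list(calls)              # snapshot: nothing has run yet

first = next(lazy)                # runs double() exactly once
after_one = list(calls)           # snapshot after one request

rest = list(lazy)                 # consumes the remaining elements

# filter is lazy in the same way; list() forces the evaluation.
evens = list(filter(lambda v: v % 2 == 0, range(10)))

print(is_list, before, first, after_one, rest, evens)
```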
Can a generator be iterated over multiple times?
No. A generator is exhausted once it has been fully iterated, and any further iteration yields nothing. If repeated passes are needed, create a new generator or store the values in a container such as a list, which can be traversed multiple times.
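The exhaustion behavior and both workarounds can be sketched as follows. The make_squares() factory is a hypothetical name used for illustration.

```python
gen = (x * x for x in range(4))
first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the generator is exhausted, so this is empty

def make_squares():
    """Return a fresh generator on every call (hypothetical factory)."""
    return (x * x for x in range(4))

# Workaround 1: create a new generator for each pass.
again = list(make_squares())

# Workaround 2: store the values once, then traverse the container freely.
stored = list(make_squares())
total, largest = sum(stored), max(stored)

print(first_pass, second_pass, again, total, largest)
```

The first workaround keeps memory use low at the cost of recomputing the values. The second trades memory for the ability to make repeated passes.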
A developer attempts to process a large log file by loading it into memory as a list of strings.
Pros:
Cons:
A generator is used instead: the file is read line by line, and each line is processed as it is received.
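A minimal sketch of this approach is shown here. The error_lines() function, the "ERROR" marker, and the sample log contents are all assumptions made for illustration; a small temporary file stands in for the large log.

```python
import os
import tempfile

def error_lines(path):
    """Yield lines containing 'ERROR', one at a time (hypothetical filter)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # a file object yields its lines lazily
            if "ERROR" in line:
                yield line.rstrip("\n")

# A tiny temporary file stands in for a multi-gigabyte log.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    log_path = tmp.name

found = list(error_lines(log_path))  # list() is used only to inspect the demo output
os.remove(log_path)

print(found)
```

In real use the caller would loop over error_lines(path) directly rather than wrapping it in list(), so that only one line is held in memory at a time.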
Pros:
Cons: