
What are iterators, generators, and the syntax of yield in Python, how are they related, and why is yield important for efficient handling of large data?


Answer.

Iterators and generators are the basis for efficient sequence handling in Python. The language aims to make working with data streams simple and to avoid holding large collections in memory unnecessarily. Iterators came first, through the iterator protocol (the __iter__ and __next__ methods); generators followed, making it possible to write simple iterators as ordinary functions built around yield.

Problem: Large volumes of data (for example, a stream from a file or a database) cannot be processed conveniently or efficiently if everything is loaded into memory at once. A regular function returns all of its results at once, and writing a custom iterator class is often too cumbersome for simple cases.
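To show what "cumbersome" means here, this is a minimal sketch of a hand-written iterator class. The name CountdownIterator is hypothetical and chosen only for illustration; the point is how much boilerplate the __iter__ / __next__ protocol needs for a very simple sequence.

```python
class CountdownIterator:
    """Hypothetical class-based iterator that counts down from n to 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.n <= 0:
            # Raising StopIteration tells the caller there are no more values.
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


result = list(CountdownIterator(3))
print(result)  # [3, 2, 1]
```

All state (here just self.n) has to be stored and updated by hand, which is the part a generator function takes care of automatically.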

Solution: The yield mechanism allows data to be generated lazily. Calling a generator function does not run its body or return a list or another collection; it returns a generator object, an iterator that computes each value only when it is requested.
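The laziness described above can be observed directly. In this sketch the function squares and the list computed are illustrative names, not part of any library; computed simply records which inputs the generator has actually processed so far.

```python
computed = []  # records which inputs have actually been processed


def squares(limit):
    """Yield the squares of 0 .. limit-1, one at a time."""
    for i in range(limit):
        computed.append(i)
        yield i * i


gen = squares(3)
before = list(computed)       # []: calling the function ran none of its body
first = next(gen)             # 0: the body ran up to the first yield
after_first = list(computed)  # [0]: only one value has been computed
rest = list(gen)              # [1, 4]: the remaining values, produced on demand
```

Nothing is computed until a consumer asks for a value with next() or a loop.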

Example code:

# Simple generator
def countdown(n):
    while n > 0:
        yield n
        n -= 1

for i in countdown(3):
    print(i)  # 3, 2, 1

Key features:

  • Memory efficiency: data is created on request, not in advance.
  • Simplicity of syntax: yield implements a full-fledged iterator in just a few lines of code.
  • Execution control: a generator pauses at each yield and keeps its local state until the next value is requested.
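As a rough illustration of the memory point, the sketch below compares a fully built list with a generator object using sys.getsizeof. The function name numbers is hypothetical. Note that getsizeof reports only the size of the container object itself (for the list, its array of references, not the integers it points to), so the real gap is larger than the numbers suggest.

```python
import sys


def numbers(limit):
    """Yield 0 .. limit-1 one at a time."""
    n = 0
    while n < limit:
        yield n
        n += 1


as_list = list(numbers(100_000))  # all values exist in memory at once
as_gen = numbers(100_000)         # no values have been produced yet

list_size = sys.getsizeof(as_list)  # grows with the number of elements
gen_size = sys.getsizeof(as_gen)    # small, regardless of how many values will be yielded

total = sum(as_gen)  # consumes the generator one value at a time
print(list_size > gen_size, total)
```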

Tricky questions.

Can you use return and yield in the same function?

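Yes. A return in a generator function ends the iteration: the generator raises StopIteration, and any value passed to return is attached to that exception as its value attribute. yield, in contrast, can be used as many times as needed.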

def example():
    yield 1
    return  # StopIteration
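One detail worth knowing: a value passed to return in a generator is carried by the StopIteration exception in its value attribute. A minimal sketch, where the string "done" is an arbitrary illustrative value:

```python
def example():
    yield 1
    return "done"  # carried by StopIteration as its .value


gen = example()
first = next(gen)  # 1

try:
    next(gen)
except StopIteration as stop:
    result = stop.value  # "done"

# A for loop or list() handles StopIteration internally,
# so the return value is not visible there.
values = list(example())  # [1]
```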

Why can't a generator be "restarted" after completion?

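A generator object runs through its function body exactly once. After it finishes (raises StopIteration) it cannot be rewound or restarted; to iterate again, call the generator function again to create a new generator.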

gen = countdown(2)
list(gen)  # [2, 1]
list(gen)  # [] (the generator is exhausted)
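When repeated iteration is needed, one common approach is to wrap the generator logic in a class whose __iter__ method is itself a generator function, so every loop gets a fresh generator. This is a sketch under that assumption; the class name ReusableCountdown is hypothetical.

```python
class ReusableCountdown:
    """Hypothetical re-iterable object: each iteration starts a new generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # __iter__ contains yield, so each call returns a new generator object.
        n = self.n
        while n > 0:
            yield n
            n -= 1


numbers = ReusableCountdown(2)
first_pass = list(numbers)   # [2, 1]
second_pass = list(numbers)  # [2, 1] again, from a fresh generator
```

The object itself is an iterable rather than an iterator, which is why it can be traversed more than once.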

What is the difference between a generator and an iterator?

A generator is a special case of an iterator. Any object that implements the __iter__ and __next__ methods is an iterator; a generator is an iterator that Python creates automatically when a function containing yield is called.
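The relationship can be checked with the abstract base classes in collections.abc. This sketch reuses the countdown function from the example above and compares a generator with a plain list iterator.

```python
from collections.abc import Generator, Iterator


def countdown(n):
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
list_iter = iter([1, 2, 3])

gen_is_iterator = isinstance(gen, Iterator)          # True
gen_is_generator = isinstance(gen, Generator)        # True
list_is_iterator = isinstance(list_iter, Iterator)   # True
list_is_generator = isinstance(list_iter, Generator) # False: no send/throw/close

# Like any iterator, a generator returns itself from iter().
same_object = iter(gen) is gen  # True
```

Every generator is an iterator, but not every iterator is a generator.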

Common mistakes and anti-patterns

  • Forgetting that generators are "one-time use": after exhaustion, they cannot be reused.
  • Confusing the workings of return and yield inside generator functions.
  • Storing all generated results in a list, which defeats the purpose of lazy evaluation.
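To illustrate the last point: when only part of a generator's output is needed, itertools.islice takes a bounded slice without materializing everything. The unbounded generator naturals below is a hypothetical example; calling list(naturals()) on it would never finish.

```python
from itertools import islice


def naturals():
    """Hypothetical unbounded generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1


# Only the first five values are ever computed.
first_five = list(islice(naturals(), 5))  # [0, 1, 2, 3, 4]
```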

Real-life example

Negative case

A developer wrote a function that loaded a million lines of a file into memory with list(fd) for analysis. The server ran out of memory as a result.

Pros:

  • Fast availability of all lines.

Cons:

  • High memory consumption.
  • Possible crashes due to lack of memory.

Positive case

Reading the file line by line with a generator (one line at a time) and analyzing the data on the fly as each line is yielded.
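A minimal sketch of this approach follows. The helper names read_lines and count_errors, the "ERROR" marker, and the temporary demo file are all assumptions made for illustration; a real analysis would substitute its own file and logic.

```python
import os
import tempfile


def read_lines(path):
    """Yield lines of a text file one at a time, without the trailing newline."""
    with open(path, encoding="utf-8") as fd:
        for line in fd:  # the file object itself yields lines lazily
            yield line.rstrip("\n")


def count_errors(lines):
    """Count lines containing 'ERROR' without storing the lines."""
    return sum(1 for line in lines if "ERROR" in line)


# Demo data: a small temporary file stands in for a large log.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    path = tmp.name

error_count = count_errors(read_lines(path))  # 2
os.remove(path)
```

Only one line is held in memory at a time, so the same code works unchanged on a file far larger than available RAM.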

Pros:

  • Minimal memory usage.
  • Can work with files of any size.

Cons:

  • Cannot access already read data without additional storage.