
How does the 'in' operator work for user-defined objects in Python? What needs to be implemented in the class for the expression 'x in your_obj' to work? How can performance issues and unexpected errors be avoided?


Answer.

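The in operator in Python tests membership. For the expression x in your_obj to work on a user-defined object, the class should implement __contains__(self, item), whose result is interpreted as true or false. If __contains__ is not defined, the interpreter falls back to iterating the object: first through __iter__ and, if that is also missing, through __getitem__ with integer indices starting at 0. In the fallback, x is compared with each element in turn (by identity, then by ==), so every check is a linear scan. If none of the three methods exists, in raises TypeError.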

Example:

class MyBag:
    def __init__(self, items):
        self.items = items

    def __contains__(self, value):
        return value in self.items

bag = MyBag([1, 2, 3])
print(2 in bag)  # True
print(5 in bag)  # False

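If only __iter__ (or only __getitem__) is implemented, in still works, but every check is a linear scan, and the result depends on how iteration behaves: an iterator that never ends, or an index lookup that never raises IndexError, makes a failed check run forever.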
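The cost of the fallback can be made visible with a small sketch. CountingBag and its visited counter are hypothetical names introduced only for illustration; the class deliberately defines __iter__ and nothing else.

```python
class CountingBag:
    """Hypothetical container that defines only __iter__ (no __contains__)."""

    def __init__(self, items):
        self.items = list(items)
        self.visited = 0  # number of elements yielded so far

    def __iter__(self):
        for item in self.items:
            self.visited += 1
            yield item


bag = CountingBag(range(1000))

print(500 in bag)   # True: the scan stops at the first match
print(bag.visited)  # 501 elements were produced before the match

bag.visited = 0
print(-1 in bag)    # False: every element had to be checked
print(bag.visited)  # 1000
```

A failed lookup is the worst case, since the whole collection is traversed before in can answer False.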

Note: if the collection is huge and the check is a naive scan (for example, looping through an entire list), each lookup costs O(n). When the elements are hashable, keeping them in a set or dict gives average O(1) lookups.
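One possible way to apply this is to keep a set next to the list. The sketch below is a hypothetical variant of MyBag (the names FastBag, add, and _lookup are not from the original) and assumes every stored item is hashable.

```python
class FastBag:
    """Hypothetical variant of MyBag that keeps a set alongside the list."""

    def __init__(self, items):
        self.items = list(items)        # preserves order and duplicates
        self._lookup = set(self.items)  # assumes every item is hashable

    def add(self, value):
        # Both structures must be updated together to stay consistent.
        self.items.append(value)
        self._lookup.add(value)

    def __contains__(self, value):
        # Average O(1) hash lookup instead of an O(n) list scan.
        return value in self._lookup

    def __iter__(self):
        return iter(self.items)


bag = FastBag([1, 2, 3])
print(2 in bag)  # True
print(5 in bag)  # False
bag.add(5)
print(5 in bag)  # True
```

The trade-off is extra memory and the need to update both structures on every mutation. Testing an unhashable value (such as a list) against the set raises TypeError, so this approach fits only when the element types are known to be hashable.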

Trick question.

Is it sufficient to implement only __iter__ or only __getitem__ for the correct operation of the in operator? How will the behavior change?

Answer:

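  • Either method alone is enough for in to work. If __contains__ is not present, Python iterates the object using __iter__ (if present); otherwise it calls __getitem__ with indices 0, 1, 2, ... until IndexError is raised.
  • This fallback is a linear scan. It can loop forever if iteration never terminates (for example, __getitem__ never raises IndexError), and any other exception raised inside __iter__ or __getitem__ propagates out of the in expression.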

Example:

class Weird:
    def __getitem__(self, idx):
        if idx < 3:
            return idx
        raise IndexError

w = Weird()
print(2 in w)  # True
print(5 in w)  # False
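The lookup order described above can be checked with a small sketch. Layered and IterAndGetitem are hypothetical classes that record which hook the interpreter actually calls.

```python
class Layered:
    """Hypothetical class defining all three hooks to show which one wins."""

    def __init__(self):
        self.calls = []

    def __contains__(self, value):
        self.calls.append("__contains__")
        return value == "a"

    def __iter__(self):
        self.calls.append("__iter__")
        return iter(["a", "b"])

    def __getitem__(self, idx):
        self.calls.append("__getitem__")
        return ["a", "b"][idx]


class IterAndGetitem:
    """Same idea without __contains__: __iter__ is preferred to __getitem__."""

    def __init__(self):
        self.calls = []

    def __iter__(self):
        self.calls.append("__iter__")
        return iter(["a", "b"])

    def __getitem__(self, idx):
        self.calls.append("__getitem__")
        return ["a", "b"][idx]


layered = Layered()
print("a" in layered)  # True
print(layered.calls)   # ['__contains__']

fallback = IterAndGetitem()
print("b" in fallback)  # True
print(fallback.calls)   # ['__iter__']
```

__getitem__ is consulted only when both __contains__ and __iter__ are absent, which is the situation in the Weird example.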

Examples of real-world errors caused by overlooking these nuances.


Story

In one project, a user-defined container for storing entities overrode only __iter__ and did not implement __contains__. As a result, the in operator was slow (lags were noticeable on large collections) and also failed with puzzling errors: whenever the iterator raised an exception other than StopIteration, that exception surfaced directly from the membership check.
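A minimal sketch of this failure mode follows. FlakySource and the RuntimeError message are hypothetical stand-ins for whatever the real iterator did; the point is only that the exception comes out of the in expression itself.

```python
class FlakySource:
    """Hypothetical container whose iterator fails partway through."""

    def __iter__(self):
        yield 1
        yield 2
        # Anything other than normal exhaustion propagates to the caller.
        raise RuntimeError("backend connection lost")


source = FlakySource()

print(1 in source)  # True: found before the failure point

try:
    3 in source  # the scan reaches the failing step
except RuntimeError as exc:
    print(f"membership test failed: {exc}")
```

Whether the error appears depends on where the searched value sits, which is why such bugs can look intermittent.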


Story

For a class whose elements were calculated "on the fly" by index, the developer implemented only __getitem__. Checking x in obj with a large x resulted in very long loops and even out-of-memory failures, because in tries every index in ascending order until it encounters an IndexError.
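For a computed sequence, membership can often be answered from the formula instead of by scanning. The sketch below assumes a hypothetical Squares class whose element at index i is i * i; the real class in the story is not known.

```python
import math


class Squares:
    """Hypothetical unbounded sequence: element i is i * i, computed on demand."""

    def __getitem__(self, idx):
        if idx < 0:
            raise IndexError(idx)
        return idx * idx  # never raises IndexError for idx >= 0

    def __contains__(self, value):
        # Answer arithmetically instead of scanning indices 0, 1, 2, ...
        if not isinstance(value, int) or value < 0:
            return False
        root = math.isqrt(value)
        return root * root == value


squares = Squares()
print(16 in squares)      # True
print(5 in squares)       # False
print(10**12 in squares)  # True, without a million index lookups
```

Without __contains__, the check 5 in squares would never finish, because __getitem__ has no upper bound and so never raises IndexError.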


Story

In another project, a custom dictionary relied solely on __iter__ for in. Lookups took seconds with 100,000 keys, compared to milliseconds for the standard dict, where __contains__ is a hash lookup.
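One way to avoid this, sketched under the assumption that the custom dictionary can wrap a real dict, is to delegate __contains__ to the underlying storage. Registry is a hypothetical name; collections.abc.Mapping is the standard-library base class.

```python
from collections.abc import Mapping


class Registry(Mapping):
    """Hypothetical read-only custom dictionary backed by a real dict."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]  # raises KeyError for missing keys

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    # Explicit for clarity; the Mapping mixin would otherwise supply a
    # __contains__ that calls __getitem__ and catches KeyError.
    def __contains__(self, key):
        return key in self._data


registry = Registry({f"user{i}": i for i in range(100_000)})
print("user99999" in registry)  # True
print("missing" in registry)    # False
```

Both lookups use the hash table of the inner dict, so their cost does not grow with the number of keys in the way an __iter__-based scan does.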