ProgrammingPython Developer (middleware, backend, testing)

How to properly implement collection copying in Python, in which cases is copy() insufficient and why, and when is deep copy necessary?

Pass interviews with Hintsage AI assistant

Answer.

The history of the issue: In Python, copying objects (especially collections — lists, dictionaries) is critical: simple assignment to a variable creates a new reference, not a copy. Special methods copy() and deepcopy() were introduced to avoid unwanted "side effects" when sharing structures.

Problem: When dealing with nested collections (list of lists, dictionary within a dictionary), simple copy() replicates only the container itself but not the inner elements. This can lead to hard-to-trace bugs: changing a nested element reflects in all "copies".

Solution: copy.copy() creates a shallow copy — a new top-level container, but the nested objects remain the same. copy.deepcopy() recursively copies all nested objects.

Code example:

import copy lst = [[1, 2], [3, 4]] shallow = copy.copy(lst) deep = copy.deepcopy(lst) lst[0][0] = 10 print(shallow) # [[10, 2], [3, 4]] — the nested object changed! print(deep) # [[1, 2], [3, 4]] — deepcopy preserved the original state

Key features:

  • Simple assignment copies only the reference, no duplication
  • .copy() (or copy.copy()) copies the "outer" container but not the nested objects
  • .deepcopy() is used for complete independence of copies

Trick questions.

Is it sometimes enough to write lst2 = lst[:] for copying a list?

lst2 = lst[:] creates a shallow copy of the list, but nested objects (e.g., lists inside the list) will still be the same. For flat lists — this is sufficient; for nested structures — it is not.

Example:

lst = [[1], [2]] lst2 = lst[:] lst[0][0] = 99 print(lst2) # [[99], [2]] — the nested element changed in both "copies"

Does the .copy() method work the same for all collections?

No. For example, dict.copy() works as shallow copying, list.copy() appeared only in Python 3.3, for set — there is set.copy(). For user-defined objects, support for .copy() depends on the implemented method.


Can deepcopy be used everywhere? Is it safe and efficient?

No. deepcopy is an expensive operation: it may cause performance issues, especially on large structures, and it fails on non-copyable/capturing or non-standard objects. Use deepcopy only when a fully independent, recursive copy is truly needed.


Common mistakes and anti-patterns

  • Using regular assignment or .copy() instead of deepcopy for nested structures
  • Applying deepcopy to all objects successively (slows down the program)
  • Attempting to copy objects that do not support copying (e.g., files, streams)

Real-life example

Negative case

In tests, copying a "dictionary of dictionaries" was done via dict.copy(). Because of this, changes in the nested structure for one user unexpectedly affected other tests (data mutated globally).

Pros:

  • Fast
  • Simple

Cons:

  • Non-obvious bugs, breaking environment isolation

Positive case

Adopted copy.deepcopy(), documented the nesting level and structure specifics of each object, avoided deepcopy for large volumes, where possible implemented "manual" partial copying.

Pros:

  • Transparency of the environment
  • Independence of objects

Cons:

  • Requires understanding of the structure
  • Much higher memory and CPU usage for large structures