Background
The zip() function was added to Python as a convenient way to "zip together" several sequences: it groups corresponding elements into tuples, so the inputs behave like the columns of a table and each resulting tuple like a row.
Problem
Bugs appear when developers zip sequences of different lengths and expect the result to follow the longest one, or when they misuse unpacking in the reverse transformation. In the first case no exception is raised by default, so the extra elements are lost silently.
Solution
zip() takes any number of iterables and, in Python 3, returns an iterator of tuples, where the n-th tuple contains the n-th element from each iterable. Iteration stops as soon as the shortest iterable is exhausted.
Example code:
names = ['John', 'Anna', 'Peter']
ages = [28, 22, 35]
grouped = list(zip(names, ages))
print(grouped)  # [('John', 28), ('Anna', 22), ('Peter', 35)]
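Because zip() returns an iterator rather than a list, it can be consumed only once. The sketch below reuses the names and ages lists from the example; the variable names first_pass and second_pass are illustrative only.

```python
names = ['John', 'Anna', 'Peter']
ages = [28, 22, 35]

pairs = zip(names, ages)     # a lazy iterator; nothing is consumed yet
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # already exhausted, so this is empty

print(first_pass)   # [('John', 28), ('Anna', 22), ('Peter', 35)]
print(second_pass)  # []
```

If the pairs are needed more than once, it is usually simplest to store list(zip(...)) in a variable.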
Key features:
What happens if you pass sequences of different lengths to zip?
The result contains as many tuples as the shortest sequence has elements. The remaining elements of the longer sequences are silently ignored.
list(zip([1, 2, 3], ['a', 'b']))  # [(1, 'a'), (2, 'b')]
Can you "unzip" the result of zip? How do you get the original sequences back?
Yes, by unpacking the pairs with the asterisk: zip(*pairs). Note that the sequences come back as tuples, not lists:
pairs = [(1, "a"), (2, "b")]
numbers, letters = zip(*pairs)
print(numbers)  # (1, 2)
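The round trip can be sketched a little further. The example below shows both unpacked columns, restores the original pairs, and illustrates the empty-input edge case; restored and empty_columns are illustrative names, not part of the original example.

```python
pairs = [(1, "a"), (2, "b")]

numbers, letters = zip(*pairs)
print(numbers)  # (1, 2)
print(letters)  # ('a', 'b')

# Zipping the columns again restores the original pairs.
restored = list(zip(numbers, letters))
print(restored)  # [(1, 'a'), (2, 'b')]

# With no pairs at all, zip(*[]) yields nothing, so unpacking the result
# into two names (numbers, letters = zip(*[])) would raise ValueError.
empty_columns = list(zip(*[]))
print(empty_columns)  # []
```

When the input may be empty, it is safer to check for that case before unpacking into a fixed number of names.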
How does zip differ from itertools.zip_longest?
zip_longest from itertools continues until the longest sequence is exhausted, filling the gaps with the given fillvalue (None by default).
from itertools import zip_longest
list(zip_longest([1, 2], ['a', 'b', 'c'], fillvalue=None))  # [(1, 'a'), (2, 'b'), (None, 'c')]
We paired users with passwords via zip, but the lists turned out to have different lengths. Because zip stopped at the shorter list, several users were silently left out.
Pros:
Cons:
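Returning to the users-and-passwords case above: one way to find out which users were dropped is to pair the lists with zip_longest and a unique sentinel as fillvalue. This is a sketch under assumed data; the lists, the MISSING sentinel, and unmatched_users are hypothetical and do not come from the original project.

```python
from itertools import zip_longest

# Hypothetical data: one password is missing.
users = ["alice", "bob", "carol"]
passwords = ["pw1", "pw2"]

MISSING = object()  # unique sentinel that cannot collide with real data

# Collect every user for whom no password was available.
unmatched_users = [
    user
    for user, password in zip_longest(users, passwords, fillvalue=MISSING)
    if password is MISSING
]
print(unmatched_users)  # ['carol']
```

A sentinel object is used instead of None so that a legitimately stored None would not be mistaken for a gap.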
To combine test results and student names, we used zip_longest with fillvalue="n/a", preserving data for all participants.
Pros:
Cons:
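The students-and-results case might look roughly like the sketch below. The names and scores are invented for illustration; only the use of zip_longest with fillvalue="n/a" comes from the description above.

```python
from itertools import zip_longest

# Hypothetical data: the last student has no recorded result.
students = ["Ivan", "Maria", "Oleg"]
results = [87, 92]

report = list(zip_longest(students, results, fillvalue="n/a"))
print(report)  # [('Ivan', 87), ('Maria', 92), ('Oleg', 'n/a')]
```

The same fillvalue is applied to whichever input is shorter, so if there were more results than students, the missing names would also appear as "n/a".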