In Python, the pickle and json modules are used for serialization (converting objects into a byte sequence or a string for storage or transmission).
Using pickle to store or exchange data between untrusted parties is dangerous, because unpickling can execute arbitrary code embedded in the data. json does not have this drawback: parsing it only produces basic types (dicts, lists, strings, numbers, booleans, and None).
Example:
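import pickle
import json

# Pickle (binary serialization): handles many Python types, e.g. sets.
# Note: lambdas cannot be pickled by the standard pickle module.
data = {'x': 10, 'tags': {'a', 'b'}}
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)

# JSON (text, only simple objects: dicts, lists, strings, numbers, booleans, None)
data = {'x': 10, 'y': [1, 2, 3]}
with open('data.json', 'w') as f:
    json.dump(data, f)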
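The security warning above can be made concrete with a harmless sketch. It relies on the documented `__reduce__` hook, which lets an object tell pickle which callable to invoke when the data is loaded. The class name `Payload` is hypothetical, and `eval` on a fixed arithmetic string stands in for whatever an attacker would actually run.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled class, used only for this demo."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        # A real attack would name something far more damaging here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())  # bytes an attacker could send
result = pickle.loads(blob)     # the callable runs during loading
print(result)                   # 42, not a Payload instance
```

The receiving side never asked to run `eval`; calling `pickle.loads` on the bytes was enough, which is why pickles from untrusted sources should not be loaded.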
Question: Can pickle be used to serialize and save any Python object between sessions? Why is this mechanism not recommended for saving user data?
Answer:
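No. pickle cannot serialize every object (lambdas, open files, and sockets, for example, are not picklable), and using it indiscriminately is bad practice. Besides security (loading a pickle from an untrusted source can execute arbitrary code), there is the problem of version mismatches: a pickle stores a reference to the class (its module and name) together with the instance state, so serialized objects may fail to load or behave incorrectly once Python or the class definitions have changed.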
Example:
import pickle

# Loading a pickle file written before the class structure changed
with open('old_version.pkl', 'rb') as f:
    # Raises AttributeError if the class was renamed or moved;
    # otherwise the object may silently lack attributes added later.
    obj = pickle.load(f)
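One possible alternative for user data, sketched here under assumptions rather than taken from any particular project, is to store plain fields as JSON together with an explicit schema version, so that older records can be migrated on load. The names `UserProfile`, `SCHEMA_VERSION`, `dump_profile`, and `load_profile` are hypothetical.

```python
import json

SCHEMA_VERSION = 2  # hypothetical: bumped whenever the stored fields change


class UserProfile:
    def __init__(self, name, age, email=""):
        self.name = name
        self.age = age
        self.email = email  # assumed to have been added in schema version 2


def dump_profile(profile):
    """Serialize plain fields plus an explicit version marker."""
    data = {"name": profile.name, "age": profile.age, "email": profile.email}
    return json.dumps({"version": SCHEMA_VERSION, "data": data})


def load_profile(text):
    """Rebuild a profile, upgrading records written by older code."""
    record = json.loads(text)
    data = record["data"]
    if record["version"] < 2:
        data.setdefault("email", "")  # migrate version 1 records
    return UserProfile(data["name"], data["age"], data["email"])


old_record = '{"version": 1, "data": {"name": "Ann", "age": 30}}'
old_profile = load_profile(old_record)
new_profile = load_profile(dump_profile(UserProfile("Bob", 25, "bob@example.com")))
print(old_profile.name, repr(old_profile.email))
print(new_profile.name, new_profile.email)
```

Because the file contains only data and a version number, loading it cannot run code, and a change to the class requires only an extra migration branch instead of invalidating stored records.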
History
In a large project, pickle was used to store user profiles. After a Python upgrade and changes to the classes, the stored pickles were no longer compatible with the code, which led to a system failure and data loss for most users.
In a web service, pickle was used for user sessions. An attacker submitted a crafted pickled object, and deserializing it executed the attacker's code on the server.
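An attempt to serialize functions with pickle and send them over the network failed in several environments. The standard pickle stores a function as a reference to its module and name, not its code, so lambdas cannot be pickled at all, and named functions load only on machines where the same module and a compatible Python version are available.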
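The behavior behind the last case can be checked with a short sketch. It uses only the standard library; `json.dumps` is chosen here simply as an example of an importable, named function.

```python
import json
import pickle

# A lambda has no importable name, so the standard pickle refuses it.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)

# A named, importable function is stored as a reference, not as code:
# the payload contains the function's name rather than its bytecode.
payload = pickle.dumps(json.dumps)
print(b"dumps" in payload)

# Loading resolves that name by importing the module on the receiving side,
# so it only works where the same module is available.
restored = pickle.loads(payload)
print(restored is json.dumps)
```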