In Python, variable scope resolution is performed statically during the compilation phase rather than dynamically during execution. When the CPython compiler encounters a function definition, it traverses the abstract syntax tree to build a symbol table that categorizes every name as local, global, free, or cell. If the compiler detects any binding operation for a name anywhere within the function body (assignment, augmented assignment, import, a for-loop target, a with or except target, or a nested def or class statement), it marks that name as local for the entire scope. This design lets the virtual machine use LOAD_FAST opcodes that index into a fixed-size array instead of performing slower hash table lookups. The optimization is fundamental to Python's function call performance, but it means a local name must be bound before it is read.
When a name is classified as local, the compiler emits LOAD_FAST bytecode instructions for reads of that name (on CPython 3.12 and later, LOAD_FAST_CHECK for reads it cannot prove are already bound). At runtime, the instruction retrieves the object reference from the corresponding index in the frame's local variable array. If the slot contains a null pointer, meaning no value has been assigned yet, the runtime raises UnboundLocalError. This occurs even if a global variable with the identical name exists, because the compiler never emitted LOAD_GLOBAL for that name. UnboundLocalError is a subclass of NameError; the more specific type signals that the name was resolved as a local that has no value, rather than a name that could not be found at all.
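The classification described above can be observed from Python itself. The following is a minimal sketch; the names limit, reads_only, binds_later, and opnames are hypothetical, chosen only for illustration. It inspects the code object's co_varnames and co_names rather than exact opcode names for local reads, since those differ between CPython versions.

```python
import dis

limit = 10  # module-level binding

def reads_only():
    return limit  # no binding in the body, so this compiles to a global lookup

def binds_later():
    value = limit  # read of a name the compiler has classified as local
    limit = 5      # this binding makes `limit` local for the whole function
    return value

def opnames(func):
    """Return the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]

# co_varnames lists local names; co_names lists global and attribute names.
print(reads_only.__code__.co_varnames, reads_only.__code__.co_names)
print(binds_later.__code__.co_varnames, binds_later.__code__.co_names)
print("LOAD_GLOBAL" in opnames(reads_only), "LOAD_GLOBAL" in opnames(binds_later))

raised = None
try:
    binds_later()
except UnboundLocalError as exc:
    raised = exc

print(type(raised).__name__, isinstance(raised, NameError))
```

Running dis.dis(binds_later) prints the full instruction listing if you want to see which local-load opcode your interpreter version uses.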
To resolve this, you must explicitly inform the compiler that the name refers to the global namespace by declaring global <variable_name>. This declaration causes the compiler to switch to LOAD_GLOBAL and STORE_GLOBAL opcodes, which dynamically look up the name in the module's global dictionary. Alternatively, restructure the code to ensure all local variables are initialized at the top of the function before any conditional logic reads them. For nested scopes, the nonlocal keyword forces the compiler to use LOAD_DEREF to access closure cells. These declarations alter the compiler's binding decision at compile time, preventing the unbound local scenario.
threshold = 100

def analyze(data):
    # The compiler sees 'threshold = ...' below and marks the name as local
    if data > threshold:  # Raises UnboundLocalError
        return "high"
    threshold = 50  # This assignment makes 'threshold' local

# Solution using 'global'
def analyze_fixed(data):
    global threshold
    if data > threshold:  # LOAD_GLOBAL succeeds
        return "high"
    threshold = 50  # Updates the global variable
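The nonlocal case works the same way for nested functions. The sketch below uses hypothetical names (make_counter, make_broken_counter, increment) to contrast a closure that declares nonlocal with one that does not.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # bind to the enclosing function's cell, not a new local
        count += 1
        return count

    return increment

def make_broken_counter():
    count = 0

    def increment():
        count += 1  # augmented assignment makes `count` local to increment
        return count

    return increment

counter = make_counter()
first = counter()
second = counter()
print(first, second)

# With nonlocal, `count` is a free variable of the inner function.
print(counter.__code__.co_freevars)

broken = make_broken_counter()
broken_error = None
try:
    broken()
except UnboundLocalError as exc:
    broken_error = exc
print(type(broken_error).__name__)
```

Without the declaration, the inner count is an unrelated local that is read before it has a value, so the first call fails.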
A data engineering team was building an ETL pipeline using Apache Airflow. They defined a default configuration dictionary CONFIG = {"batch_size": 1000} at the module level to allow easy adjustment of processing parameters. The main transformation function process_batch() initially checked if len(records) > CONFIG["batch_size"]: to determine if splitting was necessary. Later in the function, under a specific condition, the code attempted to optimize memory by reducing the batch size with CONFIG = {"batch_size": 500}. This pattern inadvertently triggered a scope conflict.
When the pipeline executed, it crashed on the first line of the function with UnboundLocalError: local variable 'CONFIG' referenced before assignment (Python 3.11 and later word this as "cannot access local variable 'CONFIG' where it is not associated with a value"). The assignment statement later in the function caused the Python compiler to treat CONFIG as a local variable for the entire function body, whether or not the branch containing it ever ran. Consequently, the comparison at the start used LOAD_FAST to read the still-empty local variable slot. This failure halted the data pipeline during a critical production run because the function failed on its first read of CONFIG, before processing any records.
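A reduced reproduction of the failure might look like the sketch below. The low_memory flag stands in for the unspecified condition in the original code, and the slicing logic is a placeholder; both are assumptions made for illustration.

```python
CONFIG = {"batch_size": 1000}

def process_batch(records, low_memory=False):
    # Fails here: CONFIG is local because of the assignment further down.
    if len(records) > CONFIG["batch_size"]:
        records = records[:CONFIG["batch_size"]]
    if low_memory:  # hypothetical condition
        CONFIG = {"batch_size": 500}  # rebinding the name makes it local
    return records

error_message = None
try:
    # low_memory is False, so the assignment never runs, yet the call still fails.
    process_batch([1, 2, 3])
except UnboundLocalError as exc:
    error_message = str(exc)

print(error_message)
```

The exact message text depends on the Python version, but it names CONFIG in every case.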
The team first considered renaming the local reassignment to local_config, creating a new dictionary for the reduced batch processing. This would avoid the shadowing issue entirely and keep the global configuration immutable. However, this approach required refactoring downstream code that expected the name CONFIG to reflect current limits. It introduced potential inconsistencies if the developer forgot to use the new variable name in subsequent logic. The cognitive overhead of tracking two variable names for the same concept made this solution less attractive.
Another option was to add global CONFIG at the start of the function, forcing the compiler to treat all references as global lookups. While this would prevent the error, the team rejected it because modifying global state during a batch process is a dangerous anti-pattern. It prevents function reentrancy and complicates unit testing significantly. Additionally, it would create race conditions if the code were ever parallelized across threads. The side effects on module-level state were deemed unacceptable for production data pipelines.
The third solution involved mutating the existing dictionary in place using CONFIG["batch_size"] = 500 rather than reassigning the variable name itself. Item assignment is a method call on the object, not a binding of the name CONFIG, so the compiler continues to treat CONFIG as a global reference. This avoids UnboundLocalError while allowing the configuration update to persist for subsequent calls. It still modifies shared module-level state, so the concerns about reentrancy and threading were not eliminated; the team accepted that as a stopgap and deemed this the best immediate fix, planning to refactor the configuration into a class instance later. The mutation approach preserved the existing API while resolving the immediate crash.
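The difference between mutation and rebinding can be checked directly. In this sketch, the low_memory flag and the boolean return value are hypothetical simplifications of the real function.

```python
CONFIG = {"batch_size": 1000}

def process_batch(records, low_memory=False):
    needs_split = len(records) > CONFIG["batch_size"]  # global lookup
    if low_memory:  # hypothetical condition
        CONFIG["batch_size"] = 500  # item assignment, not a name binding
    return needs_split

records = list(range(800))
before = process_batch(records)                   # compared against 1000
during = process_batch(records, low_memory=True)  # check runs before the mutation
after = process_batch(records)                    # compared against 500

print(before, during, after, CONFIG)
print("CONFIG" in process_batch.__code__.co_varnames)
```

Because the change persists in the shared dictionary, the third call sees the reduced limit, which is exactly the cross-call side effect the later refactor removed.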
They implemented the third solution, changing the reassignment to a mutation CONFIG["batch_size"] = 500. The pipeline resumed execution without errors, and the configuration change applied correctly to subsequent batches. Later, they refactored the code to use a Pydantic settings object injected into the function. This completely removed the dependency on module-level global variables and made the function pure and testable. The incident prompted a code review of all Airflow operators to eliminate similar shadowing patterns.
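The injected-settings design can be sketched as follows. The team's actual Pydantic model is not shown in the source, so this example substitutes a frozen standard-library dataclass as a stand-in; BatchSettings, its field, and the chunking logic are all assumptions for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class BatchSettings:
    """Stand-in for a settings object; not the team's real Pydantic model."""
    batch_size: int = 1000

def process_batch(records, settings):
    """Split records into chunks no larger than settings.batch_size."""
    size = settings.batch_size
    return [records[i:i + size] for i in range(0, len(records), size)]

default_settings = BatchSettings()
# A reduced limit is a new object; the default is never modified.
reduced_settings = replace(default_settings, batch_size=500)

records = list(range(1200))
default_chunks = process_batch(records, default_settings)
reduced_chunks = process_batch(records, reduced_settings)

print([len(chunk) for chunk in default_chunks])
print([len(chunk) for chunk in reduced_chunks])
```

Since the function reads only its arguments, a test can pass any settings object it likes without touching module state.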
Why does deleting a variable with del inside a function, followed by an attempt to read it, raise UnboundLocalError instead of falling back to the global scope?
When you execute del x on a local variable, the compiler emits DELETE_FAST, which clears x's slot in the frame's local variable array but does not change the static classification of x as local. The compiler has already generated LOAD_FAST for subsequent reads. When the interpreter executes that read, it finds the slot empty and raises UnboundLocalError rather than falling back to globals. This confirms that scope decisions are fixed at compile time and cannot change at runtime. Declaring global x does not help here, because del x would then delete the global binding itself. To read a global x after deleting a local of the same name, give the local a different name or look the global up explicitly with globals()["x"].
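A short sketch of this behavior, using hypothetical function names, shows both the failure and the explicit lookup through globals():

```python
x = "global value"

def read_after_del():
    x = "local value"
    del x  # clears the local slot; x is still classified as local
    try:
        return x
    except UnboundLocalError:
        return "unbound"

def read_global_explicitly():
    x = "local value"
    del x
    return globals()["x"]  # dictionary lookup bypasses the local classification

print(read_after_del())
print(read_global_explicitly())
```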
How do default argument expressions avoid the UnboundLocalError trap, and what does this reveal about their evaluation timing?
Default arguments are evaluated once, when the def statement executes, in the enclosing scope rather than inside the function's local scope. If you write def f(val=CONFIG["key"]): at module level, the lookup of CONFIG is compiled into the module's code, not the function's, and resolves against the module namespace at definition time. Even if the function body later assigns to CONFIG, making it local there, the default was already captured safely. This reveals that default expressions belong to whatever scope contains the def statement and are evaluated at definition time, separately from the function body's local execution. Thus, defaults avoid the UnboundLocalError that would occur if the same access happened inside the function body before assignment.
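The timing can be demonstrated with a small sketch. The dictionary contents are hypothetical values chosen so each binding is distinguishable.

```python
CONFIG = {"key": "original"}

def f(val=CONFIG["key"]):  # evaluated once, here, in the enclosing scope
    CONFIG = {"key": "shadow"}  # makes CONFIG local inside the body
    return val, CONFIG["key"]

# Rebinding the module-level name afterwards does not change the stored default.
CONFIG = {"key": "rebound"}

result = f()
print(result)
print(f.__defaults__)
```

The default lives on the function object in __defaults__, which is why neither the local assignment nor the later module-level rebinding affects it.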
Why does UnboundLocalError never occur in class bodies, and what bytecode difference enables this?
Class bodies use LOAD_NAME and STORE_NAME instead of LOAD_FAST and STORE_FAST for names bound in the class body. LOAD_NAME performs a dynamic lookup in the class namespace, then the module's global dictionary, then builtins. It does not use a pre-allocated fixed slot, so it never encounters an "unbound local" state. If a name is referenced before assignment in a class body, LOAD_NAME simply proceeds to find it in the global scope or builtins; if the name exists in none of these, the result is a plain NameError rather than UnboundLocalError. This dictionary-based approach trades the speed of function locals for the flexibility needed during class construction.
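The contrast between a class body and a function body can be sketched as follows, assuming the code runs at module level. All names here are hypothetical.

```python
label = "module"

class UsesGlobalThenRebinds:
    seen_before = label  # not yet in the class namespace, so globals are used
    label = "class"      # stored in the class namespace
    seen_after = label   # now found in the class namespace first

def function_equivalent():
    seen_before = label  # `label` is local here because of the next line
    label = "function"
    return seen_before

function_error = None
try:
    function_equivalent()
except UnboundLocalError as exc:
    function_error = exc

missing_error = None
try:
    class Missing:
        value = undefined_name  # absent from class namespace, globals, and builtins
except NameError as exc:
    missing_error = exc

print(UsesGlobalThenRebinds.seen_before, UsesGlobalThenRebinds.seen_after)
print(type(function_error).__name__, type(missing_error).__name__)
```

The same three statements succeed in the class body and fail in the function body, and a genuinely missing name in a class body surfaces as NameError.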