History of the question
Early in Rust's design, the language faced a basic tension: essential data structures such as cyclic graphs and runtime-borrow-checked containers need mutation through shared references, yet this contradicts the language's foundational rule of exclusive mutable access. To resolve this without giving up zero-cost abstractions, UnsafeCell was introduced as the sole primitive that opts out of the immutability guarantee associated with shared references &T. Every safe interior mutability abstraction (Cell, RefCell, Mutex, the atomics) is built on top of it.
The problem
The Rust compiler leverages the immutability of &T to perform aggressive optimizations, such as value caching and instruction reordering, assuming the underlying memory cannot change for the reference’s lifetime. UnsafeCell signals to the compiler that its contents may mutate even when accessed through a shared reference, effectively disabling these optimizations for the enclosed data. However, this opt-out does not extend to the references derived from the raw pointer obtained via UnsafeCell::get(); the moment this pointer is converted to &mut T, the standard aliasing rules reassert with absolute rigidity.
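A minimal sketch of the opt-out, assuming a hypothetical single-threaded `Counter` type that is not part of the original discussion. It only illustrates that a write behind `&self` is permitted when the field is wrapped in `UnsafeCell`, and that a later read through the same shared reference observes the new value.

```rust
use std::cell::UnsafeCell;

/// Hypothetical counter, used only to illustrate the opt-out.
struct Counter {
    value: UnsafeCell<u64>,
}

impl Counter {
    fn new() -> Self {
        Counter { value: UnsafeCell::new(0) }
    }

    /// Mutates through `&self`. With a plain `u64` field this write would be
    /// undefined behavior, because `&Counter` would promise immutability.
    fn bump(&self) {
        // SAFETY: `Counter` is `!Sync` (inherited from `UnsafeCell`), so no
        // other thread can reach `value`, and no reference into the cell is
        // alive here: only the raw pointer is used.
        unsafe {
            let p = self.value.get();
            *p += 1;
        }
    }

    fn read(&self) -> u64 {
        // SAFETY: same argument as in `bump`; the value is copied out.
        unsafe { *self.value.get() }
    }
}

/// Reads, bumps, and reads again through one shared reference.
fn bump_between_reads() -> (u64, u64) {
    let c = Counter::new();
    let shared = &c;
    let before = shared.read();
    shared.bump();
    // The compiler may not reuse `before` here; the cell may have changed.
    let after = shared.read();
    (before, after)
}
```

Calling `bump_between_reads()` returns `(0, 1)`.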
The solution
The solution requires the programmer to uphold the invariant that any mutable reference &mut T produced from UnsafeCell's raw pointer must be the only active access path to that memory for as long as the reference is in use. This exclusivity forbids any overlapping read or write through another pointer or reference, including pointers obtained from other calls to get(). Calling get() is harmless in itself; the violation occurs when the resulting pointer is used while the mutable reference is still live. UnsafeCell does not disable the borrow checker; it transfers the responsibility for guaranteeing temporal exclusivity and preventing data races from the compiler to the developer.
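One way to keep that invariant manageable is to confine every `&mut` to a single short method body. The sketch below assumes a hypothetical single-threaded `Log` type; the names are illustrative and not taken from the original text.

```rust
use std::cell::UnsafeCell;

/// Hypothetical append-only log, shared through `&Log`.
struct Log {
    items: UnsafeCell<Vec<u32>>,
}

impl Log {
    fn new() -> Self {
        Log { items: UnsafeCell::new(Vec::new()) }
    }

    fn push(&self, x: u32) {
        // SAFETY: the exclusive borrow lives only inside this method.
        // `Log` is `!Sync`, no method hands out a reference into the vector,
        // and `Vec::<u32>::push` runs no caller-supplied code, so nothing
        // else can touch the vector while `items` is alive.
        let items: &mut Vec<u32> = unsafe { &mut *self.items.get() };
        items.push(x);
    }

    fn len(&self) -> usize {
        // SAFETY: as above; the shared borrow ends before we return.
        let items: &Vec<u32> = unsafe { &*self.items.get() };
        items.len()
    }

    // Unsound, do not write this: safe code could call it twice and hold
    // two live `&mut Vec<u32>` to the same vector.
    // fn items_mut(&self) -> &mut Vec<u32> { unsafe { &mut *self.items.get() } }
}

/// Pushes through two aliases of the same shared reference.
fn push_three() -> usize {
    let log = Log::new();
    let a = &log;
    let b = &log;
    a.push(1);
    b.push(2);
    a.push(3);
    log.len()
}
```

The design choice is that the unsafe reasoning stays local: each SAFETY comment can be checked by reading one method, because no reference ever escapes.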
Problem description
We were architecting a high-throughput metrics aggregator for a low-latency trading system where multiple threads updated counters associated with specific financial instruments. The shared map was immutable after initialization, but the metric values required frequent increments. Employing Mutex<u64> introduced unacceptable contention, while AtomicU64 proved insufficient for complex composite metric types. We required lock-free, zero-allocation updates to structs behind Arc pointers without runtime borrowing checks.
Different solutions considered
Solution 1: Sharded Mutexes
We evaluated wrapping each metric in a Mutex and distributing them across 256 shards to reduce contention. This approach offered straightforward safety and simple, maintainable code. However, profiling showed lock operations costing hundreds of nanoseconds: contended acquisitions can fall back to futex syscalls on Linux, and even uncontended ones pay for atomic read-modify-write instructions and cache-coherency traffic. That violated our strict sub-microsecond latency budget.
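For comparison, a sharded-mutex layout could look roughly like the following. This is a simplified sketch, not the evaluated implementation: `ShardedCounters`, the single `u64` per shard, and the use of `DefaultHasher` to pick a shard are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

const SHARDS: usize = 256;

/// Hypothetical sharded counter table: one mutex per shard.
struct ShardedCounters {
    shards: Vec<Mutex<u64>>,
}

impl ShardedCounters {
    fn new() -> Self {
        ShardedCounters {
            shards: (0..SHARDS).map(|_| Mutex::new(0)).collect(),
        }
    }

    /// Maps an instrument name to a shard index (assumed hashing scheme).
    fn shard_for(&self, instrument: &str) -> usize {
        let mut h = DefaultHasher::new();
        instrument.hash(&mut h);
        (h.finish() as usize) % SHARDS
    }

    fn increment(&self, instrument: &str) {
        let idx = self.shard_for(instrument);
        // The guard is dropped at the end of this statement.
        *self.shards[idx].lock().unwrap() += 1;
    }

    fn total(&self) -> u64 {
        self.shards.iter().map(|m| *m.lock().unwrap()).sum()
    }
}

/// Four threads, 1000 increments each, spread over 32 instrument names.
fn run_sharded_demo() -> u64 {
    let counters = Arc::new(ShardedCounters::new());
    let handles: Vec<_> = (0..4)
        .map(|t| {
            let c = Arc::clone(&counters);
            thread::spawn(move || {
                for i in 0..1000 {
                    c.increment(&format!("SYM{}", (t * 1000 + i) % 32));
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counters.total()
}
```

Everything here is safe code, which is the appeal of this option; the cost is one lock acquisition per update.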
Solution 2: AtomicPtr with Boxed Values
Another approach involved storing values as AtomicPtr<Metric> and utilizing compare-and-swap loops for updates. This eliminated blocking but necessitated allocating new Box instances for every increment, leading to severe memory pressure and allocator contention. Furthermore, it complicated memory reclamation, requiring hazard pointers or epoch-based garbage collection that significantly increased code complexity and audit surface area.
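The compare-and-swap loop described above can be sketched as follows. The `Metric` fields and the `BoxedMetric` wrapper are hypothetical. To stay sound without hazard pointers or epochs, this sketch deliberately leaks every replaced allocation, which makes both problems named above visible: one allocation per update, and no safe point at which to free the old value.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};
use std::thread;

/// Hypothetical composite metric.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Metric {
    count: u64,
    sum: u64,
}

struct BoxedMetric {
    ptr: AtomicPtr<Metric>,
}

impl BoxedMetric {
    fn new() -> Self {
        let first = Box::new(Metric { count: 0, sum: 0 });
        BoxedMetric { ptr: AtomicPtr::new(Box::into_raw(first)) }
    }

    fn record(&self, sample: u64) {
        loop {
            let old = self.ptr.load(Ordering::Acquire);
            // SAFETY: published pointers are never freed while `self` is
            // shared (replaced ones are leaked), so `old` is still valid.
            let cur = unsafe { *old };
            // One heap allocation per attempt: the cost described above.
            let new = Box::into_raw(Box::new(Metric {
                count: cur.count + 1,
                sum: cur.sum + sample,
            }));
            match self.ptr.compare_exchange(old, new, Ordering::AcqRel, Ordering::Acquire) {
                // `old` is intentionally leaked: another thread may still be
                // reading it, and this sketch has no reclamation scheme.
                Ok(_) => return,
                // SAFETY: `new` was never published, so we still own it.
                Err(_) => unsafe { drop(Box::from_raw(new)) },
            }
        }
    }

    fn snapshot(&self) -> Metric {
        // SAFETY: same validity argument as in `record`.
        unsafe { *self.ptr.load(Ordering::Acquire) }
    }
}

impl Drop for BoxedMetric {
    fn drop(&mut self) {
        // SAFETY: `&mut self` proves exclusive access, and the current
        // pointer came from `Box::into_raw`.
        unsafe { drop(Box::from_raw(*self.ptr.get_mut())) };
    }
}

/// Four threads, 1000 samples of value 2 each.
fn run_boxed_demo() -> Metric {
    let metric = BoxedMetric::new();
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..1000 {
                    metric.record(2);
                }
            });
        }
    });
    metric.snapshot()
}
```

A production version would have to replace the leak with a real reclamation scheme, which is where the additional complexity comes from.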
Solution 3: UnsafeCell with Cache-Line Alignment
We chose to store metrics in UnsafeCell<Metric> within cache-line-aligned structs, ensuring that threads writing to different shards never shared a cache line. Each thread obtained a raw pointer via UnsafeCell::get(), reborrowed it as &mut Metric for the duration of the update, and performed the mutation. The reborrow was justified by our sharding logic, which ensured that no other thread could access that specific slot at the same time. Because UnsafeCell<T> is not Sync, sharing such a structure across threads also requires an unsafe impl Sync on the containing type. The design therefore required unsafe blocks and a formal proof that our consistent hashing ensured no collision during concurrent access.
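A minimal sketch of this layout, under stated assumptions: a 64-byte cache line, a hypothetical `Metric` with two fields, and a fixed thread-to-slot assignment standing in for the consistent hashing described above. None of these names come from the original system.

```rust
use std::cell::UnsafeCell;
use std::thread;

/// Hypothetical composite metric.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct Metric {
    count: u64,
    sum: u64,
}

/// One slot per cache line (64 bytes is an assumption about the target CPU).
#[repr(align(64))]
struct Slot(UnsafeCell<Metric>);

struct MetricTable {
    slots: Vec<Slot>,
}

// SAFETY: the only way to mutate through `&MetricTable` is the unsafe method
// `record`, whose contract forbids concurrent access to the same slot.
unsafe impl Sync for MetricTable {}

impl MetricTable {
    fn new(n: usize) -> Self {
        MetricTable {
            slots: (0..n).map(|_| Slot(UnsafeCell::new(Metric::default()))).collect(),
        }
    }

    /// # Safety
    /// While this call runs, no other thread may read or write slot `idx`.
    unsafe fn record(&self, idx: usize, sample: u64) {
        // The exclusive borrow is justified only by the caller's guarantee.
        let m: &mut Metric = unsafe { &mut *self.slots[idx].0.get() };
        m.count += 1;
        m.sum += sample;
    }

    /// Consuming `self` proves no other access exists, so this is safe code.
    fn into_metrics(self) -> Vec<Metric> {
        self.slots.into_iter().map(|s| s.0.into_inner()).collect()
    }
}

/// Four threads, each writing only to the slot matching its own index.
fn run_table_demo() -> Vec<Metric> {
    let table = MetricTable::new(4);
    thread::scope(|s| {
        for idx in 0..4 {
            let table = &table;
            s.spawn(move || {
                for sample in 0..1000u64 {
                    // SAFETY: thread `idx` is the only one touching slot `idx`.
                    unsafe { table.record(idx, sample) };
                }
            });
        }
    });
    table.into_metrics()
}
```

Marking `record` as `unsafe fn` keeps the exclusivity obligation visible at every call site, which is where the sharding argument has to be made.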
Which solution was chosen and why
We selected Solution 3 because it provided a zero-cost abstraction over raw memory while meeting the aggressive latency requirements. The sharding guarantee acted as a manual proof of exclusive access, allowing us to use UnsafeCell without runtime synchronization overhead. We validated the design with Miri, which checks the code paths a test actually executes against Rust's aliasing model and reports data races, and with the loom concurrency model checker, which systematically explores the thread interleavings of a test to catch conflicting concurrent accesses.
Result
The implementation achieved sub-100 nanosecond update latencies with zero allocations in the hot path. However, a subtle regression emerged during a subsequent refactoring: a maintenance task iterated over all shards without respecting the implicit per-shard ownership rule (the design has no actual lock), creating two mutable references to the same metric. Miri immediately flagged this as undefined behavior during CI, reinforcing that UnsafeCell demands rigorous discipline even when the architectural design theoretically guarantees safety.
Why is it undefined behavior to hold two mutable references derived from an UnsafeCell simultaneously, even though UnsafeCell explicitly opts out of standard borrowing rules?
UnsafeCell opts out of the immutability guarantee for shared references at the type level, but it does not relax the fundamental invariant of the &mut T type itself. When you call get(), you receive a raw pointer *mut T that carries no lifetime or aliasing constraints. However, the instant you reborrow this pointer as a &mut T (for example with &mut *ptr), you assert to the compiler that this reference is exclusive for as long as it is used. Holding two such references to overlapping memory and using both, even when they come from the same UnsafeCell, violates the aliasing XOR mutability rule that underpins Rust's memory model. The result is undefined behavior regardless of how the references were constructed.
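The difference between sequential and overlapping exclusive borrows can be sketched as follows. The function name is hypothetical, and the violating pattern is left commented out so the example stays sound.

```rust
use std::cell::UnsafeCell;

fn sequential_exclusive_borrows() -> u32 {
    let cell = UnsafeCell::new(10u32);

    // Sound: each `&mut` is created, used, and abandoned before the next one.
    {
        // SAFETY: no other reference or pointer into `cell` is in use here.
        let first: &mut u32 = unsafe { &mut *cell.get() };
        *first += 1;
    }
    {
        // SAFETY: `first` is out of scope, so this borrow is exclusive.
        let second: &mut u32 = unsafe { &mut *cell.get() };
        *second *= 2;
    }

    // Undefined behavior if uncommented: `a` and `b` overlap in time.
    // let a = unsafe { &mut *cell.get() };
    // let b = unsafe { &mut *cell.get() };
    // *a += 1; // `a` is used after `b` claimed exclusive access
    // *b += 1;

    cell.into_inner()
}
```

Calling `sequential_exclusive_borrows()` returns `22`, that is `(10 + 1) * 2`.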
How does Miri detect violations of UnsafeCell invariants, and why might code pass production tests but fail under Miri?
Miri is an interpreter for Rust's mid-level IR that implements the Stacked Borrows (or optionally Tree Borrows) aliasing model, which tracks memory access permissions through abstract "tags." When you create a reference from an UnsafeCell, Miri assigns it a unique tag. Using a mutable reference after a conflicting access through a different tag has invalidated it constitutes a violation. Code often passes standard tests because the hardware simply executes whatever instructions were emitted, and the optimizer may happen not to exploit the violation in a given build. Rust has no benign data races or benign aliasing violations, however: both are undefined behavior even when the generated assembly appears to work. Miri enforces the abstract model directly, catching transgressions such as creating and reading a shared reference from the same UnsafeCell while a mutable reference is still in use. It also reports data races separately from aliasing violations.
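As a contrast to the violations described above, the sketch below shows an access pattern that stays within the aliasing model: two raw pointers obtained from the same cell are interleaved, and no reference is ever created. The function name is hypothetical. Running a test that calls it under `cargo +nightly miri test` is one way to check it, assuming the Miri component is installed.

```rust
use std::cell::UnsafeCell;

fn raw_pointer_interleaving() -> u32 {
    let cell = UnsafeCell::new(1u32);

    // Two raw pointers to the same interior. Raw pointers carry no
    // exclusivity claim, so holding both at once is allowed.
    let p = cell.get();
    let q = cell.get();

    // SAFETY: single-threaded, both pointers are valid for reads and writes,
    // and no `&` or `&mut` to the interior exists while they are used.
    unsafe {
        *p += 1;        // value is now 2
        let seen = *q;  // read through the other pointer
        *p += seen;     // value is now 4
    }

    cell.into_inner()
}
```

Calling `raw_pointer_interleaving()` returns `4`. Replacing `p` with `&mut *cell.get()` and keeping it live across the read through `q` would turn the same sequence into the kind of violation Miri reports.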
Explain why Cell<T> does not require unsafe blocks for mutation while UnsafeCell<T> does, and identify the specific safety guarantee that enables this distinction.
Cell<T> achieves interior mutability without unsafe by never handing out a reference to its contents through a shared &Cell<T>. It only moves whole values in (set, replace) or out (take for T: Default, into_inner), and copies them out (get) when T: Copy. Because no &T or &mut T to the interior can exist while the cell is shared, there is nothing to alias. The one method that does return a reference, get_mut, takes &mut self, so the borrow checker has already proven exclusivity. In addition, Cell<T> is !Sync, so its unsynchronized writes can never race across threads. UnsafeCell, conversely, provides get(), which returns a raw pointer *mut T and so allows references to be created. This flexibility is necessary for complex in-place mutations, but it shifts the burden of ensuring exclusivity and preventing data races entirely onto the programmer, necessitating unsafe blocks.
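A short sketch of the Cell API in use, with no unsafe anywhere. The function and variable names are hypothetical.

```rust
use std::cell::Cell;

fn cell_without_unsafe() -> (u32, String, String) {
    // `u32` is `Copy`, so `get` can copy the value out.
    let hits = Cell::new(0u32);
    let a = &hits;
    let b = &hits;
    a.set(a.get() + 1);
    b.set(b.get() + 1);

    // `String` is not `Copy`: the value is moved in and out as a whole,
    // and no reference to the interior is ever produced.
    let name = Cell::new(String::from("old"));
    let previous = name.replace(String::from("new"));
    let current = name.into_inner();

    (hits.get(), previous, current)
}
```

Calling `cell_without_unsafe()` returns `(2, "old", "new")` with the two strings as owned `String` values.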