Answer to the question.

History of the question

The UnwindSafe trait was stabilized in Rust 1.9 alongside std::panic::catch_unwind to address exception safety concerns familiar from C++ and other languages with exception handling. In Rust, panics trigger stack unwinding that runs Drop implementations, but this does not ensure that data structures remain in consistent states if a panic interrupts a logical operation. The trait marks types that can cross a catch_unwind boundary without making it easy to observe broken invariants afterward. It guards against logic errors rather than undefined behavior: memory safety never depends on it.

The problem

When a mutable reference (&mut T), or a shared reference to a type with interior mutability (such as &RefCell<T> or &Cell<T>), crosses a catch_unwind boundary, a panic can leave the referenced data in a logically inconsistent state that remains visible after the panic is caught. For example, if a panic occurs while a RefMut guard is live, unwinding drops the guard and resets the RefCell's borrow flag, so the cell can be borrowed again. The data behind it, however, may have been only partially updated. RefCell records nothing about the interrupted operation, so code that resumes after catch_unwind can read the half-updated value with no warning. This is an exception-safety violation: the value violates its own invariants, and later operations may panic or behave incorrectly far from the original failure.
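A minimal sketch of this failure mode follows. The Ledger type, its invariant, and the panic_mid_update function are hypothetical names invented for illustration. The example suggests that after a caught panic the RefCell can be borrowed again, while the value inside is left half-updated.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical type with an invariant: `total` equals the sum of `items`.
struct Ledger {
    items: Vec<u32>,
    total: u32,
}

impl Ledger {
    fn add(&mut self, amount: u32) {
        self.items.push(amount);
        // Simulated failure between the two halves of the update.
        if amount > 100 {
            panic!("amount too large");
        }
        self.total += amount;
    }

    fn is_consistent(&self) -> bool {
        self.items.iter().sum::<u32>() == self.total
    }
}

/// Returns (cell can be borrowed again, invariant still holds).
fn panic_mid_update() -> (bool, bool) {
    let ledger = RefCell::new(Ledger { items: vec![5], total: 5 });

    // The closure captures &RefCell<Ledger>, so AssertUnwindSafe is required.
    let result = catch_unwind(AssertUnwindSafe(|| {
        ledger.borrow_mut().add(500);
    }));
    assert!(result.is_err());

    // Unwinding dropped the RefMut guard, so the borrow flag was reset...
    let borrowable = ledger.try_borrow_mut().is_ok();
    // ...but the data behind the cell was left half-updated.
    let consistent = ledger.borrow().is_consistent();
    (borrowable, consistent)
}

fn main() {
    let (borrowable, consistent) = panic_mid_update();
    println!("borrowable again: {}, invariant holds: {}", borrowable, consistent);
}
```

The panic message itself is printed to stderr by the default panic hook; only the two booleans matter here.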

The solution

UnwindSafe serves as a conservative marker trait: it is an auto trait implemented for most types, but explicitly opted out for &mut T and therefore for any aggregate containing one. Shared references are covered by the companion trait RefUnwindSafe: &T is UnwindSafe only if T is RefUnwindSafe, and UnsafeCell (and with it Cell and RefCell) opts out of RefUnwindSafe. Together these rules stop closures that capture &mut T or &RefCell<T> from being passed to catch_unwind unless the programmer explicitly wraps them in AssertUnwindSafe. The wrapper is ordinary safe code, not an unsafe contract: it simply records the programmer's assertion that the wrapped value will not be observed in a broken state, or that they have manually verified exception safety. This architectural choice forces an explicit opt-in to a potentially hazardous pattern, so that accidental exposure of mutable or interior-mutable state across panic boundaries is caught at compile time.

use std::panic::{catch_unwind, AssertUnwindSafe};
use std::cell::RefCell;

fn main() {
    let shared = RefCell::new(vec![1, 2, 3]);
    
    // This fails to compile: the closure captures &RefCell<_>, and &T is
    // UnwindSafe only if T: RefUnwindSafe, which RefCell is not:
    // let _ = catch_unwind(|| {
    //     let mut borrow = shared.borrow_mut();
    //     borrow.push(4);
    //     panic!("interrupted");
    // });
    
    // Explicit opt-in (no unsafe needed; AssertUnwindSafe is a safe wrapper):
    let result = catch_unwind(AssertUnwindSafe(|| {
        let mut borrow = shared.borrow_mut();
        borrow.push(4);
        panic!("interrupted");
    }));
    
    // Unwinding dropped the RefMut guard, so shared can be borrowed again,
    // but it still holds the pushed 4: partial updates survive the panic.
    println!("Recovered: {:?}", result.is_err());
}

Situation from life

Problem description

A high-performance HTTP server built with hyper needs to isolate panics in user-defined request handlers to prevent a single malformed request from terminating the entire process. The server maintains a connection pool using RefCell (for single-threaded performance) to track active database connections per thread. The architecture wraps each request handler in catch_unwind (through AssertUnwindSafe, since the handler closure captures the RefCell) to capture panics and log them gracefully. During load testing, the server encounters a panic in a handler that holds a mutable borrow of the connection pool's RefCell. When catch_unwind captures the panic, unwinding drops the RefMut guard and releases the borrow, so the pool looks usable again. The handler, however, was interrupted partway through updating the pool, so its bookkeeping no longer matches the set of live connections, and RefCell has no poisoning mechanism to flag this. Subsequent requests on the same thread borrow the pool successfully and operate on the corrupted state, causing failures that surface far from the original panic.

Solution 1: Eliminate catch_unwind and allow process termination

This approach removes the exception safety issue entirely by letting the process crash on any panic, accepting that availability is secondary to correctness in this specific context.

Pros: Completely eliminates exception safety concerns; no risk of state corruption; simple to implement.

Cons: Unacceptable for production availability; one malicious or buggy request terminates the entire service; violates reliability requirements.

Solution 2: Replace RefCell with Mutex and utilize poisoning

Replace the RefCell-based pool with Mutex<Pool> and leverage Rust's mutex poisoning detection.

Pros: Mutex detects panics in holding threads and marks itself poisoned, allowing subsequent lock attempts to detect corruption via PoisonError; standard library provides built-in safety.

Cons: Mutex introduces synchronization overhead unnecessary for single-threaded async executors; the pool must be Send if the Mutex is ever shared across threads; poisoning requires explicit handling logic to reinitialize the pool.
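The poisoning flow might look like the sketch below. Pool and poison_and_recover are hypothetical stand-ins, not the server's actual types. Note that the closure needs no AssertUnwindSafe, because Mutex<T> implements UnwindSafe and RefUnwindSafe.

```rust
use std::panic::catch_unwind;
use std::sync::Mutex;

/// Hypothetical stand-in for the connection pool.
#[derive(Debug, Default)]
struct Pool {
    active: Vec<u32>,
}

/// Runs a handler that panics while holding the lock, then recovers.
/// Returns (lock was poisoned, pool contents after recovery).
fn poison_and_recover(pool: &Mutex<Pool>) -> (bool, Vec<u32>) {
    // No AssertUnwindSafe here: a captured &Mutex<Pool> is UnwindSafe.
    let result = catch_unwind(|| {
        let mut guard = pool.lock().unwrap();
        guard.active.push(42);
        panic!("handler failed mid-update");
    });
    assert!(result.is_err());

    // The guard was dropped during unwinding, which poisoned the mutex.
    let poisoned = pool.is_poisoned();

    let guard = match pool.lock() {
        Ok(guard) => guard,
        Err(poison) => {
            // Take the guard anyway and replace the suspect state.
            let mut guard = poison.into_inner();
            *guard = Pool::default();
            guard
        }
    };
    (poisoned, guard.active.clone())
}

fn main() {
    let pool = Mutex::new(Pool { active: vec![1, 2] });
    let (poisoned, active) = poison_and_recover(&pool);
    println!("poisoned: {}, active after reset: {:?}", poisoned, active);
}
```

In this sketch the mutex stays marked as poisoned after recovery, so every later lock() takes the Err branch again. Recent Rust releases provide Mutex::clear_poison to reset the flag once the state has been repaired.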

Solution 3: Wrap handlers in AssertUnwindSafe with state validation

Keep RefCell for performance but wrap the handler in AssertUnwindSafe and implement a custom drop guard that restores the protected state to a known-good value if a panic occurs.

Pros: Retains the performance benefits of RefCell; allows panic isolation; possible to implement recovery logic.

Cons: AssertUnwindSafe is safe code, but it silences the compiler check, so the whole burden of auditing shifts to the programmer; extremely difficult to guarantee exception safety for all code paths; easy to miss edge cases where state remains corrupted.
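One possible shape for such a drop guard is sketched below, using a scratch buffer that can simply be cleared. ResetOnPanic and run_handler are hypothetical names, and the reset policy (clear everything) is an assumption that only suits trivially reinitializable state.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::thread;

/// Hypothetical guard: clears the buffer if it is dropped during a panic.
struct ResetOnPanic<'a> {
    buffer: &'a RefCell<Vec<u8>>,
}

impl Drop for ResetOnPanic<'_> {
    fn drop(&mut self) {
        if thread::panicking() {
            // try_borrow_mut avoids a second panic (which would abort the
            // process) if some borrow is unexpectedly still live.
            if let Ok(mut buf) = self.buffer.try_borrow_mut() {
                buf.clear();
            }
        }
    }
}

/// Runs a simulated handler; returns true if it completed without panicking.
fn run_handler(buffer: &RefCell<Vec<u8>>, fail: bool) -> bool {
    let outcome = catch_unwind(AssertUnwindSafe(|| {
        // Declared first, so it is dropped last, after the RefMut below.
        let _reset = ResetOnPanic { buffer };
        let mut buf = buffer.borrow_mut();
        buf.extend_from_slice(b"partial");
        if fail {
            panic!("handler failed");
        }
        buf.extend_from_slice(b" complete");
    }));
    outcome.is_ok()
}

fn main() {
    let buffer = RefCell::new(Vec::new());
    let ok = run_handler(&buffer, true);
    println!("handler ok: {}, buffer len after panic: {}", ok, buffer.borrow().len());
}
```

The guard relies on drop order: locals are dropped in reverse declaration order, so the RefMut is released before the guard tries to borrow the cell. Getting details like this right on every code path is the auditing burden this solution carries.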

Chosen solution and reasoning

The team selected Solution 2 (Mutex with poisoning) for the shared connection pool, while using Solution 3 only for request-specific temporary buffers that can be trivially reinitialized. The explicit poisoning mechanism of Mutex provides a reliable, standardized way to detect corruption without requiring a manual audit of every possible panic point. The minor performance overhead was accepted in exchange for that detection guarantee.

Result

The server successfully isolates panics in request handlers without reusing corrupted state. When a handler panics while holding the pool lock, the mutex is poisoned, and the server detects this on the next access, discarding the corrupted thread-local pool and creating a fresh one. No request ever operates on a half-updated pool, and the service remains available even under adversarial inputs.

What candidates often miss

Why does catch_unwind require UnwindSafe even though Rust runs destructors during panics?

Many candidates assume that because Drop implementations run during unwinding, exception safety is guaranteed. However, UnwindSafe addresses the logical state of data, not just resource leaks. A panic can interrupt a sequence of operations (like updating a length field before the corresponding data), leaving an object in a temporarily inconsistent state. Destructors run against this broken state, and any value that outlives the catch_unwind call keeps it. UnwindSafe requires that either the captured data cannot be observed in a broken state afterward (for example, shared references to data without interior mutability) or that the programmer acknowledges the risk with AssertUnwindSafe. It acts as a compile-time speed bump against resuming execution with objects that violate their own invariants, not as a memory-safety guarantee.

What is the difference between UnwindSafe and the Send/Sync auto-traits?

Send, Sync, and UnwindSafe are all auto traits, but they differ in what is at stake. Send and Sync are unsafe traits: implementing them incorrectly can cause data races and undefined behavior, so overriding them requires an unsafe impl at the type level. UnwindSafe is a safe trait: violating it can only produce logic errors, so the escape hatch, AssertUnwindSafe, is a safe wrapper applied to individual values. The reference rules also differ. &T is Send if T is Sync, and &mut T is Send if T is Send, whereas &mut T is never UnwindSafe, regardless of T. UnwindSafe pairs with RefUnwindSafe for shared references (&T is UnwindSafe if T is RefUnwindSafe), creating a dual-trait system similar to but distinct from Send/Sync.
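These rules can be checked at compile time with trait-bound helper functions. The helper names below are hypothetical; the lines left as comments are expected to be rejected by the compiler.

```rust
use std::cell::RefCell;
use std::panic::{AssertUnwindSafe, RefUnwindSafe, UnwindSafe};
use std::sync::Mutex;

// Hypothetical helpers: each call compiles only if the bound holds.
fn is_unwind_safe<T: UnwindSafe + ?Sized>() -> bool {
    true
}
fn is_ref_unwind_safe<T: RefUnwindSafe + ?Sized>() -> bool {
    true
}

fn checks() -> bool {
    is_unwind_safe::<i32>()
        // &T is UnwindSafe because i32 is RefUnwindSafe.
        && is_unwind_safe::<&i32>()
        // An owned RefCell is UnwindSafe; only references to it are not.
        && is_unwind_safe::<RefCell<i32>>()
        // Mutex is both, regardless of its contents, thanks to poisoning.
        && is_unwind_safe::<&Mutex<RefCell<i32>>>()
        && is_ref_unwind_safe::<Mutex<RefCell<i32>>>()
        // The value-level escape hatch makes anything UnwindSafe.
        && is_unwind_safe::<AssertUnwindSafe<&mut i32>>()

    // Expected compile errors if uncommented:
    // is_unwind_safe::<&mut i32>()           // &mut T opts out
    // is_unwind_safe::<&RefCell<i32>>()      // RefCell is not RefUnwindSafe
    // is_ref_unwind_safe::<RefCell<i32>>()
}

fn main() {
    println!("all bounds hold: {}", checks());
}
```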

Why is RefCell not RefUnwindSafe, and why doesn't Mutex have the same UnwindSafe issues?

RefCell tracks borrows with a runtime flag, but the flag itself is not the problem: if a panic occurs while a RefMut is live, unwinding drops the guard and resets the flag. The problem is the data. The panic may have interrupted a multi-step update, and RefCell keeps no record of that, so when execution resumes after catch_unwind the cell can be borrowed normally and silently hands out a half-updated value. This is why RefCell<T> is not RefUnwindSafe, and why a closure capturing &RefCell<T> is rejected by catch_unwind. Mutex avoids the problem by implementing poisoning: if a guard is dropped while the thread is panicking, the Mutex marks itself poisoned, and subsequent lock() calls return a PoisonError indicating that a previous holder panicked. This makes potential corruption explicit and detectable, whereas with RefCell it is silent. For that reason Mutex<T> implements both UnwindSafe and RefUnwindSafe regardless of T, so a &Mutex<T> can be captured by a catch_unwind closure without AssertUnwindSafe.

Evaluate the architectural rationale behind the UnwindSafe auto-trait's conservative opt-out semantics for mutable references, and explain how this prevents exception-safety violations when combining catch_unwind with interior mutability.
