Answer to the question
When an async future is dropped while suspended at an await point (for example, when a sibling branch completes first in tokio::select!), its destructor runs synchronously, dropping every local variable the future was holding. The hazard arises when the future owns resources that require asynchronous cleanup—such as flushing a TcpStream, sending a protocol close frame, or committing a database transaction—because the Drop trait provides no async context. If the future is cancelled after partially modifying state (e.g., writing half a file buffer) but before finalizing, the synchronous Drop cannot .await the cleanup operations, potentially leaving the system in an inconsistent state or leaking resources. The architectural solution is the drop-guard pattern: wrap the resource in a guard struct whose Drop implementation either performs a synchronous fallback cleanup (accepting the blocking risk) or hands the resource off to a detached cleanup task, so that the critical invariant (e.g., deletion of a temporary file) is eventually enforced without relying on async code inside the destructor.
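A minimal sketch of the drop-guard idea (the CleanupGuard name and the disarm method are illustrative, not from any library): the fallback cleanup runs in Drop unless the success path explicitly disarms the guard.

```rust
// Hypothetical drop guard: runs a fallback cleanup closure unless disarmed.
struct CleanupGuard<F: FnOnce()> {
    cleanup: Option<F>,
}

impl<F: FnOnce()> CleanupGuard<F> {
    fn new(cleanup: F) -> Self {
        Self { cleanup: Some(cleanup) }
    }

    // Success path: the operation completed, so skip the fallback cleanup.
    fn disarm(mut self) {
        self.cleanup = None;
    }
}

impl<F: FnOnce()> Drop for CleanupGuard<F> {
    fn drop(&mut self) {
        // Runs synchronously on cancellation, early return, or panic.
        if let Some(f) = self.cleanup.take() {
            f();
        }
    }
}
```

Because cancellation of a future simply drops its live locals, a guard like this fires its cleanup at whichever await point the future was abandoned, with no cooperation from the cancelled code path.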
A real-world situation
We developed a high-throughput media ingestion service where tokio::spawn handled concurrent file uploads. Each upload task wrote chunks to a temporary file on disk, performed virus scanning via an external process, and finally atomically moved the validated file to a permanent storage bucket. The requirement was strict: if the client disconnected (triggering task cancellation via select! between the virus scan and the atomic move), the temporary file had to be deleted immediately to prevent disk space exhaustion.
Solution 1: Synchronous cleanup in Drop. We implemented a TempFileGuard struct wrapping std::fs::File and the path string. In its Drop implementation, we invoked std::fs::remove_file synchronously to delete the temporary file. Pros: The code was straightforward and guaranteed execution during stack unwinding or cancellation. Cons: std::fs::remove_file is a blocking syscall. When running on the Tokio runtime's worker threads, this blocked the thread for milliseconds under high disk load, starving other tasks and violating the async non-blocking contract. Furthermore, if the temporary file was on a network filesystem (NFS), the block could extend to seconds, causing catastrophic latency bubbles.
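A sketch of this first approach (TempFileGuard is the name used above; the exact fields are assumptions): the guard deletes the file synchronously in Drop, which is precisely the blocking call that stalled Tokio worker threads.

```rust
use std::fs;
use std::path::PathBuf;

// Guard that synchronously deletes its temporary file on drop.
struct TempFileGuard {
    path: PathBuf,
}

impl TempFileGuard {
    fn new(path: PathBuf) -> Self {
        Self { path }
    }
}

impl Drop for TempFileGuard {
    fn drop(&mut self) {
        // Blocking syscall: guaranteed to run during cancellation or
        // unwinding, but it stalls the calling thread -- a problem when
        // that thread is an async runtime worker.
        let _ = fs::remove_file(&self.path);
    }
}
```

The guarantee and the hazard are the same line of code: the delete always happens, but it happens on whatever thread drops the guard.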
Solution 2: Spawned cleanup task. In the guard's Drop, we captured the path string and spawned a detached Tokio task to run tokio::fs::remove_file asynchronously. Pros: This returned control to the runtime immediately, preserving latency. Cons: If the runtime was already shutting down or under extreme load, the cleanup task might never execute, leading to resource leaks. Additionally, tokio::spawn panics when called outside a runtime context, so the guard had to hold a cloned runtime Handle, complicating its lifetime; tasks spawned through a Handle while the runtime is shutting down are silently dropped without ever running.
Solution 3: Explicit cancellation token with synchronous fallback. We utilized tokio_util::sync::CancellationToken and structured the upload logic to check for cancellation before the atomic move. If cancelled, a synchronous delete was attempted only if the file was below a certain size threshold (fast delete); otherwise the path was queued over a channel to a dedicated background cleanup thread (spawned via std::thread). The guard's Drop only handled the rare edge case of a panic, using synchronous deletion as a last resort. Chosen solution: We selected Solution 3. It balanced determinism (synchronous path for small files) with scalability (background thread for slow operations) while avoiding blocking the Tokio workers. The result was zero leaked temporary files during load testing with 10,000 concurrent cancellations, and p99 latency remained stable because the background thread absorbed the NFS latency penalty.
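A std-only sketch of the chosen hybrid (FAST_DELETE_MAX and the function names are illustrative assumptions, not the production code): small files are deleted inline, larger ones are handed to a dedicated cleanup thread over a channel so slow filesystem latency never lands on an async worker.

```rust
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

// Illustrative threshold: files at or below this size are deleted inline.
const FAST_DELETE_MAX: u64 = 1024;

// Dedicated cleanup thread: absorbs slow (e.g., NFS) deletions.
fn start_cleanup_worker() -> mpsc::Sender<PathBuf> {
    let (tx, rx) = mpsc::channel::<PathBuf>();
    thread::spawn(move || {
        for path in rx {
            let _ = std::fs::remove_file(&path);
        }
    });
    tx
}

// Dispatch: fast synchronous delete for small files, queue otherwise.
fn delete_temp(path: PathBuf, cleanup: &mpsc::Sender<PathBuf>) {
    let small = std::fs::metadata(&path)
        .map(|m| m.len() <= FAST_DELETE_MAX)
        .unwrap_or(true); // file already gone: nothing slow left to do
    if small {
        let _ = std::fs::remove_file(&path);
    } else {
        let _ = cleanup.send(path);
    }
}
```

In the real service the Sender would live in shared state and delete_temp would be called from the cancellation branch; the guard's Drop keeps only the synchronous last-resort delete for the panic path.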
What candidates often miss
Why is invoking block_on inside a Drop implementation to perform async cleanup fundamentally unsound in most async runtimes?
Attempting to call block_on within Drop creates a reentrancy hazard. Drop is invoked synchronously during stack unwinding or when a future is cancelled. If the current thread is a worker thread of the Tokio (or async-std) runtime, block_on must drive the new cleanup future to completion on a thread the runtime is already using to drive the task being dropped. On a current-thread runtime this is a guaranteed deadlock: block_on waits for the cleanup future to be polled, but the only thread that could poll it is blocked inside block_on. Tokio detects this situation and panics rather than hanging, and even on a multi-thread runtime, blocking a worker inside a destructor starves other tasks scheduled on that worker. The correct approach is to perform cleanup synchronously (if it is effectively instantaneous) or to offload it to a dedicated thread via a channel, never blocking the async executor from within a destructor.
How does the design of the Future::poll method inherently restrict cancellation to occur only at await points, and why is this significant for critical section design?
The Future::poll method is synchronous and must return Poll::Ready or Poll::Pending promptly; it cannot yield mid-execution. An await point is syntactic sugar for a transition in the compiler-generated state machine: when the awaited sub-future returns Pending, the enclosing future saves its state and itself returns Pending. The executor (or the select! macro) can only drop the future when it is not actively executing—specifically, after it has returned Pending and yielded control. Consequently, cancellation is atomic with respect to poll invocations. This is significant because it guarantees that any code between two await points (a "critical section") executes entirely or not at all from the perspective of the async runtime. However, if a future holds a mutex guard across an await—which is effectively prevented for std::sync::Mutex on multi-threaded executors because the guard is not Send (and is flagged by Clippy), but which tokio::sync::Mutex explicitly permits—cancellation could leave shared data in an inconsistent state. Candidates often miss that data structure invariants must be restored before each await point, not just at the end of the function, because cancellation runs Drop on all live variables exactly at that suspension point.
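This can be observed without any runtime by polling a future by hand (the no-op waker is standard boilerplate; a one-shot suspension is simulated with std::future::poll_fn, and the Guard/suspended_with_guard names are illustrative): dropping the future while it is suspended runs the Drop of its live locals at exactly that await point.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Minimal no-op waker, enough to poll a future by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Records in `dropped` when its destructor runs.
struct Guard {
    dropped: Arc<AtomicBool>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

// A future that suspends exactly once, holding a Guard across the await.
fn suspended_with_guard(dropped: Arc<AtomicBool>) -> impl Future<Output = ()> {
    async move {
        let _g = Guard { dropped };
        let mut first = true;
        // First poll returns Pending; a later poll would return Ready.
        std::future::poll_fn(move |cx| {
            if first {
                first = false;
                cx.waker().wake_by_ref();
                Poll::Pending
            } else {
                Poll::Ready(())
            }
        })
        .await;
        // Never reached if the future is dropped while suspended.
    }
}
```

Polling once parks the future at the await with the guard still alive; dropping it then is exactly what select! does to a losing branch.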
In the context of std::pin::Pin, why must futures used in select! be either Unpin or explicitly pinned, and how does this prevent memory unsafety during partial dropping?
tokio::select! polls its branches in random order by default. If a future is !Unpin (e.g., its state machine contains self-referential borrows or intrusive list links), moving it after the first poll would invalidate those internal pointers. Pin guarantees that the memory location of the future remains stable. select! pins futures it owns on its own stack frame; futures that are polled by mutable reference (so they can be resumed across loop iterations) must be either Unpin or explicitly pinned first via pin! or Box::pin. When a branch completes, select! drops the losing futures. The memory safety guarantee stems from Pin ensuring that drop is called on the future at its original memory address: a pinned future is dropped in place, never moved first, which prevents the dangling-pointer and use-after-free issues that would arise if a self-referential future were relocated (even just for destruction) after being polled. Candidates frequently overlook that Pin governs not just polling but also the destruction semantics of cancelled futures.
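A small illustration of the address-stability guarantee (the names are illustrative; the borrow of a local held across the await is what makes the generated state machine self-referential and !Unpin): after Box::pin, the future is polled, resumed, and would be dropped all at the same heap address.

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Minimal no-op waker for polling a future manually.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Holding `slice` (a borrow of the local `data`) across the await makes
// this future's state machine self-referential, hence !Unpin.
async fn self_referential() -> usize {
    let data = [1u8, 2, 3, 4];
    let slice = &data[..];
    let mut first = true;
    // Suspend exactly once.
    std::future::poll_fn(move |cx| {
        if first {
            first = false;
            cx.waker().wake_by_ref();
            Poll::Pending
        } else {
            Poll::Ready(())
        }
    })
    .await;
    slice.len() // uses the borrow after resuming
}
```

Box::pin heap-allocates the state machine, so polling, resuming, and dropping all happen at one fixed address; this is the property that keeps the internal borrow valid.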