History of the question
Prior to C++20, the C++ standard provided no facility for atomic formatted output to shared streams without manual synchronization. Since C++11, concurrent insertions into std::cout have been guaranteed not to constitute a data race (the standard streams synchronize with C streams via sync_with_stdio by default), but that guarantee operates at character granularity, and the synchronization inhibits buffering, causing severe performance degradation. Developers typically wrapped a std::mutex around every insertion, but this serialized threads entirely and failed to protect against interleaving if the lock was forgotten in even one code path. The need for a higher-level abstraction that batches output and emits it atomically became critical for high-throughput multithreaded logging.
The problem
Standard stream insertion operations are not atomic. A single operator<< invocation for a complex type, and certainly a chain of them, may trigger multiple sputc or sputn calls to the underlying std::streambuf, and concurrent threads can have their characters interleaved at this granularity. Existing workarounds, such as formatting into a std::stringstream and then writing under a lock, required an extra copy of the entire buffer. Manual locking added boilerplate and, without RAII guards such as std::lock_guard, risked leaving the mutex held indefinitely if an exception escaped before the unlock.
The solution
C++20 introduced std::basic_osyncstream (aliased as std::osyncstream for char), which encapsulates a std::basic_syncbuf. This internal buffer accumulates all formatted output locally. When emit() is invoked—either explicitly or by the destructor—it acquires a mutex associated with the wrapped std::streambuf and transfers the entire accumulated content in a single contiguous write. This transforms fine-grained character-level locking into coarse-grained message-level locking, ensuring that no other thread can interleave characters during the emission.
#include <syncstream>
#include <iostream>
#include <thread>
#include <vector>

void worker(int id) {
    std::osyncstream synced_out(std::cout);
    synced_out << "Thread " << id << " processing data\n";
    // emit() is called automatically by the destructor here,
    // atomically writing the full line
}

int main() {
    std::vector<std::jthread> threads;
    for (int i = 0; i < 10; ++i) {
        threads.emplace_back(worker, i);
    }
    // std::jthread joins automatically on destruction
}
A distributed database system needed to write JSON commit records to a central audit log from multiple transaction threads. Each record contained transaction IDs, timestamps, and status flags. Without atomic emission, braces and quotes from different threads mixed, producing corrupted JSON that downstream analytics pipelines could not parse, causing nightly batch jobs to fail.
One solution considered was a global std::shared_mutex allowing concurrent reads but exclusive writes. Pros: Familiar synchronization pattern. Cons: Writers were still serialized for the entire duration of JSON formatting; high contention during commit storms caused latency spikes; deadlock risks existed if a thread holding the lock threw an exception before unlock.
Another approach considered per-thread log files merged by a background thread. Pros: Zero contention on the write path; no locking required during transaction processing. Cons: Complex log rotation and file management; loss of temporal ordering across threads; increased disk I/O from multiple file handles; potential for lost logs if the merger thread crashed.
The team adopted std::osyncstream. Each transaction scope creates a local std::osyncstream wrapping the shared audit std::ofstream. The JSON is built in the internal buffer and atomically emitted at scope exit. This reduced lock hold time from milliseconds (JSON formatting duration) to microseconds (buffer copy), eliminated corruption entirely, and preserved chronological order since emit() serializes access to the underlying file buffer.
Result: The system sustained over 100,000 commits per second without log corruption, and debugging became feasible because records remained intact and ordered.
Why does std::osyncstream's destructor call emit() unconditionally, and what are the exception safety implications if the underlying stream throws?
The destructor ensures no buffered data is silently abandoned by invoking emit(). Candidates often believe that an exception thrown by this final emit() propagates to the caller, or that it triggers std::terminate because destructors are implicitly noexcept. The correct detail is that std::basic_syncbuf's destructor calls emit() and catches and ignores any exception it throws, so a failing final emission (e.g., disk full) results in silent data loss rather than termination or propagation. Code that must detect failure should therefore call emit() explicitly before destruction and inspect the stream state: std::basic_osyncstream::emit() sets badbit when the transfer fails.
How do explicit flush operations (std::flush or std::endl) interact with std::osyncstream's internal buffering, and why does misuse revert performance to mutex-like levels?
Many candidates think std::flush immediately writes to the underlying device. With std::osyncstream, flushing (via std::flush or std::endl) does not itself emit: by default, the internal std::basic_syncbuf merely records that a flush is pending, and that flush is forwarded to the wrapped buffer when emit() eventually runs. This behavior is controlled by the emit-on-sync flag, settable via set_emit_on_sync(true), which makes every flush trigger an emit(). If a user emits after every insertion, whether by calling emit() manually or by enabling emit-on-sync and flushing constantly, they force per-message lock acquisition, defeating the batching optimization. The efficiency relies on batching multiple insertions into a single emit() call at scope exit.
What lifetime constraint binds the wrapped std::streambuf to the std::osyncstream, and what undefined behavior occurs if the wrapped buffer is destroyed before emit()?
std::osyncstream stores a pointer to the wrapped std::streambuf; it does not take ownership. If you create a temporary std::ostringstream, pass its rdbuf() to a std::osyncstream, and let the ostringstream go out of scope before the osyncstream, any subsequent emit() call (including the one triggered by the destructor) dereferences a dangling pointer: undefined behavior. Candidates often assume reference counting or that the osyncstream copies the buffer. The standard requires the programmer to ensure that the wrapped streambuf outlives the syncstream, much as std::string_view depends on the lifetime of the underlying string storage.