Go Programming: Senior Go Developer

How does **Go**'s write barrier prevent the loss of reachable objects during concurrent garbage collection when a goroutine writes a pointer to a white object into a black object?


Answer

Go employs a tri-color concurrent garbage collector in which objects transition from white (unmarked) to grey (reached and queued for scanning) to black (fully scanned). The invariant the collector must protect during marking is that a black object never ends up holding the only path to a white object, since the collector will not revisit black objects and would mistakenly free reachable memory. To enforce this without stopping the world, Go uses a write barrier: a small compiler-inserted hook that runs on pointer writes to the heap while the GC is in its mark phase. Since Go 1.8 this is a hybrid barrier: on each such write it shades the pointer value being overwritten (a Yuasa-style deletion barrier) and, while the writing goroutine's stack has not yet been scanned, also shades the new pointer being installed (a Dijkstra-style insertion barrier). Shading moves a white object to grey, so a pointer to a white object written into a black object is queued for scanning rather than lost.
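The barrier logic can be sketched as a toy simulation. The `object` type, `color` constants, and `stackUnscanned` flag below are illustrative assumptions, not the runtime's real data structures; the actual barrier is compiler-emitted runtime code and, for speed, shades the new pointer unconditionally rather than checking stack state.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet reached by the collector
	grey               // reached, queued for scanning
	black              // fully scanned
)

type object struct {
	name  string
	color color
	ref   *object // a single outgoing pointer field
}

// shade moves a white object to grey so the collector will scan it.
func shade(o *object) {
	if o != nil && o.color == white {
		o.color = grey
	}
}

// writePointer models the hybrid barrier: shade the overwritten value
// (deletion half), and shade the installed value while the writing
// goroutine's stack is still unscanned (insertion half).
func writePointer(slot *object, ptr *object, stackUnscanned bool) {
	shade(slot.ref) // Yuasa: the old pointee stays visible
	if stackUnscanned {
		shade(ptr) // Dijkstra: the new pointee is queued
	}
	slot.ref = ptr
}

func main() {
	src := &object{name: "src", color: black}
	dst := &object{name: "dst", color: white}
	writePointer(src, dst, true)
	fmt.Println(dst.color == grey) // the white target was shaded, not lost
}
```

The key observation for the interview question: the black-to-white edge never exists, because the target is shaded grey before the pointer store completes.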

A real-world situation

We observed severe tail latency in a real-time analytics pipeline processing millions of events per second. The system used a complex graph structure where nodes frequently updated references to child nodes based on streaming data, causing massive pointer churn during Go's GC cycles.

First solution considered: We attempted to mitigate this by raising GOGC from its default of 100 to 200 to delay collections. Pros: Reduced the frequency of GC cycles, lowering the total number of barrier executions over time. Cons: This substantially increased peak heap size (the heap target grows from roughly 2x to 3x the live set), risking OOM kills on our memory-constrained containers, and merely deferred the latency spikes rather than resolving them.

Second solution considered: We experimented with object pooling using sync.Pool to reuse node structs and reduce allocations. Pros: Decreased allocation pressure and the rate of new white objects being created. Cons: The write barrier overhead remained high because we were still mutating pointers within existing (often already-scanned) black objects at the same rate; pooling did not address the cost of barrier execution on pointer updates.
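The pooling attempt looked roughly like the sketch below; the `poolNode` type and its fields are illustrative assumptions. Note the comment on the pointer store: reuse removes the allocation, but the barrier cost of the store is unchanged.

```go
package main

import (
	"fmt"
	"sync"
)

type poolNode struct {
	next *poolNode
	data [256]byte
}

var nodePool = sync.Pool{
	New: func() any { return new(poolNode) },
}

// handleEvent links a pooled node onto the head of a list. The
// n.next assignment is still a heap pointer write, so it still runs
// the write barrier during marking; pooling does not help here.
func handleEvent(head *poolNode) *poolNode {
	n := nodePool.Get().(*poolNode)
	n.next = head
	return n
}

// recycle clears references before returning the node to the pool,
// so the pool does not keep dead subgraphs reachable.
func recycle(n *poolNode) {
	n.next = nil
	nodePool.Put(n)
}

func main() {
	head := handleEvent(nil)
	head = handleEvent(head)
	fmt.Println(head.next != nil)
	recycle(head)
}
```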

Third solution considered: We refactored the graph to use integer indices into a large slice rather than direct pointers for node relationships. Pros: Integer assignments are not pointer writes, completely bypassing the write barrier mechanism and eliminating the associated CPU cost during marking. Cons: This required implementing manual memory management for the slice (handling holes, compaction) and made the code less idiomatic and harder to maintain.
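A minimal sketch of the index-based design, with illustrative names (`node`, `alloc`, `release`, a simple free-list for holes): child links are `int32` indices into one backing slice, the `node` struct contains no pointers at all, and updating a link is a plain integer store that never triggers the write barrier.

```go
package main

import "fmt"

const nilIdx = int32(-1)

// node holds no pointers, so the GC never scans the backing slice's
// elements and link updates are barrier-free integer stores.
type node struct {
	value    int
	children [2]int32 // indices into graph.nodes, not pointers
}

type graph struct {
	nodes []node
	free  []int32 // recycled slots ("holes") awaiting reuse
}

// alloc reuses a free slot if one exists, else grows the slice.
func (g *graph) alloc(v int) int32 {
	if n := len(g.free); n > 0 {
		idx := g.free[n-1]
		g.free = g.free[:n-1]
		g.nodes[idx] = node{value: v, children: [2]int32{nilIdx, nilIdx}}
		return idx
	}
	g.nodes = append(g.nodes, node{value: v, children: [2]int32{nilIdx, nilIdx}})
	return int32(len(g.nodes) - 1)
}

// release marks a slot as a hole; this is the manual memory
// management the pointer-based design got for free from the GC.
func (g *graph) release(idx int32) { g.free = append(g.free, idx) }

func main() {
	var g graph
	root := g.alloc(1)
	child := g.alloc(2)
	g.nodes[root].children[0] = child // plain int store: no write barrier
	fmt.Println(g.nodes[g.nodes[root].children[0]].value)
}
```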

Chosen solution: We adopted the index-based approach for the high-churn core graph, while retaining pointers for static metadata. This directly eliminated the write barrier hot path while preserving graph connectivity semantics.

Result: Tail latency during GC dropped by 90%, from 15ms to 1.5ms, and overall throughput increased by 40% due to reduced GC assist work stealing CPU from mutators.

What candidates often miss

Why does the write barrier shade the object being pointed to rather than the object being modified?

Candidates often incorrectly assume the barrier should mark the source object (the one being written to) as needing re-scanning. However, the source is already either grey or black; if it is black, re-scanning it would be expensive and would require tracking all of its outgoing pointers. By contrast, shading the target (the new pointer value) grey immediately satisfies the tri-color invariant: if the source is black and the target was white, the edge becomes black-to-grey, which is safe. This distinction is crucial because it minimizes work: only the new target is queued, rather than potentially large source objects being re-scanned. (Go's hybrid barrier additionally shades the pointer being overwritten, but it likewise never re-scans the source object.)

How does the write barrier interact with stack allocations, and why might stacks need re-scanning?

Write barriers intercept heap pointer writes, but writes to a goroutine's own stack deliberately carry no barrier, because stack stores are far too frequent to instrument. This opens a hole: a goroutine whose stack has already been scanned could copy the only pointer to a white heap object into a stack slot, hiding it from the collector. Before Go 1.8, the runtime closed this hole by re-scanning all active stacks in a stop-the-world phase at the end of marking. Since Go 1.8, the hybrid write barrier eliminates that re-scan: by unconditionally shading overwritten pointers and shading newly installed pointers while the writing goroutine's stack is unscanned, it guarantees that a scanned stack can never acquire a pointer to a hidden white object, so each stack needs to be scanned only once. Candidates frequently miss that removing stack re-scanning was the primary motivation for the hybrid barrier, and that it shrank the mark-termination pause to well under a millisecond, typically tens of microseconds.

What is the difference between the Dijkstra write barrier and the Yuasa write barrier, and which does Go use?

The Dijkstra (insertion) barrier shades the new target object when a pointer is installed, so a black-to-white edge is never created. The Yuasa (deletion) barrier instead records the old pointer value being overwritten and shades that, preserving the snapshot-at-the-beginning property: everything reachable when marking started remains visible to the collector. Since Go 1.8, Go uses a hybrid of the two: it shades the overwritten pointer like Yuasa and, while the writing goroutine's stack is unscanned, also shades the installed pointer like Dijkstra. The deletion half is what lets Go scan each stack exactly once with no final re-scan; the insertion half covers pointers published from not-yet-scanned stacks. The trade-off is floating garbage: a shaded object survives the current cycle even if it becomes unreachable immediately afterward. Candidates often state that Go uses a pure Dijkstra barrier (true only before Go 1.8) or a pure Yuasa barrier; understanding the hybrid explains both why the barrier is synchronous with the write rather than logging-based and why stack writes need no barrier at all.
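The two classic barrier shapes can be contrasted in a few lines of runnable Go. The `obj` type and `shadeObj` helper are illustrative; the real barriers are emitted by the compiler and live in the runtime.

```go
package main

import "fmt"

type objColor int

const (
	objWhite objColor = iota
	objGrey
	objBlack
)

type obj struct{ color objColor }

// shadeObj moves a white object to grey.
func shadeObj(o *obj) {
	if o != nil && o.color == objWhite {
		o.color = objGrey
	}
}

// Dijkstra insertion barrier: shade the value being installed, so a
// black-to-white edge is never created.
func dijkstraWrite(slot **obj, ptr *obj) {
	shadeObj(ptr)
	*slot = ptr
}

// Yuasa deletion barrier: shade the value being overwritten, so
// everything reachable at mark start stays visible (snapshot at the
// beginning). Go's hybrid barrier combines both shade calls.
func yuasaWrite(slot **obj, ptr *obj) {
	shadeObj(*slot)
	*slot = ptr
}

func main() {
	installed := &obj{}
	var slot *obj
	dijkstraWrite(&slot, installed) // installed: white -> grey

	overwritten := &obj{}
	slot2 := overwritten
	yuasaWrite(&slot2, nil) // overwritten: white -> grey

	fmt.Println(installed.color == objGrey, overwritten.color == objGrey)
}
```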