
What underlying technique allows **Go** to relocate a goroutine's entire stack to a new memory location while preserving the validity of all pointers referencing stack-allocated data?


Answer to the question

History

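Operating system threads typically reserve a large fixed-size stack, which exhausts memory at high concurrency, while a small fixed-size stack overflows during deep recursion. Go avoided both by making goroutine stacks growable. Early versions used segmented stacks (linked chunks, starting at 4KB and raised to 8KB in Go 1.2). Go 1.3 switched to contiguous stacks that are copied on growth, which improved locality and removed the "hot split" problem, where a call that repeatedly crossed a segment boundary inside a loop allocated and freed a segment on every iteration. Go 1.4 then lowered the initial stack size to 2KB.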

Problem

When a goroutine exhausts its current stack segment, the runtime must allocate a larger memory region and relocate all existing stack data. This relocation risks invalidating pointers that reference stack variables, as their memory addresses change during the move, potentially causing memory corruption or crashes.

Solution

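The compiler inserts a stack-check prologue at the entry of nearly every function (functions marked nosplit are exempt), comparing the stack pointer against the goroutine's stack guard (the stackguard0 field), which is a software limit rather than a hardware guard page. If space is insufficient, the prologue calls runtime.morestack, which switches to the system stack and calls runtime.newstack. That function allocates a new stack (typically twice the size), copies the used portion of the old one via runtime.copystack, and uses compiler-generated pointer bitmaps to find and adjust every pointer that points into the old stack.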

Code Example

The following function demonstrates how pointers to stack variables remain valid even when the stack grows during recursion:

```go
func Calculate(depth int, prev *int) int {
	if depth == 0 {
		return *prev
	}
	// current is allocated on the stack.
	current := depth * 100
	// &current may point to the old stack location.
	// If the stack grows here, the runtime updates the pointer.
	return Calculate(depth-1, &current) + *prev
}
```

Execution resumes on the new stack with updated registers, ensuring all pointers reference the correct new addresses.
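The move can usually be observed directly. The following is a minimal sketch, not part of the original example: growStack and observe are hypothetical helpers, and the 1KB of padding per frame and the depth of 1000 are arbitrary values assumed to be large enough to force growth. The address is recorded as a uintptr, which the runtime treats as a plain integer and does not adjust, so comparing the two snapshots typically reveals that the variable moved, while the real pointer p stays valid either way.

```go
package main

import (
	"fmt"
	"unsafe"
)

// growStack recurses with a padded frame so the goroutine needs far more
// stack than its initial allocation. It returns 0 + 1 + ... + depth.
//
//go:noinline
func growStack(depth int) int {
	var pad [128]int // about 1KB of frame per call on 64-bit platforms
	pad[depth%len(pad)] = depth
	if depth == 0 {
		return pad[0]
	}
	return growStack(depth-1) + pad[depth%len(pad)]
}

// observe takes a pointer to a stack variable, forces stack growth, and then
// writes through the pointer. before and after are integer snapshots of the
// variable's address; the runtime does not adjust uintptr values.
func observe() (before, after uintptr, value int) {
	x := 41
	p := &x
	before = uintptr(unsafe.Pointer(p))
	growStack(1000)
	*p++ // p was adjusted if the stack moved, so this still updates x
	after = uintptr(unsafe.Pointer(p))
	return before, after, x
}

func main() {
	before, after, value := observe()
	fmt.Printf("before=%#x after=%#x moved=%v value=%d\n",
		before, after, before != after, value)
}
```

Whether moved prints true depends on the Go version and platform, so treat it as an observation rather than a guarantee. The write through p is what the language guarantees.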

Situation from life

Scenario

A financial matching engine that performed recursive order book calculations crashed sporadically during high-volatility market events, when runaway recursion grew goroutine stacks far beyond their initial 2KB allocation and into the runtime's stack limit. The system needed a solution that kept the clarity of the recursive algorithms without compromising the millions of lightweight goroutines handling concurrent connections.

Problem

The matching algorithm used deep recursion to traverse tree-shaped order depth, causing fatal stack overflow errors precisely when trading volume peaked. The solution needed to handle deep recursion safely without wasting gigabytes of memory on pre-allocated large stacks for mostly-idle goroutines.

Solution 1: Fixed Large Stacks

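Pre-allocate large stacks for all goroutines. Go exposes no API for this: debug.SetMaxStack only changes the maximum stack size, not the initial one, so the approach requires modifying the runtime's minimum stack size and rebuilding the toolchain. Pros: Avoids growth overhead for any recursion that fits within the pre-allocated size. Cons: Consumes excessive memory for idle connection handlers, violates the lightweight goroutine promise, reduces maximum feasible concurrency, and ties the service to a custom runtime build.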

Solution 2: Iterative Conversion

Rewrite the recursive tree traversal as an iterative algorithm with an explicit heap-allocated stack slice to track traversal state. Pros: Predictable memory usage and no risk of stack overflow. Cons: Increased code complexity, loss of algorithmic clarity, and additional garbage collection pressure from frequent slice allocations during high-volume trading.
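For illustration, an iterative traversal with an explicit stack might look like the sketch below. Node and SumVolume are hypothetical names that stand in for the order-depth tree and its calculation; neither comes from the system described here.

```go
package main

import "fmt"

// Node is a hypothetical tree node standing in for a level of order depth.
type Node struct {
	Volume      int
	Left, Right *Node
}

// SumVolume walks the tree without recursion, using a heap-allocated slice
// as the explicit stack of nodes still to visit.
func SumVolume(root *Node) int {
	total := 0
	pending := []*Node{root}
	for len(pending) > 0 {
		// Pop the most recently pushed node.
		n := pending[len(pending)-1]
		pending = pending[:len(pending)-1]
		if n == nil {
			continue
		}
		total += n.Volume
		pending = append(pending, n.Left, n.Right)
	}
	return total
}

func main() {
	tree := &Node{
		Volume: 10,
		Left:   &Node{Volume: 5},
		Right:  &Node{Volume: 7, Left: &Node{Volume: 1}},
	}
	fmt.Println(SumVolume(tree))
}
```

The pending slice grows on the heap instead of the goroutine stack, which is the source of the extra garbage collection pressure listed among the cons.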

Solution 3: Dynamic Stack Growth

Retain the recursive design and rely on Go's contiguous stack growth, which the compiler-generated pointer maps make safe. Pros: Maintains clean recursive logic, uses memory proportional to actual need, and handles traffic spikes automatically without code changes. Cons: Microsecond pauses during stack copying, though these are mitigated by small default stacks and efficient copying.

Chosen Approach

Solution 3 was selected because the 100-nanosecond overhead of stack copying proved negligible compared to network latency, and it preserved the mathematical clarity of the recursive matching algorithm. The team added recursion depth limits as a safety guardrail so that runaway recursion fails with an error instead of growing a stack toward the 1GB limit.

Result

The system sustained 50,000 concurrent recursive calculations during market stress tests without crashes. Memory utilization remained under 300MB for 100,000 goroutines, and p99 latency increased by less than 2 microseconds during stack growth events, meeting strict high-frequency trading requirements.

What candidates often miss

Why doesn't stack copying break pointers to stack variables when the stack moves to a new address in memory?

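The runtime relies on stack maps (bitmaps) that the compiler generates for every function. These maps identify which slots in each stack frame hold pointers. In runtime.copystack, the runtime walks the frames, consults the maps, finds every pointer whose value falls within the old stack range, and shifts it by the distance between the old and new stacks. Stack-referencing pointers held in runtime structures tied to the goroutine, such as defer and panic records, are adjusted as well. This is sufficient because escape analysis moves any variable whose address might outlive its frame, or be stored in the heap, off the stack, so ordinary pointers into a goroutine's stack live only on that same stack. After the move, all references point to the correct new locations.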
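As a rough illustration of that adjustment step, the toy model below is a sketch under stated assumptions, not the runtime's implementation. toyStack, relocate, and the pretend addresses are hypothetical, integers stand in for machine words, and a []bool stands in for the compiler's bitmap.

```go
package main

import "fmt"

const wordSize = 8 // assumed word size for the pretend addresses

// toyStack models a stack as a run of words starting at a pretend address.
type toyStack struct {
	lo    int   // pretend address of slots[0]
	slots []int // pointer slots hold pretend addresses, others hold data
}

// hi returns the pretend address just past the last slot.
func (s toyStack) hi() int { return s.lo + len(s.slots)*wordSize }

// relocate copies old into a stack of twice the size whose lowest address is
// newLo, keeping the contents at the high end. Slots flagged in isPtr are
// shifted by the move distance, but only if they point into the old stack.
func relocate(old toyStack, isPtr []bool, newLo int) toyStack {
	grown := toyStack{lo: newLo, slots: make([]int, 2*len(old.slots))}
	delta := grown.hi() - old.hi()
	offset := len(grown.slots) - len(old.slots)
	for i, word := range old.slots {
		if isPtr[i] && word >= old.lo && word < old.hi() {
			word += delta // pointer into the old stack: retarget it
		}
		grown.slots[offset+i] = word
	}
	return grown
}

func main() {
	old := toyStack{lo: 1000, slots: []int{
		7,    // plain data
		1024, // pointer to slot 3 of this stack (1000 + 3*8)
		5000, // pointer to something outside the stack
		1016, // integer that merely looks like a stack address
	}}
	isPtr := []bool{false, true, true, false}
	grown := relocate(old, isPtr, 2000)
	fmt.Println(grown.slots[4:]) // only the stack pointer changes: 1024 -> 2056
}
```

The last slot shows why the bitmap matters: without it, the runtime could not tell an integer that happens to fall in the old address range from a real pointer.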

How does Go handle stack growth during CGO calls that might hold pointers to Go stack data?

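A cgo call does not run C code on the goroutine stack: runtime.cgocall switches to the thread's system stack (g0) before entering C. By default, pointers passed to C are treated as escaping, so the data they reference lives on the heap rather than on the goroutine stack, and C code does not end up holding Go stack addresses. While the call is in progress the goroutine runs no Go code, so its stack does not grow, and the runtime does not shrink the stack of a goroutine that is inside a system call or cgo call. The one case where the stack can move is a callback from C into Go, which runs on the goroutine stack and may grow it. For that reason the runtime records the stack depth relative to the top of the stack rather than an absolute stack pointer, and on return from C it resumes on the (potentially moved) goroutine stack.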

What causes the fatal error "runtime: goroutine stack exceeds 1000000000-byte limit" and how does it differ from normal growth?

Normal growth is transparent: the runtime copies the stack into a larger contiguous region and execution continues. This error occurs when runtime.newstack, reached through runtime.morestack, finds that the new size would exceed the maximum stack size, which defaults to 1GB on 64-bit systems and 250MB on 32-bit systems and can be changed with debug.SetMaxStack. It almost always indicates unbounded recursion. The runtime prints this message followed by "fatal error: stack overflow" and terminates the program. Unlike an ordinary panic, this failure cannot be intercepted with recover.