System Architecture

Engineer a planetary-scale, real-time programmatic advertising exchange that orchestrates sub-100ms auction decisions across heterogeneous demand-side platforms, enforces per-campaign budget pacing without distributed locking, detects click-fraud through behavioral fingerprints in the request path, and reconciles billing discrepancies through immutable ledger entries while maintaining regional data sovereignty for GDPR compliance.



History of the question

Early ad serving relied on static waterfall configurations in which publishers called demand partners sequentially, stacking latency and leaking revenue. The shift to Header Bidding and OpenRTB protocols democratized inventory access but introduced extreme engineering constraints: auctions must complete within 100ms to prevent page abandonment, budget enforcement must prevent overspend across thousands of edge nodes, and fraud detection must occur inline without adding network hops. This question emerged from the need to replace centralized Apache Kafka pipelines with edge-computing architectures capable of making autonomous financial decisions while maintaining strict auditability and data residency requirements.

The problem

Traditional architectures rely on centralized PostgreSQL or Redis clusters for budget counters and feature stores, creating cross-region latency that violates the 100ms SLA during traffic spikes like Black Friday. Naive optimistic locking on budget decrements causes thundering herds and dropped bids, while asynchronous fraud detection allows bots to exhaust campaign budgets before detection triggers. Furthermore, billing reconciliation across DSPs (Demand-Side Platforms) suffers from network partitions where impression pixels fire but acknowledgement messages drop, leading to revenue leakage or duplicate charging that destroys advertiser trust.

The solution

Deploy Envoy Proxy sidecars with WebAssembly filters at edge PoPs (Points of Presence) to run auction logic within 10ms of end users. Implement CRDT (Conflict-free Replicated Data Type) counters in Redis with a gossip protocol for budget pacing, allowing edge nodes to accept bids locally while converging on a globally consistent budget view. Embed lightweight TensorFlow Lite models within the edge layer for real-time bot detection using behavioral fingerprints such as mouse velocity patterns and navigation entropy. Use Apache Pulsar with Geo-Replication and BookKeeper for immutable audit logs, ensuring exactly-once semantics via Idempotent Producers and Deduplication Windows. For GDPR compliance, implement K-Anonymity checks and data residency routing through Anycast DNS with EDNS Client Subnet awareness.
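The budget-pacing piece hinges on the CRDT merge being insensitive to message order and duplication. A minimal Python sketch of a PN-Counter (class and method names are illustrative, not from any production system) shows why gossip cannot corrupt the total:

```python
from collections import defaultdict

class PNCounter:
    """Conflict-free replicated counter: per-node increment and
    decrement maps, merged by taking the per-node maximum."""
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.incr = defaultdict(int)  # spend recorded per node
        self.decr = defaultdict(int)  # refunds recorded per node

    def spend(self, amount: int):
        self.incr[self.node_id] += amount

    def refund(self, amount: int):
        self.decr[self.node_id] += amount

    def value(self) -> int:
        return sum(self.incr.values()) - sum(self.decr.values())

    def merge(self, other: "PNCounter"):
        # Join = element-wise max. Commutative, associative, and
        # idempotent, so gossip order and redelivery do not matter.
        for node, v in other.incr.items():
            self.incr[node] = max(self.incr[node], v)
        for node, v in other.decr.items():
            self.decr[node] = max(self.decr[node], v)

# Two edge nodes spend independently, then gossip both ways.
a, b = PNCounter("edge-a"), PNCounter("edge-b")
a.spend(300); b.spend(500); b.refund(100)
a.merge(b); b.merge(a)
assert a.value() == b.value() == 700
```

Because the join takes a per-node maximum, applying the same gossip message twice is a no-op, which is what makes at-least-once gossip delivery safe for a financial counter.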

Situation from life

During the 2023 Black Friday event, our platform experienced a 40x traffic surge that saturated our centralized Redis budget store in us-east-1, causing 12% of auctions to timeout and threatening $2M in potential revenue loss. The engineering team faced a critical architectural decision: maintain strong consistency and accept latency violations, or prioritize speed and risk catastrophic budget overspend.

Solution A: Redis Cluster with Redlock

We considered implementing Redlock algorithms across five independent Redis master nodes to enforce strict budget consistency. This approach would theoretically guarantee that no campaign exceeded its daily cap by requiring majority consensus for each decrement. However, the round-trip time between edge nodes and the Redis cluster averaged 35ms, and under load, lock contention caused 8% of requests to retry multiple times, blowing past our 100ms SLA. While this provided perfect budget accuracy, the unacceptable latency and operational complexity made it unsuitable for real-time bidding.

Solution B: Local In-Memory Caching with Async Sync

Alternatively, we evaluated allowing each edge node to maintain purely local budget counters that synchronized asynchronously to a central ledger every 30 seconds. This achieved sub-5ms auction latency and handled the traffic spike gracefully without external dependencies. Unfortunately, during the surge, multiple edge nodes oversold high-value campaigns by $800K collectively before the sync occurred, causing advertiser trust issues and contractual penalties. The speed was optimal, but the financial risk was catastrophic for a payment-adjacent system.

Solution C: Hybrid CRDT Architecture with Hierarchical Pacing

We implemented a hybrid approach using Delta-State CRDTs in Redis at three tiers: edge, regional, and global. Edge nodes accept bids using local PN-Counters (Positive-Negative Counters) with conservative local thresholds set at 95% of the global budget. When local budgets exhaust, nodes query regional caches with Read-Your-Writes consistency. The remaining 5% buffer is managed by the global ledger using CRDT merges during gossip synchronization. For fraud, we deployed TinyML models on the edge nodes trained to detect bot patterns without network calls. We chose this solution because it provided 99.9% budget accuracy while maintaining 45ms p99 latency.
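Solution C's fast-path/slow-path split reduces to a single admission check at the edge. The function below is a hedged sketch; the names, the 95% local share, and the regional-coordinator callback are assumptions for illustration, not the production interface:

```python
LOCAL_SHARE = 0.95  # edge nodes may spend up to 95% of the global budget locally

def admit_bid(local_spend: float, bid: float, global_budget: float,
              query_regional) -> bool:
    """Decide whether an edge node may accept a bid.

    local_spend    -- this node's merged CRDT view of campaign spend
    query_regional -- callback returning the remaining buffer budget
                      (the last 5%) from the regional coordinator
    """
    if local_spend + bid <= global_budget * LOCAL_SHARE:
        return True                  # fast path: no network call
    remaining = query_regional()     # slow path: contend for the buffer
    return bid <= remaining
```

Most traffic takes the fast path with zero network hops; only bids near budget exhaustion pay the regional round trip, which is what keeps p99 latency low while bounding overspend.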

Result

The platform processed 12 million queries per second during the peak with zero budget overspend on capped campaigns. Fraud detection latency dropped from 150ms to 8ms, blocking 3.4% of malicious traffic before bid submission. The CRDT reconciliation achieved eventual consistency within 200ms across regions, well within the billing reconciliation window, and GDPR compliance was maintained through edge-local data processing.

What candidates often miss

How do you prevent budget overspend when multiple edge nodes concurrently decrement the same campaign budget without acquiring global locks?

Most candidates suggest distributed locks or atomic decrement operations in Redis, which fail under network partitions or high latency. The correct approach uses PN-Counters (Positive-Negative Counters) implemented as CRDTs. Each edge node maintains a local increment counter for spend and a decrement counter for refunds. When a node accepts a bid, it increments its local spend counter. During gossip synchronization, nodes exchange their counter states and merge them using the Join operation (taking the maximum of each counter component). To prevent temporary overspend, implement Token Bucket algorithms locally with conservative caps. If the sum of all local spends approaches the global limit, nodes enter a "thrift mode" where they query the regional coordinator for the remaining 5% of budget. This ensures that while temporary overspend of 1-2% is theoretically possible during partition, the system never exceeds 105% of budget, which is acceptable for digital advertising SLAs, unlike traditional banking systems.
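The conservative local cap described above can be approximated with a token bucket whose burst size bounds how far a partitioned node can drift before the next gossip merge. This is a sketch under assumed parameters; the thrift-mode signal is simply the `False` return:

```python
import time

class PacingBucket:
    """Token bucket with a conservative cap: tokens refill at the
    campaign's target spend rate, and the bucket capacity limits how
    far an isolated node can overshoot between gossip rounds."""
    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = burst
        self.last = time.monotonic()

    def try_spend(self, cost: float) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should enter "thrift mode" and query regionally
```

Sizing `burst` so that the sum of all edge bucket capacities stays within the 1-2% overspend tolerance is what turns the theoretical CRDT convergence bound into a contractual guarantee.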

How do you ensure exactly-once billing when impression tracking pixels fire from user browsers but network failures prevent acknowledgement delivery to the auction server?

Candidates often propose Kafka idempotency or database upserts, missing the end-to-end problem. The solution requires Idempotent Keys generated at the edge using UUIDv7 (time-ordered) embedded in the creative markup. When the browser fires the impression pixel, it includes this key. The edge Nginx layer writes to Apache Pulsar with Deduplication Enabled using a 24-hour window. Pulsar's BookKeeper storage guarantees that duplicate writes with the same key are discarded at the broker level, not the consumer level. Additionally, implement At-Least-Once delivery to a BigQuery staging table partitioned by the idempotent key, with MERGE statements that deduplicate during the ETL process. This dual-layer protection ensures that even if the browser retries the pixel fire 50 times due to 500 errors, the advertiser is charged exactly once.
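A toy version of the two ingredients, a time-ordered idempotent key plus a broker-side dedup window, might look like the following. The key format only approximates UUIDv7's millisecond-timestamp prefix, and the window is an in-memory stand-in for what Pulsar does per producer at the broker:

```python
import os, time

def idempotent_key() -> str:
    """Hypothetical UUIDv7-style key: 48-bit millisecond timestamp
    followed by random bits, so keys sort by creation time."""
    ms = int(time.time() * 1000)
    return f"{ms:012x}-{os.urandom(10).hex()}"

class DedupWindow:
    """Toy broker-side deduplication: remember keys seen within the
    window and drop repeats before they reach any consumer."""
    def __init__(self, window_s: float):
        self.window_s = window_s
        self.seen = {}  # key -> first-seen timestamp

    def accept(self, key: str, now: float) -> bool:
        # Evict expired keys, then admit only unseen ones.
        self.seen = {k: t for k, t in self.seen.items()
                     if now - t < self.window_s}
        if key in self.seen:
            return False
        self.seen[key] = now
        return True

w = DedupWindow(24 * 3600)          # 24-hour window, as in the text
k = idempotent_key()
assert w.accept(k, 0.0)             # first pixel fire is billed
assert all(not w.accept(k, i) for i in range(1, 50))  # 49 retries dropped
```

The BigQuery MERGE stage then catches anything that slips past the window boundary, which is why the text calls it dual-layer protection.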

How do you handle clock skew between geographically distributed bidders when determining auction winners based on time-to-response?

This is subtle. Candidates often suggest NTP or TrueTime (from Spanner), but these add latency. The correct architecture eliminates wall-clock dependencies from the auction logic. Instead of comparing timestamps from DSP responses, use Logical Clocks (Lamport Timestamps) or simply FIFO queues at the edge. When the auction starts, the edge node starts a High-Resolution Timer (Performance.now() in V8 or C++ chrono). DSP responses are ranked by arrival order at the network interface, not by timestamp headers. To handle stragglers, implement a Tunable Timeout using Adaptive Timeout Algorithms that adjust based on historical latency per DSP. For post-hoc analysis and billing disputes, record the Monotonic Clock reading and the UTC Timestamp with uncertainty intervals, storing them in CockroachDB, which handles uncertainty windows automatically. This ensures fairness even when one DSP's server clock is 200ms ahead of another's.
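The per-DSP adaptive timeout can be sketched as a smoothed latency estimate with headroom, capped at the auction SLA. The constants below are assumptions for illustration, not tuned production values:

```python
class AdaptiveTimeout:
    """Per-DSP timeout driven by an exponentially weighted moving
    average of observed response latency; no wall clock involved."""
    def __init__(self, initial_ms: float = 80.0, alpha: float = 0.1,
                 headroom: float = 1.5, cap_ms: float = 95.0):
        self.est_ms = initial_ms   # smoothed latency estimate
        self.alpha = alpha         # EWMA weight for new samples
        self.headroom = headroom   # multiplier approximating the tail
        self.cap_ms = cap_ms       # never wait past the auction SLA

    def observe(self, sample_ms: float):
        # Fold each monotonic-timer measurement into the estimate.
        self.est_ms += self.alpha * (sample_ms - self.est_ms)

    def timeout_ms(self) -> float:
        return min(self.est_ms * self.headroom, self.cap_ms)
```

A consistently slow DSP drags its own timeout upward until the SLA cap bites, while a fast DSP earns a tight cutoff, so stragglers cannot hold the whole auction hostage.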