
What statistical approach will isolate the causal effect of implementing a surge pricing algorithm on demand elasticity and the balance of supply and demand in a two-sided marketplace, when price is an endogenous variable correlated with latent demand, and geographic randomization is not possible due to network effects between market sides within a single region?


Answer to the Question

The problem goes back to the classic econometric issue of price endogeneity: observed market data reflect the equilibrium of supply and demand rather than a pure response to price. Estimating elasticity with a plain OLS regression produces biased estimates, because high prices are observed precisely when demand is high, and the resulting positive correlation masks the negative elasticity. Modern product analytics relies on Causal Inference approaches developed in labor and education economics and adapted for digital two-sided markets such as Uber, Airbnb, or Delivery Hero.

The issue is that direct A/B testing of prices breaks the consistency of the user experience and creates arbitrage opportunities (users migrate to the control group). There is also a feedback loop: price affects the behavior of suppliers, who redistribute among regions and thereby change the underlying market equilibrium. A naive difference in means between surge and non-surge periods gives a biased estimate, since high-demand conditions (holidays, weather) simultaneously affect both price and willingness to pay.

The optimal solution combines Regression Discontinuity Design (RDD) based on algorithmic activation thresholds with the Instrumental Variables (IV) approach. RDD exploits the fact that observations just below and just above an activation threshold (e.g., a 1.2x multiplier that switches on at 85% occupancy) are nearly identical in everything except price, so treatment assignment near the threshold is as good as random. To strengthen validity, two-stage least squares (2SLS) is used, with exogenous shocks (unpredictable weather, sporting events) serving as instruments that shift price but are assumed to be uncorrelated with individual user preferences (the exclusion restriction). Additionally, the Synthetic Control Method constructs a counterfactual region as a weighted combination of neighborhoods where the algorithm was not rolled out.

Real-Life Situation

The case involved a large ready-to-eat food delivery service planning to introduce dynamic pricing during peak hours to balance order demand against courier supply. The key metric, fulfillment rate, was dropping to 70% in the evening hours, leading to user churn. The product team hypothesized that raising prices during peak hours would reduce demand and attract more couriers through higher pay, but demand elasticity had to be quantified without disrupting the user experience in the test city.

The first option considered was geographic A/B testing by splitting neighboring cities into control and test groups. Pros: pure counterfactual, ease of interpretation, no cross-contamination within the city. Cons: fundamental differences in demand structure between cities (different restaurant densities, different income levels), migration of couriers between cities (violates SUTVA), and inability to scale results to the target metropolis with unique traffic.

The second option was an interrupted time series analysis comparing pre- and post-implementation periods. Pros: working with the entire audience of one city, accounting for seasonality through CausalImpact. Cons: inability to separate the effect of pricing from market growth trends, influence of concurrently conducted marketing campaigns, changing competitive dynamics over the observation period.

The third option was Regression Discontinuity Design using the internal algorithmic threshold for activating the surge multiplier (e.g., the price jumps at 80% courier occupancy). Pros: local randomness around the threshold (users just above and just below it are comparable), isolation of the pure price effect from the general level of demand, operating within a single city without external control groups. Cons: the estimate is a Local Average Treatment Effect (LATE) that is valid only for marginal users around the threshold, large samples are required for power, and the design is sensitive to any manipulation or sorting of observations around the threshold.
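The RDD idea can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the 80% cutoff, the 5-point bandwidth, the simulated conversion process, and the names `simulate` and `rdd_estimate`. It fits a separate straight line on each side of the cutoff and reads the jump at the threshold, then contrasts it with a naive comparison of all surge versus non-surge slots.

```python
import random
import statistics

THRESHOLD = 0.80   # hypothetical occupancy cutoff where surge switches on
BANDWIDTH = 0.05   # hypothetical window around the cutoff
TRUE_JUMP = -0.06  # simulated conversion drop caused by the surge price


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def simulate(n=40000, seed=1):
    """Simulated time slots: conversion falls smoothly with occupancy
    (longer waits) and drops discretely once surge pricing is active."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        occ = rng.uniform(0.6, 1.0)
        surge = occ >= THRESHOLD
        conv = 0.50 - 0.40 * (occ - THRESHOLD) + rng.gauss(0, 0.05)
        if surge:
            conv += TRUE_JUMP
        rows.append((occ, conv))
    return rows


def rdd_estimate(rows, threshold, bandwidth):
    """Local linear RDD: difference of the two fitted intercepts at the cutoff."""
    left = [(o - threshold, c) for o, c in rows
            if threshold - bandwidth <= o < threshold]
    right = [(o - threshold, c) for o, c in rows
             if threshold <= o <= threshold + bandwidth]
    a_left, _ = fit_line([d for d, _ in left], [c for _, c in left])
    a_right, _ = fit_line([d for d, _ in right], [c for _, c in right])
    return a_right - a_left


rows = simulate()
rdd_jump = rdd_estimate(rows, THRESHOLD, BANDWIDTH)

# Naive comparison of all surge vs. non-surge slots mixes the price effect
# with the smooth occupancy trend.
above = [c for o, c in rows if o >= THRESHOLD]
below = [c for o, c in rows if o < THRESHOLD]
naive_gap = statistics.fmean(above) - statistics.fmean(below)

print(f"RDD jump at threshold: {rdd_jump:.3f}")
print(f"Naive surge vs. non-surge gap: {naive_gap:.3f}")
```

In this simulation the naive gap is larger in magnitude than the jump at the cutoff because it also absorbs the occupancy trend. In practice the bandwidth choice and a density check around the threshold would need separate attention.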

A combined solution was chosen: RDD based on the occupancy threshold, validated through Instrumental Variables (unexpected weather events as a price instrument) and Synthetic Control at the micro-district level of the city. This made it possible to isolate the price effect from the waiting-time effect (waiting time also increases with occupancy). As a result, demand elasticity was estimated at -0.8 (moderately inelastic: a 10% price increase reduces orders by roughly 8%), while the market-balancing effect appeared only at a multiplier of 1.5x or more. This made it possible to optimize the activation thresholds and raise the fulfillment rate to 89% without a significant loss of GMV.

What Candidates Often Miss

How do you separate true price-induced demand shifts from the waiting-time effect (delay cost) when a price increase correlates with longer delivery times?

The answer requires decomposition of the overall effect through mediation analysis or use of IV with two instruments: one affects only price (algorithmic threshold), another affects only waiting time (external traffic incidents). Junior analysts often conflate these effects, overestimating price elasticity. It is necessary to build a structural model where price and waiting time are endogenous regressors, and demand is a result of their interaction. Without this, businesses make pricing decisions without understanding that part of the conversion drop is caused not by price but by unsatisfactory service (waiting time).
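A minimal sketch of the two-instrument idea follows, on simulated data. All variable names, coefficients, and the data-generating process are assumptions for illustration: `thr` stands in for threshold-driven price variation, `traffic` for external traffic incidents, and `rain` for a shock that moves both price and waiting time. Price and waiting time are both endogenous, and the just-identified IV system is solved jointly.

```python
import random
import statistics

BETA_PRICE = -0.8  # simulated price elasticity
BETA_WAIT = -0.5   # simulated waiting-time effect


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def iv_two_regressors(z1, z2, x1, x2, y):
    """Just-identified IV with two endogenous regressors:
    solves cov(Z, X) * beta = cov(Z, y) by Cramer's rule."""
    a11, a12 = cov(z1, x1), cov(z1, x2)
    a21, a22 = cov(z2, x1), cov(z2, x2)
    b1, b2 = cov(z1, y), cov(z2, y)
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def simulate(n=20000, seed=2):
    rng = random.Random(seed)
    data = {k: [] for k in ("thr", "traffic", "rain", "price", "wait", "qty")}
    for _ in range(n):
        thr = rng.gauss(0, 1)      # moves price only (assumption)
        traffic = rng.gauss(0, 1)  # moves waiting time only (assumption)
        rain = rng.gauss(0, 1)     # moves both price and waiting time
        d = rng.gauss(0, 1)        # latent demand, unobserved
        price = 0.6 * thr + 0.4 * rain + 0.5 * d + rng.gauss(0, 0.3)
        wait = 0.6 * traffic + 0.4 * rain + 0.5 * d + rng.gauss(0, 0.3)
        qty = BETA_PRICE * price + BETA_WAIT * wait + d + rng.gauss(0, 0.3)
        for key, val in zip(data, (thr, traffic, rain, price, wait, qty)):
            data[key].append(val)
    return data


data = simulate()

# Joint estimate: one instrument per endogenous regressor.
beta_price, beta_wait = iv_two_regressors(
    data["thr"], data["traffic"], data["price"], data["wait"], data["qty"]
)

# Conflated estimate: a single instrument that shifts price AND waiting time,
# with waiting time left out of the model.
conflated = cov(data["rain"], data["qty"]) / cov(data["rain"], data["price"])

print(f"price effect (joint IV): {beta_price:.2f}")
print(f"waiting-time effect (joint IV): {beta_wait:.2f}")
print(f"price effect when waiting time is ignored: {conflated:.2f}")
```

The conflated estimate loads the waiting-time effect onto price, which is the overestimation described above. Whether a real algorithmic threshold moves price without moving waiting time is an assumption that has to be argued for the specific marketplace.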

Why does standard elasticity estimation using log-log regression yield biased results in two-sided markets, and how can this be corrected?

In two-sided markets, there is simultaneity bias: price affects demand, but demand also affects price through the surge algorithm. OLS estimates are therefore inconsistent: the bias does not vanish as the sample grows. The correct approach is Two-Stage Least Squares (2SLS): in the first stage, price is predicted from exogenous shocks (weather, events), and in the second stage, the predicted values are used to estimate elasticity. Candidates often skip checking instrument relevance (first-stage F-statistic > 10) and justifying instrument validity (the exclusion restriction), which leads to invalid causal conclusions.
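The contrast between OLS and 2SLS can be sketched on simulated data. The data-generating process, the coefficients, and the helper names are assumptions for illustration. In particular, the simulated `weather` shock moves price but not demand, which is exactly the exclusion restriction that has to be defended on real data.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def first_stage_f(z, x):
    """First-stage F-statistic; with one instrument it is the squared t-stat."""
    a, b = fit_line(z, x)
    mz = statistics.fmean(z)
    szz = sum((zi - mz) ** 2 for zi in z)
    rss = sum((xi - a - b * zi) ** 2 for zi, xi in zip(z, x))
    se_b = math.sqrt(rss / (len(z) - 2) / szz)
    return (b / se_b) ** 2


def simulate(n=20000, true_elasticity=-0.8, seed=0):
    """Latent demand pushes both the surge price and the order volume up."""
    rng = random.Random(seed)
    weather, log_price, log_qty = [], [], []
    for _ in range(n):
        w = rng.gauss(0, 1)  # exogenous shock, assumed to move price only
        d = rng.gauss(0, 1)  # latent demand, unobserved by the analyst
        lp = 0.5 * w + 0.7 * d + rng.gauss(0, 0.3)               # surge rule
        lq = true_elasticity * lp + 1.0 * d + rng.gauss(0, 0.3)  # demand curve
        weather.append(w)
        log_price.append(lp)
        log_qty.append(lq)
    return weather, log_price, log_qty


weather, log_price, log_qty = simulate()

# Naive log-log OLS: contaminated by latent demand.
_, ols_elasticity = fit_line(log_price, log_qty)

# Stage 1: predict price from the instrument.
a1, b1 = fit_line(weather, log_price)
price_hat = [a1 + b1 * w for w in weather]

# Stage 2: regress quantity on the predicted price.
_, tsls_elasticity = fit_line(price_hat, log_qty)

f_stat = first_stage_f(weather, log_price)

print(f"OLS elasticity:  {ols_elasticity:.2f}")
print(f"2SLS elasticity: {tsls_elasticity:.2f}")
print(f"first-stage F:   {f_stat:.0f}")
```

In this simulation OLS lands near zero while 2SLS recovers a value close to the simulated -0.8. Note that standard errors taken from a manually run second stage are not valid; a dedicated IV routine would be needed for inference.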

How to account for network effects (cross-side network effects) between buyers and suppliers when assessing the causal effect of pricing?

Increasing prices attracts more couriers (a positive effect on supply), which reduces waiting times and may offset part of the negative price effect on demand. This is a general equilibrium effect that a partial equilibrium analysis cannot capture. It requires either a structural model of the two-sided market or a bipartite graph analysis that traces the migration of suppliers between zones. Without this, analysts may mistakenly reject effective pricing strategies because they do not see the compensating effect of improved service quality through shorter delivery times.
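One way to write the offsetting mechanism down is a simple decomposition. The notation is an assumption introduced here: demand $Q(P, W)$ depends on price $P$ and waiting time $W$, and waiting time itself responds to price through courier supply.

```latex
\frac{d \ln Q}{d \ln P}
  = \underbrace{\frac{\partial \ln Q}{\partial \ln P}}_{\text{direct price effect}}
  + \underbrace{\frac{\partial \ln Q}{\partial \ln W} \cdot \frac{d \ln W}{d \ln P}}_{\text{indirect effect via waiting time}}
```

Both factors in the second term are negative (longer waits reduce demand, and higher prices shorten waits by attracting couriers), so their product is positive and partly offsets the direct effect. As a purely hypothetical illustration, a direct elasticity of $-0.8$ with $\partial \ln Q / \partial \ln W = -0.5$ and $d \ln W / d \ln P = -0.4$ gives a total of $-0.8 + (-0.5)(-0.4) = -0.6$.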