Historically, page load speed was treated purely as an engineering metric, but with the incorporation of Core Web Vitals into search ranking algorithms and the growth of mobile traffic, it became clear that performance is a product feature. Classic approaches to measuring the impact of speed run into fundamental endogeneity: users with fast devices and stable connections convert better regardless of how optimized the site is, which creates a spurious correlation.
The problem is compounded by edge computing and modern CDN architectures, where consistent assignment of traffic to experiment groups cannot be guaranteed because of aggressive caching on edge servers. There is also a self-selection effect: users on slow connections often leave the site before it finishes loading, distorting the sample distribution and making a clean A/B comparison impossible.
The optimal solution combines Regression Discontinuity Design (RDD) at the threshold of "good" performance (e.g., LCP = 2.5 seconds) with instrumental variables (IV). Geographic proximity to the nearest edge server, or connection type (3G vs 4G), serves as the instrument: it shifts load speed quasi-randomly but has no direct path to purchase intent (the exclusion restriction). For cohort analysis, the synthetic control method is used: the control group is constructed from historical data of users with a similar mix of devices and geolocations, which isolates the pure effect of the optimization from seasonality and macro trends.
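A minimal simulation sketch of the IV logic (the variable names, coefficients, and the simple Wald/ratio estimator standing in for full 2SLS are all illustrative assumptions, not the project's actual model): an unobserved wealth confounder drives both LCP and conversion, while distance to the edge server shifts LCP without touching purchase intent, so the instrument recovers the true effect where naive OLS does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers illustrative)
wealth = rng.normal(0, 1, n)            # unobserved confounder (SES)
dist = rng.uniform(0, 1, n)             # instrument: distance to edge server
lcp = 3.0 + 1.5 * dist - 0.8 * wealth + rng.normal(0, 0.3, n)
# True causal effect of +1 s of LCP on conversion propensity: -0.05
conv = 0.4 - 0.05 * lcp + 0.10 * wealth + rng.normal(0, 0.1, n)

# Naive OLS slope: biased, because wealth raises conversion AND lowers LCP
ols_slope = np.cov(lcp, conv)[0, 1] / np.var(lcp)

# Wald/IV estimator Cov(z, y) / Cov(z, x): consistent for the true effect,
# since dist moves lcp but is independent of wealth
iv_slope = np.cov(dist, conv)[0, 1] / np.cov(dist, lcp)[0, 1]

print(f"OLS: {ols_slope:.3f}  IV: {iv_slope:.3f}  truth: -0.050")
```

The OLS slope comes out noticeably more negative than -0.05 because it absorbs the wealth channel, while the IV slope lands near the truth. In practice the exclusion restriction (distance affects purchases only through speed) must be argued, not assumed.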
In a large e-commerce project, the frontend team overhauled performance: they migrated images to modern formats (WebP, AVIF) with lazy loading and optimized the critical rendering path, reducing LCP from 4.2 seconds to 1.8 seconds for users with good connections. The product team recorded a 12% increase in conversion in the post-release slice, but doubts arose about causality: a seasonal advertising campaign had launched at the same time, and the product catalog had been updated.
Option 1: Naive Cohort Comparison Before and After
Analysts proposed comparing user conversion in the week before the optimization to the week after, stratified by region. Pros: simple to implement, with no need for complex infrastructure. Cons: it completely ignores seasonality (a pre-holiday week) and differences in audience composition (new users arrived from the ad campaign with different intent), and it suffers from survivorship bias: slow users "vanished" from the post-release sample, creating an illusion of growth.
Option 2: Correlational Analysis of Speed vs Conversion
The second approach was to fit a regression with the user's actual LCP as the independent variable and conversion as the dependent variable. Pros: it uses all available data at session-level granularity. Cons: fatal endogeneity. Users with expensive flagship devices and fast internet are already wealthier and more motivated to buy, while users with cheap devices on 3G have low purchase intent regardless of site speed, which biases the estimate upward by 40-60%.
Option 3: Regression Discontinuity Design with a Geographical Instrument
The team chose a hybrid method: distance to the nearest edge server served as the instrumental variable, since it correlates with speed but not with purchasing behavior. Users at the edge of coverage (where LCP sharply degrades to 2.6-2.8 seconds) formed a locally random sample around the 2.5-second threshold. Applying a Local Average Treatment Effect (LATE) estimate within a ±0.3-second window around the threshold, they measured the pure effect of the speed improvement for compliers (users whose speed changed because of infrastructure rather than their device).
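The local estimation step can be sketched under invented numbers (a true +3 p.p. conversion jump at the 2.5 s cutoff, a uniform kernel, and a local linear fit on each side); this is a schematic of sharp RDD, not the team's actual fuzzy RDD+IV pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical DGP: conversion declines smoothly in LCP, plus a true
# +0.03 jump in propensity for sessions under the 2.5 s "good" threshold
lcp = rng.uniform(1.5, 3.5, n)
fast = lcp < 2.5
conv = rng.binomial(1, 0.10 - 0.02 * lcp + 0.03 * fast)

# Local linear RDD in a ±0.3 s window: the intercept shift at the cutoff
# estimates the LATE, with slopes allowed to differ on each side
w = np.abs(lcp - 2.5) < 0.3
run = lcp[w] - 2.5                        # centered running variable
X = np.column_stack([
    np.ones(w.sum()),
    fast[w].astype(float),                # discontinuity dummy -> LATE
    run,
    run * fast[w],                        # separate slope across the cutoff
])
beta, *_ = np.linalg.lstsq(X, conv[w].astype(float), rcond=None)
late = beta[1]
print(f"estimated LATE at the cutoff: {late:.3f} (true jump: 0.030)")
```

In the real fuzzy design the treatment dummy would be instrumented by edge-server geography rather than read directly from observed LCP.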
Chosen Solution and Result
The RDD+IV approach was implemented with additional filtering of returning users via a localStorage check for previously cached resources. The final estimate showed a true incremental effect of +8.5% for new users and +3.2% for returning users (for whom the novelty effect is weaker), which justified the investment in edge computing infrastructure at a 340% ROI over the year.
Why does standard OLS regression of performance vs conversion yield biased estimates, and what mechanism of endogeneity dominates here?
The answer lies in double selection bias. First, users with slow devices systematically fall out of the sample of "successful sessions" (they abandon before the page loads), creating truncation bias; second, internet speed correlates with socio-economic status and geography, which directly affect purchasing power. Without instrumental variables or RDD, the regression conflates the effect of "fast internet as a marker of wealth" with the effect of "a fast site as a conversion trigger," overestimating the true causal effect by a factor of 1.5-2.
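Both channels can be reproduced in a toy simulation (every coefficient below is an illustrative assumption): wealth moves speed and conversion together, and the slowest sessions are truncated before the analytics beacon fires; the naive slope then overstates the true causal slope severalfold.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300_000

# Channel 1 (SES confounding): wealthier users load faster AND convert more
wealth = rng.normal(0, 1, n)
lcp = np.clip(3.5 - 0.9 * wealth + rng.normal(0, 0.4, n), 0.5, 8.0)
true_slope = -0.03                        # causal effect per second of LCP
conv = rng.binomial(1, np.clip(0.25 + true_slope * lcp + 0.05 * wealth, 0, 1))

# Channel 2 (truncation): the slowest sessions abandon before being recorded
observed = lcp < rng.normal(4.5, 0.5, n)

x, y = lcp[observed], conv[observed].astype(float)
naive_slope = np.cov(x, y)[0, 1] / np.var(x)
bias_ratio = naive_slope / true_slope     # > 1 means overestimation
```

With these (invented) coefficients the naive slope comes out well more than 1.5 times the true slope, matching the direction of bias described above.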
How does client-side caching and return visits distort the evaluation of optimization effect in longitudinal analysis, and what method allows filtering out treatment contamination?
Returning visitors who saw the site before the optimization still have the old, heavyweight resources in their HTTP cache or Service Worker, so for them the "treatment" (the new fast version) is partially or entirely not applied, contaminating the treatment and control groups. Candidates often forget to check If-None-Match headers or to analyze first-party cookies carrying the timestamp of the first visit. The correct approach is intent-to-treat (ITT) analysis, splitting traffic into "clean new sessions" (new users plus cleared caches) versus "contaminated returning" sessions, or difference-in-differences (DiD) with user fixed effects, which isolates within-user change from between-user selection.
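A toy panel illustrating both the contamination and the fix (the 0.5 "warm cache" exposure share, the group proportions, and all effect sizes are invented): a seasonal shock hits everyone, returning users with cached assets receive only half the treatment, and differencing within users removes both the stable user effect and the season.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

user_effect = rng.normal(0.10, 0.03, n)   # between-user selection (intent, SES)
season = 0.02                             # common shock: ads, pre-holiday week
true_lift = 0.04                          # causal effect of the full treatment
# Clean new sessions get full exposure; warm-cache returners get half
exposure = rng.choice([1.0, 0.5], n, p=[0.7, 0.3])

pre = user_effect + rng.normal(0, 0.01, n)
post = user_effect + season + true_lift * exposure + rng.normal(0, 0.01, n)

# Naive before/after absorbs the seasonal shock into the estimate
naive = post.mean() - pre.mean()

# DiD with user fixed effects: the within-user change of the fully exposed
# group minus that of the partially exposed group cancels user_effect
# and season alike...
clean = exposure == 1.0
did = (post[clean] - pre[clean]).mean() - (post[~clean] - pre[~clean]).mean()
# ...then rescale for the control group's 0.5 residual exposure
adjusted = did / (1.0 - 0.5)
```

The naive estimate exceeds the true lift (it bundles in the season), while the rescaled within-user DiD recovers it; without the rescaling step, the contaminated control would silently shrink the measured effect.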
What is the difference between ITT analysis (Intent-to-Treat) and TOT analysis (Treatment-on-the-Treated) when assessing the effect of Core Web Vitals, and why is it critical to report ITT for product metrics when planning scaling?
ITT measures the effect across the entire population, including those who did not receive the speed improvement (e.g., users on 2G or with JavaScript disabled), while TOT (or LATE in the IV context) measures the effect only for compliers, those who actually benefited from the optimization. Candidates often mistakenly report the TOT estimate to the business (+15% conversion among those who would benefit from fast loading), but when the optimization is rolled out to 100% of traffic, the real effect will be closer to ITT (+6-8%), because part of the audience technically cannot receive the improvement (outdated devices, slow networks). For business planning and revenue forecasting, it is critical to use the conservative ITT estimate to avoid overcommitting.
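The scaling arithmetic is simply ITT ≈ TOT × compliance share; plugging in hypothetical numbers consistent with the ranges quoted above:

```python
# Back-of-envelope ITT from a TOT estimate (numbers illustrative)
tot = 0.15           # +15% lift among compliers (TOT / LATE)
compliance = 0.45    # assumed share of traffic able to receive the speedup
itt = tot * compliance
print(f"forecast for a 100% rollout: ITT = {itt:.2%}")
```

A +15% TOT with roughly 40-55% compliance lands in the +6-8% ITT range, and that conservative figure is the one that belongs in the revenue forecast.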