Difference-in-Differences (DiD) 🖖
Our Example 🏙️
- AI/LLM training program rolls out over time:
- City A gets training in 2023
- City B gets training in 2026
- Idea: Use City B as a control group until it also gets training
- Causal effect:
- [Change in mean wage, \(t_1 - t_0\), in City A] − [Change in mean wage, \(t_1 - t_0\), in City B]
- The difference in differences (DiD)
2×2 DiD Table 🧮
|  | Before (\(t_0\)) | After (\(t_1\)) |
|---|---|---|
| City B (Control, gets training later) | \(\bar Y_{B0}\) | \(\bar Y_{B1}\) |
| City A (Treated in 2023) | \(\bar Y_{A0}\) | \(\bar Y_{A1}\) |
DiD estimand:
\[
\tau = (\bar Y_{A1} - \bar Y_{A0}) - (\bar Y_{B1} - \bar Y_{B0})
\]
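The estimand above is just arithmetic on the four cell means. A minimal sketch with made-up wage values (illustrative only, not real data):

```python
# 2x2 DiD computation; the four cell means are hypothetical.
y_A0, y_A1 = 50.0, 58.0  # City A (treated): mean wage before / after
y_B0, y_B1 = 48.0, 51.0  # City B (control): mean wage before / after

# Treated change (8) minus control change (3)
tau = (y_A1 - y_A0) - (y_B1 - y_B0)
print(tau)  # 5.0
```

The control group's change (3) stands in for what would have happened to City A without training; subtracting it removes the common trend.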
DiD 📊
Regression Version ✏️
\[
Y_{it} = \beta_0 + \beta_1 \, Post_t + \beta_2 \, Treat_i
+ \beta_3 (Post_t \times Treat_i) + \varepsilon_{it}
\]
Where:
- \(Post_t\): indicator for time after treatment
- \(Treat_i\): indicator for treated group
- Interaction term: \(Post_t \times Treat_i\)
- \(\beta_0\): control baseline
- \(\beta_1\): control time trend
- \(\beta_2\): baseline difference
- \(\beta_3\): extra change in treated → DiD effect
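The regression can be run with any OLS routine. A sketch on simulated data where the true effect is 5 (variable names and parameter values are illustrative, chosen so we can check that \(\beta_3\) recovers the truth):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a 2x2 DiD setting with a known treatment effect of 5.
rng = np.random.default_rng(0)
n = 2000
treat = rng.integers(0, 2, n)            # 1 = treated city
post = rng.integers(0, 2, n)             # 1 = after rollout
wage = (40 + 3 * post + 2 * treat        # baseline, common trend, level gap
        + 5 * post * treat               # true DiD effect
        + rng.normal(0, 1, n))
df = pd.DataFrame({"wage": wage, "post": post, "treat": treat})

# "post * treat" expands to post + treat + post:treat
fit = smf.ols("wage ~ post * treat", data=df).fit()
print(fit.params["post:treat"])  # beta_3, close to 5
```

The interaction coefficient is the DiD estimate; the main effects absorb the common trend (\(\beta_1\)) and the level gap (\(\beta_2\)).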
Core Assumption = Parallel Trends 📉
For DiD to identify a causal effect:
- Outcomes in City A and City B would have followed parallel trends if no training happened
- What we look for:
- Parallel trends before treatment 📉
- Different levels are fine; DiD differences out level gaps
- Clear “bend” in the treated group’s trend right after training 🚀
- The counterfactual itself can’t be tested 🤦♀️ — pre-treatment trends are only suggestive
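One common diagnostic is a placebo test: rerun the DiD regression on pre-treatment periods only, with a fake treatment date. A near-zero interaction supports (but cannot prove) parallel trends. A sketch on simulated pre-period data (all values illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Four pre-treatment periods; both groups share the same trend by
# construction, so the placebo interaction should be near zero.
rng = np.random.default_rng(1)
periods = np.repeat(np.arange(-4, 0), 500)
treat = rng.integers(0, 2, len(periods))
wage = 40 + 1.0 * periods + 2 * treat + rng.normal(0, 1, len(periods))
df = pd.DataFrame({"wage": wage, "treat": treat})
df["fake_post"] = (periods >= -2).astype(int)  # placebo cutoff mid pre-period

fit = smf.ols("wage ~ fake_post * treat", data=df).fit()
print(fit.params["fake_post:treat"])  # near zero
```

A large placebo coefficient would be a warning sign that the groups were already diverging before treatment.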
DiD: Things to Watch Out For 👀
Other shocks at the same time?
- New labor laws? Tech-sector boom? Migration changes?
- Must document and test.
Anticipation effects?
- Did workers change behavior before the rollout because they expected it?
Composition changes?
- Did different kinds of workers move into or out of treated vs. control cities because of the training rollout?
Spillovers?
- Did nearby cities feel the effect (e.g., commuting, labor market linkages)?
🎮 Pokémon GO & Physical Activity
Study: Howe et al. (2016), BMJ (https://doi.org/10.1136/bmj.i6270)
Question: Did Pokémon GO increase physical activity among young adults?
Treatment: People who started playing Pokémon GO
Control: Similar individuals who did not start playing Pokémon GO
Outcome: Daily step count (from smartphones / wearables)
Design: Difference-in-Differences
📱 Data & Measurement
- Participants recruited on MTurk, ages 18–35, all in the US
- Everyone used an iPhone 6 (standardized step tracking)
- Players uploaded:
- Screenshot showing Pokémon GO installation date
- Screenshots of their daily step counts
- Time window:
- 4 weeks before installation
- 6 weeks after installation
- Non-players: steps before/after the median player installation date
- Analysis: DiD-style regression adjusting for
- Time-invariant differences between players and non-players
- Week-to-week fluctuations affecting everyone
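The adjustments described above amount to a two-way fixed-effects regression: person dummies absorb time-invariant differences between players and non-players, week dummies absorb shocks common to everyone. A sketch on simulated data (the 900-step effect and all other values are made up, not from Howe et al. 2016):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated panel: 100 people observed over 10 weeks.
rng = np.random.default_rng(2)
n_people, n_weeks = 100, 10
person = np.repeat(np.arange(n_people), n_weeks)
week = np.tile(np.arange(n_weeks), n_people)
player = (person < 50).astype(int)           # first half install the game
post = (week >= 4).astype(int)               # weeks after installation
steps = (8000 + 200 * player + 100 * week    # level gap and common trend
         + 900 * player * post               # true effect on daily steps
         + rng.normal(0, 300, len(person)))
df = pd.DataFrame({"steps": steps, "person": person, "week": week,
                   "player_post": player * post})

# Person and week fixed effects absorb the main effects of player and post.
fit = smf.ols("steps ~ C(person) + C(week) + player_post", data=df).fit()
print(fit.params["player_post"])  # close to 900
```

Only the player-after-installation interaction remains identified once both sets of fixed effects are included; that coefficient is the DiD estimate.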
📊 Results
🎯 What We’ve Learned So Far
We now understand why randomization works:
it breaks the link between confounders and treatment,
giving us clean, unbiased comparisons.
And we’ve seen that DiD works when treatment timing is
as good as random with respect to outcome trends.
The key lesson:
Exogenous variation, whether created by us (RCTs) or found in the world (DiD), gives us causal leverage.
🧭 Other approaches
This idea shows up again and again in causal inference:
- ✂️ Regression Discontinuity:
- Treatment jumps at a cutoff (age, score, date)
- People just above and below the cutoff are as-if randomly assigned
- 🎺 Instrumental Variables:
- Find a variable that changes treatment for reasons unrelated to the outcome
- For example:
- Distance to training center affecting enrollment but not wages directly 🏃♀️
- Cigarette taxes affecting smoking but not lung cancer directly 🚬
✨ The theme
Across DiD, RD, and IV the strategy is:
Find a source of variation in treatment that behaves like randomization.
Next: Matching
A method to improve comparability when we cannot find such natural experiments—
but not a causal identification strategy by itself.