1  Chapter 3: The Classical DID Design

1.1 Overview

Dataset: Moser & Voena (2012)moser_voena_didtextbook.dta

The Trading with the Enemy Act (TWEA) allowed US firms to use German-owned patents. Treatment (\(D_{g,t}\)) equals 1 for the 336 subclasses that received German patents, starting in 1919. The dataset contains 289,920 observations (7,248 subclasses × 40 years).


1.2 Panel View

* ssc install panelview, replace
copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
panelview patents twea, i(subclass) t(year) type(treat) title("Treatment Status: Moser & Voena (2012)") ylabel(none) ytitle("")
graph export "figures/ch03_panelview_stata.png", replace width(1200)

Treatment Status (Stata)

library(panelView)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
png("figures/ch03_panelview_R.png", width = 1000, height = 600)
panelview(patents ~ twea, data = df, index = c("subclass", "year"), type = "treat",
          main = "Moser & Voena (2012)", ylab = "")
dev.off()

Treatment Status (R)

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib.patches import Patch

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
pv = df.pivot_table(index="subclass", columns="year", values="twea", aggfunc="first")
pv_sorted = pv.loc[pv.mean(axis=1).sort_values(ascending=False).index]
cmap = mcolors.ListedColormap(["#D4E6F1", "#00AAAA"])
fig, ax = plt.subplots(figsize=(12, 10))
ax.imshow(pv_sorted.values, aspect="auto", cmap=cmap, interpolation="nearest", vmin=0, vmax=1)
for i in range(0, len(pv_sorted), 10):
    ax.axhline(y=i - 0.5, color="white", linewidth=0.15)
ax.set_xticks(range(0, len(pv_sorted.columns), 1))
ax.set_xticklabels([int(c) for c in pv_sorted.columns], rotation=45, ha="right", fontsize=8)
ax.set_yticks([])
ax.set_xlabel("year")
ax.set_title("Treatment Status: Moser & Voena (2012)", fontsize=16)
ax.legend(handles=[Patch(facecolor="#D4E6F1", edgecolor="gray", label="Control"),
                   Patch(facecolor="#00AAAA", edgecolor="gray", label="Treated")],
          loc="lower center", bbox_to_anchor=(0.5, -0.12), ncol=2)
plt.tight_layout()
plt.savefig("figures/ch03_panelview_Python.png", dpi=150, bbox_inches="tight")
plt.show()

Treatment Status (Python)


1.3 CQ#1: Static TWFE Regression

Using the moser_voena_didtextbook dataset, run the TWFE regression of the number of patents in subclass \(g\) and year \(t\) on subclass and year FEs and the twea treatment, clustering standard errors at the subclass level.

* ssc install reghdfe, replace
copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear

* Ejemplo 1: xtreg
xtreg patents twea i.year, fe i(subclass) robust cluster(subclass)

* Ejemplo 2: reghdfe
reghdfe patents twea, absorb(subclass year) cluster(subclass)
Both commands yield the same coefficient:
                           (Std. err. adjusted for 7,248 clusters in subclass)
------------------------------------------------------------------------------
             |               Robust
     patents | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        twea |   .2882615   .0388871     7.41   0.000     .2120315    .3644915
------------------------------------------------------------------------------
library(fixest); library(haven)
options(digits=7)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))

# Ejemplo 1
summary(feols(patents ~ twea + i(year) | subclass, data = df, cluster = ~subclass))

# Ejemplo 2
summary(feols(patents ~ twea | subclass + year, data = df, cluster = ~subclass))
Ejemplo 1 (feols with i(year)):
     Estimate Std. Error  t value   Pr(>|t|)
twea 0.288262   0.038887  7.412783 1.3770e-13 ***

Ejemplo 2 (feols with subclass + year):
     Estimate Std. Error t value  Pr(>|t|)
twea 0.288262   0.038887 7.41278 1.377e-13 ***
import pandas as pd
import pyfixest as pf

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")

# Ejemplo 1
m1 = pf.feols('patents ~ twea + i(year) | subclass', data=df, vcov={'CRV1': 'subclass'})
c1 = m1.coef(); s1 = m1.se(); t1 = m1.tstat(); p1 = m1.pvalue()
for name in c1.index:
    print(f"{name:>20s} | {c1[name]:>12.7f}  {s1[name]:>10.7f}  {t1[name]:>8.2f}   {p1[name]:>8.4f}")

# Ejemplo 2
m2 = pf.feols('patents ~ twea | subclass + year', data=df, vcov={'CRV1': 'subclass'})
c2 = m2.coef(); s2 = m2.se(); t2 = m2.tstat(); p2 = m2.pvalue()
print("\n--- Ejemplo 2 ---")
for name in c2.index:
    print(f"{name:>20s} | {c2[name]:>12.7f}  {s2[name]:>10.7f}  {t2[name]:>8.2f}   {p2[name]:>8.4f}")
Ejemplo 1:
                twea |    0.2882615   0.0388871      7.41     0.0000
     C(year)[T.1901] |    0.0059327   0.0077529      0.77     0.4442
     ...

Ejemplo 2:
                twea |    0.2882615   0.0388871      7.41     0.0000

Result: The coefficient on twea is 0.28826 (\(t = 7.41\), \(p < 0.001\)) across all three languages. Compulsory licensing has a positive and significant effect on US innovation.


1.4 CQ#2: Equivalence between Static TWFE and DID

Verify that \(\hat{\beta}_{fe}\) equals the coefficient on twea in a regression on the treatment group indicator, the post indicator, and twea.

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
reg patents treatmentgroup post twea, cluster(subclass)
                             (Std. err. adjusted for 7,248 clusters in subclass)
--------------------------------------------------------------------------------
               |               Robust
       patents | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------+----------------------------------------------------------------
treatmentgroup |  -.1780874   .0338152    -5.27   0.000    -.2443751   -.1117996
          post |   .2996666   .0090614    33.07   0.000     .2819036    .3174296
          twea |   .2882615   .0388846     7.41   0.000     .2120364    .3644867
         _cons |   .3143656   .0098186    32.02   0.000     .2951183    .3336129
--------------------------------------------------------------------------------
library(haven); library(fixest)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
model2 <- feols(patents ~ treatmentgroup + post + twea,
                data = df, cluster = ~subclass)
summary(model2)
OLS estimation, Dep. Var.: patents
Observations: 289,920
Standard-errors: Clustered (subclass)
                Estimate Std. Error  t value   Pr(>|t|)
(Intercept)     0.314366   0.009819 32.01738  < 2.2e-16 ***
treatmentgroup -0.178087   0.033815 -5.26648 1.4306e-07 ***
post            0.299667   0.009061 33.07069  < 2.2e-16 ***
twea            0.288262   0.038885  7.41326 1.3721e-13 ***
import pandas as pd
import pyfixest as pf

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
m2 = pf.feols("patents ~ treatmentgroup + post + twea",
              data=df, vcov={"CRV1": "subclass"})
print(m2.summary())
| Coefficient    |   Estimate |   Std. Error |   t value |   Pr(>|t|) |    2.5% |   97.5% |
|:---------------|------------|--------------|-----------|------------|---------|---------|
| Intercept      |    0.31437 |      0.00982 |  32.01738 |      0.000 |  0.2951 |   0.334 |
| treatmentgroup |   -0.17809 |      0.03382 |  -5.26648 |      0.000 | -0.2444 |  -0.112 |
| post           |    0.29967 |      0.00906 |  33.07069 |      0.000 |  0.2819 |   0.317 |
| twea           |    0.28826 |      0.03889 |   7.41326 |      0.000 |  0.2120 |   0.365 |

Result: The coefficient on twea is identical (0.28826) in both specifications across all three languages.


1.5 CQ#3: Testing Randomized Treatment

Regress patents on the treatment group indicator, restricting the sample to years before 1919.

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
reg patents treatmentgroup if year<=1918, cluster(subclass)
                             (Std. err. adjusted for 7,248 clusters in subclass)
--------------------------------------------------------------------------------
               |               Robust
       patents | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------+----------------------------------------------------------------
treatmentgroup |  -.1780874   .0338152    -5.27   0.000     -.244375   -.1117997
         _cons |   .3143656   .0098186    32.02   0.000     .2951183    .3336128
--------------------------------------------------------------------------------
library(haven); library(fixest); library(dplyr)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
model3 <- feols(patents ~ treatmentgroup,
                data = df %>% filter(year <= 1918), cluster = ~subclass)
summary(model3)
OLS estimation, Dep. Var.: patents
Observations: 137,712
Standard-errors: Clustered (subclass)
                Estimate Std. Error  t value   Pr(>|t|)
(Intercept)     0.314366   0.009819 32.01743  < 2.2e-16 ***
treatmentgroup -0.178087   0.033815 -5.26649 1.4306e-07 ***
import pandas as pd
import pyfixest as pf

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
df_pre = df[df["year"] <= 1918].copy()
m3 = pf.feols("patents ~ treatmentgroup",
              data=df_pre, vcov={"CRV1": "subclass"})
print(m3.summary())
| Coefficient    |   Estimate |   Std. Error |   t value |   Pr(>|t|) |    2.5% |   97.5% |
|:---------------|------------|--------------|-----------|------------|---------|---------|
| Intercept      |    0.31437 |      0.00982 |  32.01743 |      0.000 |  0.2951 |   0.334 |
| treatmentgroup |   -0.17809 |      0.03382 |  -5.26649 |      0.000 | -0.2444 |  -0.112 |

Result: The coefficient is −0.17809 (\(t = -5.27\), \(p < 0.001\)). Treatment is not randomly assigned — treated subclasses had fewer patents before the TWEA.


1.6 CQ#4–7: Event-Study TWFE Regression

Estimate the event-study TWFE regression. Test whether pre-trend coefficients are jointly significant. Verify that \(\hat{\beta}_1^{fe}\) equals the DID from equation (3.7).

The event-study dummies reltimeminus1reltimeminus18 and reltimeplus1reltimeplus21 are already included in the dataset. The reference period is 1918 (the last pre-treatment year).

Now estimate the event-study TWFE regression:

Now estimate the event-study TWFE regression:

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
forvalues l = 1/18 {
    gen reltimeminus`l' = treatmentgroup * (year == 1918 - `l')
}
forvalues l = 1/21 {
    gen reltimeplus`l' = treatmentgroup * (year == 1918 + `l')
}
reg patents i.year treatmentgroup reltimeminus* reltimeplus*, cluster(subclass)

* F-test on pre-trends
test reltimeminus1 reltimeminus2 reltimeminus3 reltimeminus4 reltimeminus5 reltimeminus6 reltimeminus7 reltimeminus8 reltimeminus9 reltimeminus10 reltimeminus11 reltimeminus12 reltimeminus13 reltimeminus14 reltimeminus15 reltimeminus16 reltimeminus17 reltimeminus18
       F( 18,  7247) =    3.79
            Prob > F =    0.0000
* Verify equation (3.7) for l=1 (continues from above)
quietly summarize patents if year==1919 & treatmentgroup==1
scalar m1 = r(mean)
quietly summarize patents if year==1918 & treatmentgroup==1
scalar m2 = r(mean)
quietly summarize patents if year==1919 & treatmentgroup==0
scalar m3 = r(mean)
quietly summarize patents if year==1918 & treatmentgroup==0
scalar m4 = r(mean)
display "DID manual = " m1-m2-(m3-m4)
DID manual (eq 3.7, l=1) = .02352017
library(haven); library(fixest); library(car)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
for (l in 1:18) df[[paste0("reltimeminus", l)]] <- as.numeric(df$treatmentgroup == 1 & df$year == 1918 - l)
for (l in 1:21) df[[paste0("reltimeplus", l)]] <- as.numeric(df$treatmentgroup == 1 & df$year == 1918 + l)

event_formula <- as.formula(paste(
  "patents ~ i(year) + treatmentgroup +",
  paste(paste0("reltimeminus", 1:18), collapse = " + "), "+",
  paste(paste0("reltimeplus", 1:21), collapse = " + ")))
model4 <- feols(event_formula, data = df, cluster = ~subclass)

# F-test on pre-trends
library(car)
linearHypothesis(model4, paste0("reltimeminus", 1:18))

# DID manual
m1 <- mean(df$patents[df$year == 1919 & df$treatmentgroup == 1])
m2 <- mean(df$patents[df$year == 1918 & df$treatmentgroup == 1])
m3 <- mean(df$patents[df$year == 1919 & df$treatmentgroup == 0])
m4 <- mean(df$patents[df$year == 1918 & df$treatmentgroup == 0])
cat("DID manual:", m1 - m2 - (m3 - m4))
Chisq = 68.25, p-value = 8.915e-08

DID manual (eq 3.7, l=1): 0.02352017
reltimeplus1 coef:         0.02352017
import pandas as pd
import pyfixest as pf
import numpy as np
from scipy import stats

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
for l in range(1, 19):
    df[f"reltimeminus{l}"] = (df["treatmentgroup"] * (df["year"] == 1918 - l)).astype(float)
for l in range(1, 22):
    df[f"reltimeplus{l}"] = (df["treatmentgroup"] * (df["year"] == 1918 + l)).astype(float)

minus_vars = [f"reltimeminus{i}" for i in range(1, 19)]
plus_vars  = [f"reltimeplus{i}"  for i in range(1, 22)]
all_reltime = minus_vars + plus_vars

fml_es = "patents ~ " + " + ".join(all_reltime) + " + treatmentgroup | year"
m4 = pf.feols(fml_es, data=df, vcov={"CRV1": "subclass"})

# F-test on pre-trends
c = m4.coef()
v_mat = pd.DataFrame(m4._vcov, index=c.index, columns=c.index)
pre_names = [x for x in minus_vars if x in c.index]
beta_pre = c[pre_names].values
V_pre = v_mat.loc[pre_names, pre_names].values
F_stat = beta_pre @ np.linalg.inv(V_pre) @ beta_pre / len(pre_names)
p_val = 1 - stats.f.cdf(F_stat, len(pre_names), df["subclass"].nunique() - 1)
print(f"F-test: F({len(pre_names)}, {df['subclass'].nunique()-1}) = {F_stat:.4f}, p = {p_val:.4f}")

# DID manual
m1v = df.loc[(df["year"]==1919) & (df["treatmentgroup"]==1), "patents"].mean()
m2v = df.loc[(df["year"]==1918) & (df["treatmentgroup"]==1), "patents"].mean()
m3v = df.loc[(df["year"]==1919) & (df["treatmentgroup"]==0), "patents"].mean()
m4v = df.loc[(df["year"]==1918) & (df["treatmentgroup"]==0), "patents"].mean()
print(f"DID manual (eq 3.7, l=1): {m1v - m2v - (m3v - m4v):.8f}")
print(f"reltimeplus1 coef:        {m4.coef()['reltimeplus1']:.8f}")
F-test: F(18, 7247) = 3.7917, p = 0.0000

DID manual (eq 3.7, l=1): 0.02352017
reltimeplus1 coef:        0.02352017

Result: Pre-trends are jointly significant (\(F = 3.79\), \(p < 0.001\)). The manual DID (0.02352) matches reltimeplus1 exactly.

1.6.1 Figure 3.2: TWFE Event-Study Plot

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
reg patents i.year treatmentgroup reltimeminus* reltimeplus*, cluster(subclass)
coefplot, keep(reltimeminus* reltimeplus*) vertical yline(0) xline(18.5, lpattern(dash)) xtitle("Relative time to year before TWEA") ytitle("Effect") title("TWFE Event-study estimates") ciopts(recast(rcap) color(cranberry))
graph export "figures/ch03_fig32_es_twfe.png", replace width(1200)

Stata Event Study
library(fixest); library(haven)
options(digits=7)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
reltimeminus_vars <- paste0("reltimeminus", 1:18)
reltimeplus_vars <- paste0("reltimeplus", 1:21)
formula_str <- paste("patents ~", paste(reltimeminus_vars, collapse = " + "), "+",
    paste(reltimeplus_vars, collapse = " + "), "+ treatmentgroup | year")
model4 <- feols(as.formula(formula_str), data = df, cluster = ~subclass)
rt <- c(-(18:1), 0, 1:21)
coefs <- c(rev(coef(model4)[paste0("reltimeminus", 18:1)]), 0,
           coef(model4)[paste0("reltimeplus", 1:21)])
ses <- c(rev(se(model4)[paste0("reltimeminus", 18:1)]), 0,
         se(model4)[paste0("reltimeplus", 1:21)])
png("figures/ch03_fig32_es_twfe_R.png", width = 1000, height = 600)
plot(rt, coefs, type = "b", pch = 19, col = "navy", cex = 0.6,
     xlab = "Relative time to year before TWEA", ylab = "Effect",
     main = "TWFE Event-study estimates", ylim = range(coefs - 1.96*ses, coefs + 1.96*ses))
arrows(rt, coefs - 1.96*ses, rt, coefs + 1.96*ses,
       length = 0.03, angle = 90, code = 3, col = "firebrick")
abline(h = 0, col = "black", lwd = 0.8)
abline(v = 0, col = "black", lty = 2)
dev.off()

R Event Study
import pandas as pd
import pyfixest as pf
import matplotlib.pyplot as plt
import numpy as np

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
minus_vars = [f"reltimeminus{i}" for i in range(1, 19)]
plus_vars = [f"reltimeplus{i}" for i in range(1, 22)]
fml = "patents ~ " + " + ".join(minus_vars + plus_vars) + " + treatmentgroup | year"
m4 = pf.feols(fml, data=df, vcov={'CRV1': 'subclass'})
rt = list(range(-18, 0)) + [0] + list(range(1, 22))
c4 = [m4.coef()[f"reltimeminus{i}"] for i in range(18, 0, -1)] + [0] + \
     [m4.coef()[f"reltimeplus{i}"] for i in range(1, 22)]
s4 = [m4.se()[f"reltimeminus{i}"] for i in range(18, 0, -1)] + [0] + \
     [m4.se()[f"reltimeplus{i}"] for i in range(1, 22)]
c4, s4 = np.array(c4), np.array(s4)
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(rt, c4, 'o-', color='navy', markersize=4, linewidth=1.2)
ax.errorbar(rt, c4, yerr=1.96*s4, fmt='none', ecolor='firebrick', capsize=2, linewidth=1)
ax.axhline(0, color='black', linewidth=0.8)
ax.axvline(0, color='black', linestyle='--', linewidth=0.8)
ax.set_xlabel("Relative time to year before TWEA")
ax.set_ylabel("Effect")
ax.set_title("TWFE Event-study estimates")
plt.tight_layout()
plt.savefig("figures/ch03_fig32_es_twfe_Python.png", dpi=150, bbox_inches="tight")

Python Event Study

Figure 3.2: Effects of compulsory licensing on US innovation, according to an ES-TWFER. Standard errors clustered at the patent subclass level. 95% confidence intervals shown in red.


1.9 CQ#10: Variance of Treatment Effects

Compare the variance of diffpatentswrt1918 in treatment and control groups in 1932.

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
sdtest diffpatentswrt1918 if year==1932, by(treatmentgroup)

scalar sd_effects = r(sd_2) - r(sd_1)
reg diffpatentswrt1918 treatmentgroup if year==1932
display "CI: [" _b[treatmentgroup] - 1.96*sd_effects ", " _b[treatmentgroup] + 1.96*sd_effects "]"
Variance ratio test
--------------------------------------------------------------
   Group |     Obs      Std. dev.
---------+------------------------
       0 |   6,912      1.772807
       1 |     336      2.011905
--------------------------------------------------------------
    ratio = sd(0) / sd(1)                        f =   0.7764
  Pr(F < f) = 0.0004     2*Pr(F < f) = 0.0008

sd_2 - sd_1 = .23909779
beta = .64221644
CI: [.17358476, 1.1108481]
library(haven); library(dplyr)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
df_1932 <- df %>% filter(year == 1932)
var_test <- var.test(diffpatentswrt1918 ~ treatmentgroup, data = df_1932)
print(var_test)

sd_stats <- df_1932 %>% group_by(treatmentgroup) %>%
  summarise(sd = sd(diffpatentswrt1918, na.rm = TRUE))
sd_effects <- sd_stats$sd[2] - sd_stats$sd[1]
effect_reg <- lm(diffpatentswrt1918 ~ treatmentgroup, data = df_1932)
beta_reg <- coef(effect_reg)["treatmentgroup"]
cat("CI: [", beta_reg - 1.96*sd_effects, ",", beta_reg + 1.96*sd_effects, "]")
F = 0.77644, p-value = 0.0008145
SD control: 1.772807
SD treated: 2.011905
sd_2 - sd_1 = 0.2390978
beta = 0.6422164
CI: [ 0.1735848 , 1.110848 ]
import pandas as pd
import pyfixest as pf
from scipy import stats

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
df_1932 = df[df["year"] == 1932]
g0 = df_1932.loc[df_1932["treatmentgroup"] == 0, "diffpatentswrt1918"].dropna()
g1 = df_1932.loc[df_1932["treatmentgroup"] == 1, "diffpatentswrt1918"].dropna()
sd0, sd1 = g0.std(ddof=1), g1.std(ddof=1)
sd_effects = sd1 - sd0
F_var = sd0**2 / sd1**2
p_two = 2 * min(stats.f.cdf(F_var, len(g0)-1, len(g1)-1),
                1 - stats.f.cdf(F_var, len(g0)-1, len(g1)-1))
m7 = pf.feols("diffpatentswrt1918 ~ treatmentgroup", data=df_1932)
beta = m7.coef()["treatmentgroup"]
print(f"F = {F_var:.4f}, p = {p_two:.4f}")
print(f"sd_2 - sd_1 = {sd_effects:.6f}")
print(f"beta = {beta:.6f}")
print(f"CI: [{beta - 1.96*sd_effects:.6f}, {beta + 1.96*sd_effects:.6f}]")
SD control: 1.772923
SD treated: 2.011905
sd_2 - sd_1 = 0.238982
F = 0.7765, 2*Pr(F<f) = 0.0008
beta = 0.642216
CI: [0.173812, 1.110620]

Result: The treated group has a larger variance (\(\hat{\sigma}_1 = 2.01 > \hat{\sigma}_0 = 1.77\)). The implied SD of treatment effects is \(\approx 0.239\), and the confidence interval for \(\beta_{14}\) using this SD is \([0.174, 1.111]\).


1.10 CQ#10 (cont.): Placebo Test

Perform a placebo test by comparing the variances of diffpatentswrt1918 across all years.

copy "https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.dta" "moser_voena_didtextbook.dta", replace
use "moser_voena_didtextbook.dta", clear
forvalues i = 1900/1939 {
    quietly sdtest diffpatentswrt1918 if year==`i', by(treatmentgroup)
    display "Year `i': F = " %6.4f r(F) "  p = " %6.4f r(p)
}
Year 1900: F = 3.2735  p = 0.0000
Year 1901: F = 2.5398  p = 0.0000
Year 1902: F = 3.3703  p = 0.0000
Year 1903: F = 3.1775  p = 0.0000
Year 1904: F = 4.0726  p = 0.0000
Year 1905: F = 3.9685  p = 0.0000
Year 1906: F = 3.5803  p = 0.0000
Year 1907: F = 1.7711  p = 0.0000
Year 1908: F = 2.5510  p = 0.0000
Year 1909: F = 3.7689  p = 0.0000
Year 1910: F = 2.2839  p = 0.0000
Year 1911: F = 1.8151  p = 0.0000
Year 1912: F = 3.7112  p = 0.0000
Year 1913: F = 2.2403  p = 0.0000
Year 1914: F = 2.7455  p = 0.0000
Year 1915: F = 3.6764  p = 0.0000
Year 1916: F = 3.2690  p = 0.0000
Year 1917: F = 1.6222  p = 0.0000
Year 1918: F =      .  p =      .
Year 1919: F = 1.7652  p = 0.0000
Year 1920: F = 1.7161  p = 0.0000
Year 1921: F = 2.6409  p = 0.0000
Year 1922: F = 1.6035  p = 0.0000
Year 1923: F = 1.2684  p = 0.0040
Year 1924: F = 2.3221  p = 0.0000
Year 1925: F = 1.6974  p = 0.0000
Year 1926: F = 2.4032  p = 0.0000
Year 1927: F = 1.5314  p = 0.0000
Year 1928: F = 1.7269  p = 0.0000
Year 1929: F = 1.2982  p = 0.0016
Year 1930: F = 1.1443  p = 0.0999
Year 1931: F = 1.5775  p = 0.0000
Year 1932: F = 0.7764  p = 0.0008
Year 1933: F = 0.8284  p = 0.0134
Year 1934: F = 1.1337  p = 0.1256
Year 1935: F = 0.8164  p = 0.0076
Year 1936: F = 0.7675  p = 0.0005
Year 1937: F = 0.8621  p = 0.0522
Year 1938: F = 0.8692  p = 0.0668
Year 1939: F = 0.7569  p = 0.0002
library(haven); library(dplyr)
load(url("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.RData"))
for (yr in 1900:1939) {
  df_yr <- df %>% filter(year == yr)
  test <- var.test(diffpatentswrt1918 ~ treatmentgroup, data = df_yr)
  cat(sprintf("Year %d: F = %.4f, p = %.4f\n", yr, test$statistic, test$p.value))
}
Year 1900: F = 3.2735, p = 0.0000
Year 1901: F = 2.5398, p = 0.0000
Year 1902: F = 3.3703, p = 0.0000
Year 1903: F = 3.1775, p = 0.0000
Year 1904: F = 4.0726, p = 0.0000
Year 1905: F = 3.9685, p = 0.0000
Year 1906: F = 3.5803, p = 0.0000
Year 1907: F = 1.7711, p = 0.0000
Year 1908: F = 2.5510, p = 0.0000
Year 1909: F = 3.7689, p = 0.0000
Year 1910: F = 2.2839, p = 0.0000
Year 1911: F = 1.8151, p = 0.0000
Year 1912: F = 3.7112, p = 0.0000
Year 1913: F = 2.2403, p = 0.0000
Year 1914: F = 2.7455, p = 0.0000
Year 1915: F = 3.6764, p = 0.0000
Year 1916: F = 3.2690, p = 0.0000
Year 1917: F = 1.6222, p = 0.0000
Year 1918: F =      ., p =      .
Year 1919: F = 1.7652, p = 0.0000
Year 1920: F = 1.7161, p = 0.0000
Year 1921: F = 2.6409, p = 0.0000
Year 1922: F = 1.6035, p = 0.0000
Year 1923: F = 1.2684, p = 0.0040
Year 1924: F = 2.3221, p = 0.0000
Year 1925: F = 1.6974, p = 0.0000
Year 1926: F = 2.4032, p = 0.0000
Year 1927: F = 1.5314, p = 0.0000
Year 1928: F = 1.7269, p = 0.0000
Year 1929: F = 1.2982, p = 0.0016
Year 1930: F = 1.1443, p = 0.0999
Year 1931: F = 1.5775, p = 0.0000
Year 1932: F = 0.7764, p = 0.0008
Year 1933: F = 0.8284, p = 0.0134
Year 1934: F = 1.1337, p = 0.1256
Year 1935: F = 0.8164, p = 0.0076
Year 1936: F = 0.7675, p = 0.0005
Year 1937: F = 0.8621, p = 0.0522
Year 1938: F = 0.8692, p = 0.0668
Year 1939: F = 0.7569, p = 0.0002
import pandas as pd
from scipy import stats

df = pd.read_parquet("https://raw.githubusercontent.com/anzonyquispe/did_book/main/cc_xd_didtextbook_2025_9_30/Data%20sets/Moser%20and%20Voena%202012/moser_voena_didtextbook.parquet")
for yr in range(1900, 1940):
    dfy = df[df["year"] == yr]
    g0y = dfy.loc[dfy["treatmentgroup"] == 0, "diffpatentswrt1918"].dropna()
    g1y = dfy.loc[dfy["treatmentgroup"] == 1, "diffpatentswrt1918"].dropna()
    if len(g0y) > 1 and len(g1y) > 1 and g1y.std(ddof=1) > 0:
        F_v = g0y.std(ddof=1)**2 / g1y.std(ddof=1)**2
        p_v = 2 * min(stats.f.cdf(F_v, len(g0y)-1, len(g1y)-1),
                      1 - stats.f.cdf(F_v, len(g0y)-1, len(g1y)-1))
        print(f"Year {yr}: F = {F_v:.4f}  p = {p_v:.4f}")
Year 1900: F = 3.2740  p = 0.0000
Year 1901: F = 2.5396  p = 0.0000
Year 1902: F = 3.3701  p = 0.0000
Year 1903: F = 3.1773  p = 0.0000
Year 1904: F = 4.0723  p = 0.0000
Year 1905: F = 3.9683  p = 0.0000
Year 1906: F = 3.5801  p = 0.0000
Year 1907: F = 1.7712  p = 0.0000
Year 1908: F = 2.5509  p = 0.0000
Year 1909: F = 3.7687  p = 0.0000
Year 1910: F = 2.2838  p = 0.0000
Year 1911: F = 1.8153  p = 0.0000
Year 1912: F = 3.7110  p = 0.0000
Year 1913: F = 2.2402  p = 0.0000
Year 1914: F = 2.7453  p = 0.0000
Year 1915: F = 3.6762  p = 0.0000
Year 1916: F = 3.2688  p = 0.0000
Year 1917: F = 1.6223  p = 0.0000
Year 1918: (undefined - reference year)
Year 1919: F = 1.7651  p = 0.0000
Year 1920: F = 1.7159  p = 0.0000
Year 1921: F = 2.6407  p = 0.0000
Year 1922: F = 1.6034  p = 0.0000
Year 1923: F = 1.2684  p = 0.0040
Year 1924: F = 2.3220  p = 0.0000
Year 1925: F = 1.6973  p = 0.0000
Year 1926: F = 2.4031  p = 0.0000
Year 1927: F = 1.5313  p = 0.0000
Year 1928: F = 1.7268  p = 0.0000
Year 1929: F = 1.2981  p = 0.0016
Year 1930: F = 1.1442  p = 0.1002
Year 1931: F = 1.5777  p = 0.0000
Year 1932: F = 0.7765  p = 0.0008
Year 1933: F = 0.8284  p = 0.0134
Year 1934: F = 1.1336  p = 0.1258
Year 1935: F = 0.8164  p = 0.0076
Year 1936: F = 0.7675  p = 0.0005
Year 1937: F = 0.8621  p = 0.0522
Year 1938: F = 0.8692  p = 0.0668
Year 1939: F = 0.7569  p = 0.0002

Result: Pre-TWEA years show \(F > 1\) (control group has higher variance), while post-1930 years show \(F < 1\) (treated group variance increases). The pattern is consistent across all three languages.