---
title: "Standard errors in fetwfe: tight Gaussian CIs, the conservative fallback, and the experimental cluster option"
author: "Gregory Faletto"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    math_method: "mathml"
    output_file: "FETWFE_Inference_Vignette.html"
vignette: >
  %\VignetteIndexEntry{Standard errors in fetwfe: tight Gaussian CIs, the conservative fallback, and the experimental cluster option}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(fetwfe)
```

This vignette documents how `fetwfe()`, `betwfe()`, `etwfe()`, and `twfeCovs()` compute their standard errors, what assumptions those standard errors rely on, and the three values `se_type` can take. The four estimators share the same inferential machinery, so the discussion applies to all of them; we use `fetwfe()` in the running example.

# 1. The headline: tight Gaussian CIs by default under (Ψ-IF)

By default, the package's standard errors --- the `att_se` slot on the returned object and the entries of `catt_ses` --- are the *tight Gaussian variance* from the paper's Theorem `te.asym.norm.thm`(c$'$), valid under the influence-function condition (Ψ-IF) (Faletto 2025, [arXiv:2312.05985](https://arxiv.org/abs/2312.05985), Assumption (Ψ-IF) at paper Eq. `psi.if.assum`). The combined ATT variance has two pieces,

$$
V_1 \;=\; \sigma_\varepsilon^2 \, \boldsymbol{\alpha}^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\alpha},
\qquad
V_2 \;=\; v_\psi(\boldsymbol{\beta}_0, \mathcal{S}),
$$

where $V_1$ is the Kock-2013 regression-coefficient variance contribution and $V_2$ is the propensity-score variance contribution; the names $V_1$ and $V_2$ are the paper's, and the package exposes both of them by name on the fitted object under `result$internal$variance_components`. Theorem (c$'$) shows that under (Ψ-IF) the same-data confidence interval is asymptotically Gaussian with the **tight** variance $V_1 + V_2$, *identical to the independent-data variance* in part (b) of the theorem. Concretely, the package returns

$$
\widehat{\text{SE}}(\hat{\tau}_{\text{ATT}})
\;=\;
\sqrt{\widehat{\text{Var}}_1 + \widehat{\text{Var}}_2}
\;=\;
\sqrt{\tilde{v}_N \,/\, N},
$$

where $\tilde{v}_N := \hat{v}_N / T$ is the unit-scaled paper-notation variance estimator (paper line 2004), and the Wald confidence interval is

$$
\text{CI}_N(\tilde{v}_N;\,\alpha)
\;=\;
\left[\,
\hat{T}_N \;\pm\; z_{1 - \alpha/2}\,\sqrt{\tilde{v}_N \,/\, N}
\,\right]
$$

per paper Eq. `conf.int.form` (line 1259). The same combined variance backs the per-cohort entries of `catt_ses` and the `catt_df` confidence-interval columns.

## 1.1 Which propensity-score estimators satisfy (Ψ-IF)?

Assumption (Ψ-IF) requires the cohort-weight estimator $\widehat{\boldsymbol{\psi}}_N$ to admit an i.i.d.-across-$i$ unit-level influence-function representation. It is satisfied by

* **Cohort sample proportions** $\widehat{\pi}_g = N_g / N$ --- the package's *default* propensity-score estimator. So out of the box the tight Gaussian CI is asymptotically exact.
* **Multinomial-logit MLE** for $P(W = g \mid \boldsymbol{X})$.
* **Any GLM** fit by maximum likelihood on $W \mid \boldsymbol{X}$.
* **Kernel or series regression** of $\mathbf{1}\{W = g\}$ on $\boldsymbol{X}$.

(See the discussion at paper line 1268: "When (Ψ-IF) holds --- which is the case for every standard parametric or semi-parametric propensity-score estimator listed in Remark `psi.if.coverage` (cohort sample proportions, multinomial logit, any GLM, kernel/series regression) --- Theorem `te.asym.norm.thm`(c$'$) gives asymptotic Gaussianity with the tight variance $V_1 + V_2$.")

The (Ψ-IF) condition fails for **Robins-Rotnitzky-style augmented doubly-robust estimators** that augment the propensity score with outcome residuals. The package does not currently implement these; when it does, users will be able to opt out of the tight CI via `se_type = "conservative"` (see Section 3).

## 1.2 What changed in v1.12.0

Versions $\le$ 1.11.7 of the package returned the *conservative* Cauchy-Schwarz upper bound

$$
\widehat{\text{SE}}(\hat{\tau}_{\text{ATT}})
\;=\;
\sqrt{\widehat{\text{Var}}_1 + \widehat{\text{Var}}_2 + 2\sqrt{\widehat{\text{Var}}_1\,\widehat{\text{Var}}_2}}
$$

as the same-data default (from paper Theorem (c)). Theorem (c$'$) was added in the paper's same-data Gaussianity upgrade, and v1.12.0 switches the package default to match: the tight Gaussian variance is now the headline number. The conservative formula remains available as an opt-in fallback (Section 3) for any users whose propensity-score estimator violates (Ψ-IF).

# 2. The two variance pieces and how the package exposes them

The combined ATT variance decomposes as $V_1 + V_2$, with both pieces estimated by the package and exposed as named slots.

## 2.1 $V_1$: regression-coefficient variance

The regression-coefficient piece is

$$
\widehat{\text{Var}}_1
\;=\;
\frac{\sigma_\varepsilon^2}{NT}\;\psi_{\text{att}}^\top \widehat G^{-1} \psi_{\text{att}},
$$

where $\widehat G^{-1}$ is the Gram inverse on the bridge-selected support and $\psi_{\text{att}}$ encodes the cohort-weighted ATT contrast. The per-cohort SEs (`catt_ses`) have the same form with a cohort selector $\psi_g$ in place of $\psi_{\text{att}}$. This piece is stored on the fitted object as `result$internal$variance_components$att_var_1`; the paper-notation per-unit-scaled version is `result$internal$variance_components$V_1` (with `V_1 = N * att_var_1`).

## 2.2 $V_2$: propensity-score variance

The propensity-score piece accounts for variability from estimating the cohort-membership probabilities $\widehat{\pi}_g$. It scales like $1/N$ and is unrelated to the regression residuals. This piece is stored as `result$internal$variance_components$att_var_2`, with paper-notation per-unit-scaled `V_2 = N * att_var_2`.

## 2.3 The combined variance: `tilde_v_N` and the paper's catalogue

The unit-scaled total variance is

$$
\tilde{v}_N \;=\; N \cdot \widehat{\text{Var}}(\hat{T}_N),
$$

stored as `result$internal$variance_components$tilde_v_N`. The original (un-scaled) paper notation `hat_v_N = T * tilde_v_N` is also exposed.

The paper catalogues six variance estimators at line 2006; the package exposes the four that apply to the ATT (the other two apply to the CATT(x), which connects to the on-hold `predict()` work in issue #33):

| Slot                          | Paper Eq.                                     | Meaning                                     |
|-------------------------------|-----------------------------------------------|---------------------------------------------|
| `tilde_v_N_C`                 | `v.n.r.t.att.const`                           | Fixed-π exact (no propensity-score noise)   |
| `tilde_v_N_C_pi_hat`          | `v.n.r.t.att.rand`                            | Random-π exact (tight Gaussian under (Ψ-IF))|
| `tilde_v_N_C_pi_hat_cons`     | `v.n.r.t.att.rand.cons`                       | Random-π conservative (Cauchy-Schwarz)      |
| `tilde_v_N_cons`              | `var.est.kock.wooldridge.subgauss.cons`       | Subgaussian conservative bound              |

The default `att_se` value is computed from `tilde_v_N_C_pi_hat` (via `sqrt(tilde_v_N_C_pi_hat / N)`); `se_type = "conservative"` switches the headline to `tilde_v_N_C_pi_hat_cons`. When `indep_counts` is supplied (two-sample regime), the two-sample-exact formula `tilde_v_N_C_pi_hat` is in force regardless of `se_type`.

# 3. The conservative fallback: `se_type = "conservative"`

If a user's propensity-score estimator violates (Ψ-IF) --- the canonical example being a Robins-Rotnitzky-style augmented doubly-robust estimator that injects outcome-residual information into the cohort weights --- Theorem (c$'$) no longer applies. The paper's fallback is Theorem (c)'s subgaussian result with the **conservative** variance bound

$$
\widehat{\text{Var}}^{\text{(cons)}}(\hat{\tau}_{\text{ATT}})
\;=\;
\widehat{\text{Var}}_1 + \widehat{\text{Var}}_2 + 2\sqrt{\widehat{\text{Var}}_1\,\widehat{\text{Var}}_2}.
$$

This is the Cauchy-Schwarz upper bound on $V_1 + V_2$ that holds without any influence-function structure on the cohort-weight estimator. Until v1.11.7 the package returned this bound as the default; v1.12.0 demoted it to an opt-in fallback via

```{r, eval = FALSE}
fetwfe(..., se_type = "conservative")
```

The package does not currently implement any non-(Ψ-IF) propensity-score estimator, so for users running the default cohort-sample-proportions estimator there is no practical reason to pass `se_type = "conservative"`. The argument exists for users who supply their own propensity-score estimates (see issue #32) or who want to reproduce the pre-v1.12.0 conservative default for backward compatibility.

A worked comparison on a clean F1-conforming panel:

```{r}
set.seed(2026)
sim_coefs <- genCoefs(G = 3, T = 6, d = 2, density = 0.5, eff_size = 2)
# sim_coefs was built without a seed, so draw from the ambient generator
# (seeded just above); seed = NA selects the ambient generator silently.
sim_data  <- simulateData(
  sim_coefs,
  N = 120,
  sig_eps_sq = 1,
  sig_eps_c_sq = 0.5,
  seed = NA
)

res_default <- fetwfe(
  pdata = sim_data$pdata,
  time_var = sim_data$time_var,
  unit_var = sim_data$unit_var,
  treatment = sim_data$treatment,
  covs = sim_data$covs,
  response = sim_data$response,
  sig_eps_sq = sim_data$sig_eps_sq,
  sig_eps_c_sq = sim_data$sig_eps_c_sq
)
res_cons <- fetwfe(
  pdata = sim_data$pdata,
  time_var = sim_data$time_var,
  unit_var = sim_data$unit_var,
  treatment = sim_data$treatment,
  covs = sim_data$covs,
  response = sim_data$response,
  sig_eps_sq = sim_data$sig_eps_sq,
  sig_eps_c_sq = sim_data$sig_eps_c_sq,
  se_type = "conservative"
)

c(
  default      = res_default$att_se,
  conservative = res_cons$att_se,
  ratio_cons_over_default = res_cons$att_se / res_default$att_se
)
```

The conservative SE is strictly wider; the tight Gaussian SE is the asymptotically-exact one under the package's default cohort-sample-proportions estimator.

# 4. Experimental: cluster-robust standard errors via `se_type = "cluster"`

The package's default standard errors rely on the model-based covariance structure (paper Assumption F1: i.i.d. units, idiosyncratic noise plus a unit random effect). In applied DiD work it is not unusual to suspect one or more violations of F1:

* **Within-unit serial correlation beyond the random effect.** The unit random effect $c_i$ in F1 absorbs a single, time-constant deviation per unit. It does *not* model serial correlation in the idiosyncratic shocks. Bertrand, Duflo, and Mullainathan (2004) is the classic warning that ignoring within-unit serial correlation can drastically understate DiD standard errors.
* **Heteroskedasticity across units.** F1 imposes a single $\sigma_\varepsilon^2$ across all units.
* **Higher-level clustering.** F1 treats units as i.i.d.; if observations within a state-year or industry-year share unobserved shocks, the unit-level F1 variance will understate the true sampling variability.

Starting in version 1.6.0, all four estimators accept an experimental `se_type` argument with value `"cluster"`:

```{r, eval = FALSE}
fetwfe(..., se_type = "cluster")
```

Setting `se_type = "cluster"` swaps the model-based regression-coefficient variance $V_1$ for a **unit-clustered Liang-Zeger CR1 sandwich** computed on the bridge-selected support. The $V_2$ piece and the (Ψ-IF)-tight combination $V_1 + V_2$ are unchanged --- only the $V_1$ formula switches.

## 4.1 The formula

Let $\widehat{S}$ be the support selected by the bridge regression, $X_{\widehat{S}}$ the corresponding design matrix in the coordinate system the regression was solved in (GLS-transformed for ETWFE/twfeCovs, fusion-then-GLS-transformed for FETWFE/BETWFE), and $\widehat\varepsilon$ the residuals from OLS on that selected support. The cluster-robust variance is

$$
V_{\text{CR}}
\;=\;
\frac{N}{N-1}\;
(X_{\widehat{S}}^\top X_{\widehat{S}})^{-1}\;
\left(\sum_{i=1}^N X_{i\cdot\widehat{S}}^\top \widehat\varepsilon_{i\cdot} \widehat\varepsilon_{i\cdot}^\top X_{i\cdot\widehat{S}}\right)\;
(X_{\widehat{S}}^\top X_{\widehat{S}})^{-1},
$$

with units $i = 1, \dots, N$ as clusters and an $N/(N-1)$ small-sample adjustment (matching `sandwich::vcovCL(cadjust = TRUE, type = "HC0")`). The CATT SE for cohort $g$ is $\sqrt{\psi_g^\top V_{\text{CR}} \psi_g}$ (using a zero-padded $\psi_g$ on the full selected support); the ATT regression-coefficient variance is $\psi_{\text{att}}^\top V_{\text{CR}} \psi_{\text{att}}$, replacing $V_1$ above. The propensity-score piece $V_2$ is unchanged because it depends on empirical cohort proportions, not regression residuals.

For FETWFE and BETWFE, `se_type = "cluster"` is only meaningful when `q < 1` (the bridge oracle property is required); for `q >= 1` the cluster path returns `NA` just like the default. ETWFE and `twfeCovs` have no `q` argument, so the cluster path always runs when the Gram matrix is invertible.

## 4.2 Why we call this experimental

The CR1 sandwich is a textbook estimator and the package's implementation matches `sandwich::vcovCL()` to numerical precision on a clean panel without selection. What is *not* yet covered by the paper's theory is:

* Verification that the bridge oracle property of Theorem 6.2 still holds under the relaxed covariance structure that motivates cluster-robust SEs in the first place.
* Sandwich consistency *after model selection*, i.e., that the CR1 sandwich evaluated at the bridge-selected support is a consistent estimator of the true asymptotic variance.

These extensions are mechanically routine but conceptually non-trivial, and they are explicitly outside the package's current scope. Until they land, `se_type = "cluster"` is exposed as an opt-in, clearly-labelled experimental feature.

**Recommendation.** Until the theory lands, treat `se_type = "cluster"` as a sensitivity check: report both `se_type = "default"` and `se_type = "cluster"` in applied work, comment on the gap, and lean on the default for headline numbers.

## 4.3 Worked cluster comparison

```{r}
res_cluster <- fetwfeWithSimulatedData(sim_data, se_type = "cluster")

c(
  default = res_default$att_se,
  cluster = res_cluster$att_se
)
```

On this F1-conforming simulated panel the two SEs are similar by construction: the data-generating process satisfies F1, so the model-based SE is already valid and the cluster-robust SE estimates the same underlying variance. Under a deliberately serially-correlated DGP (or under heteroskedasticity, or higher-level clustering) the cluster-robust SE will typically be larger.

The `print()` and `summary()` methods label the SE so it is clear which one was used:

```{r}
print(res_cluster)
```

The CATT SEs and confidence intervals in `catt_df` are recomputed from the same cluster-robust sandwich; the CATT p-values follow accordingly.

# 5. Choosing the bridge penalty parameter `lambda`

As of v1.13.0, `fetwfe()` and `betwfe()` select the bridge penalty `lambda` via 10-fold cross-validation by default (using `grpreg::cv.grpreg`). The prior default was BIC over the same `grpreg` lambda grid. The change addresses a finite-sample bias issue documented in simulation studies (see issue #164 in the package repository): under the prior BIC default, the overall ATT estimator was biased toward zero at moderate sample sizes, producing 95% confidence intervals whose empirical coverage was as low as 0.00 in some regimes. Cross-validation restores near-nominal coverage in every regime tested and additionally has the lowest MSE in the high-dim regime FETWFE was originally designed for.

The `lambda_star` and `lambda_star_model_size` result slots are populated identically under both selection paths; only the underlying selection criterion differs. The CV path stores its provenance in three new top-level slots: `lambda_selection`, `cv_folds`, and `cv_seed`. The seed defaults to `as.integer(N * T)` so consecutive calls on the same dataset are reproducible without the user having to specify a seed.

To recover the prior BIC behavior --- for reproducing results from analyses run against v1.12.0 or earlier, for example --- pass `lambda_selection = "bic"`:

```r
fit_bic <- fetwfe(..., lambda_selection = "bic")
```

To control the CV fold assignment explicitly, pass `cv_folds` and/or `cv_seed`:

```r
fit_cv <- fetwfe(..., cv_folds = 20L, cv_seed = 42L)
```

# 6. Simultaneous confidence intervals via `simultaneousCIs()`

The package's per-point standard errors and Wald intervals deliver `(1 - alpha)` coverage *per effect*: for any single cohort ATT, or any single event-time coefficient, the CI contains the true value with probability approximately `1 - alpha`. But when you make a claim involving multiple effects jointly --- for example, "no event-time coefficient is significantly positive at the 5% level" --- per-effect coverage does not imply family-wise coverage. The probability that *every* coefficient in a family of `K` effects lies in its per-point CI can be far below `1 - alpha` when `K` is large.

`simultaneousCIs()` returns intervals that maintain family-wise coverage at the requested `1 - alpha`, using the exact joint multivariate-normal critical value $c_{1-\alpha}$ defined as the $(1 - \alpha)$ quantile of $\max_k |Z_k|$ where $Z \sim N(0, \rho)$ and $\rho$ is the correlation matrix of the effect family's joint covariance. Under Theorem (c$'$) (`paper_arxiv.tex:1233`) and Assumption $(\Psi\text{-IF})$ (assumption equation `paper_arxiv.tex:2013`; in-prose discussion of when it implies tight Gaussianity at paper line 1268), this joint Gaussianity is exact asymptotically; under the paper's fixed-dim framing (AE 1(d)), no high-dimensional correction is needed.

For an event-study family on the fitted `res_default` object from the previous sections:

```{r}
sci <- simultaneousCIs(res_default, family = "event_study", alpha = 0.05)
print(sci)
```

The function returns a `simultaneous_cis` S3 object with `$ci` (the interval data frame), `$critical_value` (the simultaneous $c_{1-\alpha}$), `$pointwise_critical_value` ($z_{1-\alpha/2}$, for reference), and `$bonferroni_critical_value` ($z_{1-\alpha/(2K)}$, the conservative family-wise upper bound). When effects are positively correlated --- as they typically are in DiD, because event-study coefficients share the regression-coefficient variance piece --- the simultaneous critical value sits strictly between the pointwise and Bonferroni values, giving tighter intervals than Bonferroni while preserving family-wise coverage.

The four available family options are `"event_study"` (one effect per post-treatment event time), `"cohort"` (one effect per treated cohort), `"all_post_treatment"` (one effect per `(g, t)` cell), and `"custom"` (a user-supplied `K x num_treats` contrast matrix following the `multcomp::glht()` convention; see `?simultaneousCIs`).

The critical value is computed via `mvtnorm::qmvnorm(..., algorithm = mvtnorm::GenzBretz())` (mvtnorm's default quasi-Monte Carlo integrator), sub-second on a typical fit for any realistic FETWFE family (no `K` cap of practical concern). The function is deterministic in its inputs: the same fit plus the same family plus the same `alpha` plus the same `contrasts` produces the same critical value across calls. This is achieved by save/restoring the caller's `.Random.seed` and using a fixed internal seed immediately before the `qmvnorm()` call; the function does not mutate the caller's RNG state.

As of version 1.16.0 the `mvtnorm` package is an `Imports` dependency (it ships with the package), because simultaneous bands are now computed by default at fit time --- see the `ci_type` subsection below. (In versions 1.15.x `mvtnorm` was in `Suggests` and only loaded when `simultaneousCIs()` was called.) The `plot.simultaneous_cis` method additionally requires `ggplot2`, which is in `Suggests`; install it with `install.packages("ggplot2")` if you have not already.

The pattern matches `did::aggte(cband = TRUE)` from the Callaway-Sant'Anna `did` package, which computes simultaneous bands via the CCK (2013) multiplier bootstrap. Asymptotically the two critical values are identical under fixed-dim Gaussianity; the parametric `simultaneousCIs()` path here is deterministic and sub-second, whereas the multiplier-bootstrap path is stochastic and resampling-based.

## Simultaneous bands are the default: the `ci_type` argument

As of version 1.16.0, the confidence-interval bounds that `fetwfe()` / `etwfe()` / `betwfe()` / `twfeCovs()` *return and display* are simultaneous by default --- matching `did`'s `cband = TRUE` default. A new fit-time argument `ci_type = c("simultaneous", "pointwise")` controls this:

* `ci_type = "simultaneous"` (the default) makes the returned `catt_df$ci_low` / `catt_df$ci_high` the **cohort-family** simultaneous bounds, and `eventStudy()`'s `ci_low` / `ci_high` the **event-study-family** simultaneous bounds --- the very bounds `simultaneousCIs(fit, family = "cohort")` / `simultaneousCIs(fit, family = "event_study")` return. `print()`, `summary()`, and `plot()` display these (labeled `[simultaneous 95% CI]`), and `broom::tidy()` (on the fit, on `eventStudy()`, and on `cohortStudy()`) surfaces them too.
* `ci_type = "pointwise"` restores the pre-1.16.0 per-effect Wald intervals everywhere (the bounds you would get from `estimate ± qnorm(1 - alpha/2) * se`).

The interval bounds **and** the per-cohort p-values (`p_value`) follow `ci_type`. Under `"simultaneous"` the `p_value` is the single-step max-T multiplicity-adjusted (family-wise) p-value matching the band; under `"pointwise"` it is the per-effect Wald p-value. The two correspond exactly: a coefficient lies outside its `1 - alpha` simultaneous band iff its adjusted `p_value` is `< alpha` (version 1.18.0, #200). The standard errors (`se`), selection flags (`selected`), and the overall-ATT confidence interval and p-value (a single scalar, `K = 1`, with no family and hence no widening) are **identical** under both settings --- for the bounds the only difference is the critical-value multiplier (`qnorm(1 - alpha/2)` for pointwise vs the joint $c_{1-\alpha}$ for simultaneous).

```{r}
fit_sim <- fetwfeWithSimulatedData(sim_data) # ci_type = "simultaneous" (default)
fit_pw <- fetwfeWithSimulatedData(sim_data, ci_type = "pointwise")

# Same point estimates and SEs; only the bounds differ (simultaneous wider).
cbind(
  estimate = fit_sim$catt_df$estimate,
  se = fit_sim$catt_df$se,
  simul_width = fit_sim$catt_df$ci_high - fit_sim$catt_df$ci_low,
  pointwise_width = fit_pw$catt_df$ci_high - fit_pw$catt_df$ci_low
)
```

Because the tidiers pass the object's bounds straight through, every CI surface in the package agrees under the default: `broom::tidy(fit)`'s cohort rows, `summary(fit)`, `cohortStudy(fit)`, and `plot(fit, type = "catt")` all report the same simultaneous bounds. (The overall-ATT row of `broom::tidy(fit)` stays the scalar pointwise interval.) When standard errors are unavailable (`q >= 1` for the bridge estimators, or a rank-deficient design) the bounds are `NA` under both settings, and the fit still succeeds --- the simultaneous path degrades cleanly to the (`NA`) pointwise bounds rather than throwing an error.

# 7. Selection consistency as an implicit zero-effect test

A question applied users ask constantly is: *is this cohort's treatment effect distinguishable from zero --- and can I conclude that it actually **is** zero?* For `fetwfe()` (and `betwfe()`), the answer is built into the estimator. The bridge penalty's *restriction selection consistency* (Faletto 2025, Theorem 6.2) makes the `selected` flag on `catt_df` an **implicit hypothesis test** of $H_0: \tau_{\text{ATT}}(g) = 0$, for free, with no extra computation. The mapping is:

* `selected = TRUE` (estimate $\neq 0$) is the asymptotic analogue of **reject $H_0$**: the effect is nonzero.
* `selected = FALSE` (estimate exactly $0$) is the asymptotic analogue of **fail to reject $H_0$** --- and, under Theorem 6.2, it is the *stronger* statement that the truth is zero with probability tending to one.

The conclusion for a selected-out cohort is delivered by the `selected` flag, **not** by a p-value. A cohort the penalty zeros out has `estimate == 0`, `se == 0`, and `p_value == NA` by construction --- there is no Wald p-value to read, and the inferential content lives entirely in `selected`. (A condensed, first-time-user-oriented version of this discussion appears under "Testing the zero-effect null" in the main package vignette; the treatment here is the deeper, inference-focused one.)

## Why this is stronger than a confidence interval on an unregularized estimator

For an unregularized estimator --- OLS, or plain `etwfe()` --- the coefficient estimate $\hat\beta$ is a continuous random variable, so $\hat\beta = 0$ *exactly* is a **measure-zero event**: it essentially never happens, and would carry no information about the truth if it did. The only zero-effect inference such an estimator offers is "the confidence interval contains $0$," i.e. *fail to reject* --- never *accept*. Selection consistency changes the picture qualitatively. Under Theorem 6.2, $\hat\beta_j = 0$ exactly with probability $\to 1$ when the truth is zero, and $\hat\beta_j \neq 0$ with probability $\to 1$ when the truth is nonzero, so observing $\hat\beta_j = 0$ *is* informative. The paper states the consequence directly: if $\tau_{\text{ATT}}(g,t) = 0$ then $\lim_{N\to\infty} P(\hat\tau_{\text{ATT}}(g,t) = 0) = 1$, which it calls "stronger than convergence in probability to $0$" (Faletto 2025, around the statement of Theorem 6.2).

It is worth being precise about *which* part of the theory delivers the test, because the paper is careful here. A conventional Wald statistic **cannot** test the zero null in this setting: when the relevant effect is truly zero the statistic is not asymptotically normal (it converges in probability to $0$ even after scaling by $\sqrt{N}$), so there is no valid Wald test of $H_0: \tau = 0$ to construct. That is *precisely why* the right tool is selection consistency and the `selected` flag, rather than a Wald statistic on the point estimate. The complementary half of the theory backs the effects the estimator *retains*: when an effect is truly nonzero, FETWFE is $\sqrt{N}$-consistent and asymptotically Gaussian on the selected support (the oracle property, Theorem 6.3), with the practical Wald confidence intervals and p-values of Theorem 6.4 --- the same machinery documented in Sections 1--2. So `selected = TRUE` cohorts come with the usual valid CIs and finite p-values, while `selected = FALSE` cohorts carry the stronger, selection-based zero-effect conclusion.

## The `q < 1` caveat

This interpretation rests on selection consistency, which the paper proves **only for the bridge exponent $q \in (0, 1)$**. The package default $q = 0.5$ satisfies it. For $q = 1$ (the lasso) selection consistency of FETWFE is *not* proven, so the implicit-test framing does not apply; for $q = 2$ (ridge) the solver never zeros coefficients at all, so `selected` is trivially `TRUE` everywhere and conveys nothing. Read the `selected` flag as a zero-effect test only under the default (or any $q < 1$) bridge penalty.

## A controlled-truth worked example

The cleanest way to see the implicit test is on simulated data where the truth is known. We draw coefficients in which the *middle* treated cohort has a true average treatment effect of exactly zero while its two neighbors are nonzero, then check whether FETWFE selects out exactly the zero cohort. `getTes()` reports the known per-cohort truth for comparison.

```{r implicit-test-demo, message = FALSE}
# Known truth: cohort 3 (the middle treated cohort) has a true effect of
# exactly zero; cohorts 2 and 4 are nonzero.
coefs <- genCoefs(G = 3, T = 6, d = 2, density = 0.2, eff_size = 2, seed = 3)
getTes(coefs)$actual_cohort_tes

dat <- simulateData(coefs, N = 120, sig_eps_sq = 1, sig_eps_c_sq = 0.5, seed = 3)
res <- fetwfeWithSimulatedData(dat, verbose = FALSE)

# The implicit test: the `selected` and `p_value` columns of catt_df.
res$catt_df[, c("cohort", "estimate", "se", "p_value", "selected")]
```

The true effects are $(-1.6, 0, 1.333)$: cohort 3's effect is exactly zero. FETWFE selects out precisely that cohort --- cohort 3 returns `selected = FALSE` with an estimate of exactly $0$ and `p_value = NA` --- while it recovers the two truly-nonzero cohorts with `selected = TRUE` and finite standard errors and p-values. Reading the `selected` column through the test mapping above: cohort 3 is *"fail to reject (and conclude zero),"* cohorts 2 and 4 are *"reject."* The estimator performed the zero-effect test automatically, and its verdict matches the known truth.

## Scope: this is the *post-treatment* zero-effect test

The package currently estimates only post-treatment treatment effects (event times $e \geq 0$), so the demonstration above is specifically the **post-treatment** zero-effect test, surfaced through `catt_df`. The same selection-consistency logic would support two further applications once the corresponding extended regression specifications exist: (a) selecting out pre-treatment placebo coefficients would give a **parallel-trends test**, and (b) selecting out heterogeneous-trend augmenting parameters would give a **homogeneous-trends test**. The package does not estimate those terms today, so this vignette makes no claim to perform either test; they are noted only as the natural extensions the same `selected`-as-implicit-test reasoning will cover when the specifications land.

# References

Bertrand, M., Duflo, E., & Mullainathan, S. (2004). "How much should we trust differences-in-differences estimates?" *Quarterly Journal of Economics* 119(1), 249--275.

Faletto, G. (2025). "Fused Extended Two-Way Fixed Effects for Difference-in-Differences with Staggered Adoptions." *arXiv preprint* [arXiv:2312.05985](https://arxiv.org/abs/2312.05985).

Kock, A. B. (2013). "Oracle Efficient Variable Selection in Random and Fixed Effects Panel Data Models." *Econometric Theory* 29(1), 115--152.

Liang, K.-Y., & Zeger, S. L. (1986). "Longitudinal data analysis using generalized linear models." *Biometrika* 73(1), 13--22.