---
title: "Simulation Vignette for FETWFE: From Coefficients to True Treatment Effects"
author: "Gregory Faletto"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    math_method: "mathml"
    output_file: "FETWFE_Simulation_Vignette.html"
vignette: >
  %\VignetteIndexEntry{Simulation Vignette for FETWFE: From Coefficients to True Treatment Effects}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::opts_chunk$set(echo = TRUE)

```

```{r setup}
# Load necessary libraries
library(dplyr)
library(fetwfe)
```

# Introduction

This vignette demonstrates how to conduct simulation studies with the `{fetwfe}` package. In particular, we will:

- **Generate a vector of coefficients** with `genCoefs()`. These coefficients produce unit‐and‐time specific responses that respect difference‐in‐differences assumptions (e.g., conditional parallel trends) and the sparsity assumptions behind FETWFE.
- **Simulate a panel data set** with `simulateData()`.
- **Estimate treatment effects** via the FETWFE estimator with `fetwfeWithSimulatedData()`.
- **Extract the true treatment effects** using `getTes()`.

The workflow here follows the simulation‐study design outlined in [the paper](https://arxiv.org/abs/2312.05985), so you may wish to skim its setup section for additional context.

# Simulation Workflow Using Piping

Below is a complete simulation pipeline, step by step.

## Step 1: Generate Simulation Coefficients

The `genCoefs()` function returns an object of class `"FETWFE_coefs"`, containing both the coefficient vector and its simulation parameters. The `fusion_structure` argument (mirroring `fetwfe()`) additionally controls the *basis* in which the true treatment effects are sparse — `"cohort"` (the default) or `"event_study"`; see the vignette `vignette("fusion_structure_vignette", package = "fetwfe")`. In this example we set:

- **G** (number of treated cohorts) = 3
- **T** (number of time periods) = 6
- **d** (number of covariates) = 2  
- **density** (sparsity level) = 0.1  
- **eff_size** (effect‐size multiplier) = 2  


```{r}
# Generate the coefficient object for simulation
sim_coefs <- genCoefs(
  G         = 3, 
  T         = 4, 
  d         = 2, 
  density   = 0.1, 
  eff_size  = 2, 
  seed      = 101
)
```

(Again, for more details on the meaning of these parameters, see the simulation study section of [the paper](https://arxiv.org/abs/2312.05985).)


## Step 2: Simulate Panel Data

Next, we simulate a panel data set using the generated coefficient object with the `simulateData()` function. With `simulateData()`, we generate:

* `N` units, each assigned to one of the cohorts
* Time‐invariant covariates drawn from a specified distribution
* Outcomes at times 1 through T, using our simulated coefficients

Here we choose:

- `N` (number of units) as 60,
- `sig_eps_sq` (observation-level noise variance) as 1,
- `sig_eps_c_sq` (unit-level noise variance) as 1, and
- use the default `"gaussian"` distribution for the covariates.

```{r}
# Simulate panel data based on the coefficients
sim_data <- simulateData(
  sim_coefs,
  N = 60,
  sig_eps_sq = 1,
  sig_eps_c_sq = 1,
  distribution = "gaussian",
  seed = 101
  )
```

The dataframe is stored in `sim_data$pdata`, so we can take a quick look at the results:

```{r}
head(sim_data$pdata)
```

## Step 3: Run the FETWFE Estimator on Simulated Data

We then run the estimator on the simulated data using `fetwfeWithSimulatedData()`. (We could get the same results by manually unpacking `sim_data` and passing the arguments appropriately to `fewtfe()`. `fetwfeWithSimulatedData()` is just a wrapper function that takes care of this for us.)

```{r}
result <- fetwfeWithSimulatedData(sim_data)
```

We can now extract the results from `result` in the same way that we can with the standard `fetwfe()` function.

```{r}
summary(result)

```

## Step 4: Extract True Treatment Effects

To evaluate the estimated ATT, we can compute the true treatment effects using the original coefficient object. The `getTes()` function extracts both the overall average treatment effect and the cohort-specific effects.

```{r}
# Extract the true treatment effects
true_tes <- getTes(sim_coefs)

# Print the true overall treatment effect
cat("True Overall ATT:", true_tes$att_true, "\n")

# Print the cohort-specific treatment effects
print(true_tes$actual_cohort_tes)
```

We can use this to calculate metrics to evaluate our estimated treatment effect, like squared error:

```{r}
squared_error <- (result$att_hat - true_tes$att_true)^2

cat("Squared error of ATT estimate:", squared_error, "\n")
```

## Combining the Workflow in One Pipeline

You can also chain the simulation functions together with the pipe operator. The following code generates the coefficients, simulates the data, and runs the estimator all in one pipeline:

```{r}
coefs <- genCoefs(G = 3, T = 4, d = 2, density = 0.1, eff_size = 2, seed = 2025)

result_piped <- coefs |>
  simulateData(N = 60, sig_eps_sq = 1, sig_eps_c_sq = 1, seed = 2025) |>
  fetwfeWithSimulatedData()

cat("Estimated Overall ATT from piped workflow:", result_piped$att_hat, "\n")

true_tes_piped <- coefs |> getTes()

# Print the true overall treatment effect
cat("True Overall ATT:", true_tes_piped$att_true, "\n")

# Print the squared estimation error
squared_error_piped = (result_piped$att_hat - true_tes_piped$att_true)^2

cat("Squared estimation error:", squared_error_piped, "\n")
```

# Covariate-dependent cohort assignment

The simulator workflow above generates panels in which cohort assignment is independent of the covariates. Each unit is assigned to one of the $G + 1$ cohorts with uniform probability $1/(G + 1)$, regardless of its covariate values. This is the simplest data-generating process (DGP) for cohort assignment, but in many realistic settings, cohort membership is correlated with observable characteristics — adoption decisions depend on firm size, industry, region, and so on.

Version 1.14.0 of `{fetwfe}` introduces two covariate-dependent DGPs to `genCoefs()` via the new `assignment_type` and `assignment_strength` arguments:

- `assignment_type = "marginal"` (default): the original uniform marginal DGP. Preserves pre-1.14.0 behavior exactly.
- `assignment_type = "multinomial"`: a multinomial-logit propensity-score model. Each unit's cohort probability is
  $$\pi_g(x) = \frac{\exp(\gamma_g^\top x)}{\sum_{g' = 0}^{G} \exp(\gamma_{g'}^\top x)}, \quad g = 0, 1, \ldots, G,$$
  with $\gamma_0 \equiv 0$ for the never-treated reference cohort and $\gamma_g$ drawn from a Gaussian (scaled by `assignment_strength`) for each treated cohort $g$.
- `assignment_type = "ordered"`: an ordered-logit (proportional-odds) model that exploits the natural ordering of the adoption times. The ordinal scale runs through treated cohorts in temporal-adoption order with never-treated at the top: slot 1 is cohort 1 (the earliest-adopting cohort), slot 2 is cohort 2, ..., slot $G$ is cohort $G$ (the latest-adopting cohort), and slot $G + 1$ is never-treated. Cumulative probabilities follow the standard McCullagh proportional-odds parameterization with the subtraction convention,
  $$P(W' \le g \mid x) = \mathrm{plogis}(\alpha_g - \gamma^\top x), \quad g = 1, \ldots, G,$$
  with a single shared $\gamma$ vector and cutpoints $\alpha_g$ chosen so the marginal cohort probabilities are approximately uniform $1/(G + 1)$. As $\gamma^\top x$ increases monotonically, the modal cohort traverses the ordinal scale in order: cohort 1 $\to$ cohort 2 $\to \ldots \to$ cohort $G$ $\to$ never-treated. Low $\gamma^\top x$ concentrates mass at the *bottom* of the scale (earliest-adopting cohorts); high $\gamma^\top x$ concentrates mass at the *top* (never-treated, with later cohorts at intermediate values). $\gamma$ is the propensity for being *later* on the treatment-timing scale.

Both DGPs are the canonical reference models named in the FETWFE paper (Faletto 2025, line 1016).

## Multinomial-logit DGP

We generate a panel where adoption probabilities depend on the covariates via a multinomial-logit model. The `assignment_strength` parameter scales the logit coefficients: at `assignment_strength = 0`, the DGP collapses to the marginal case (uniform $1/(G + 1)$ probabilities, regardless of $X$); larger values produce stronger covariate-cohort coupling.

```{r}
sim_coefs_mn <- genCoefs(
  G                   = 3,
  T                   = 4,
  d                   = 2,
  density             = 0.1,
  eff_size            = 2,
  assignment_type     = "multinomial",
  assignment_strength = 1.0,
  seed                = 101
)
```

The returned `FETWFE_coefs` object now carries three new slots:

```{r}
sim_coefs_mn$assignment_type
sim_coefs_mn$assignment_strength
str(sim_coefs_mn$assignment_coefs)
```

`assignment_coefs$coefs` is the $d \times G$ matrix of $\gamma_g$ vectors; column $g$ holds the propensity coefficients for treated cohort $g$. We can simulate a panel and verify that the empirical cohort proportions reflect covariate-cohort coupling rather than uniform marginal assignment:

```{r}
sim_mn <- simulateData(
  sim_coefs_mn,
  N            = 200,
  sig_eps_sq   = 1,
  sig_eps_c_sq = 1,
  seed         = 101
)
cat("Empirical cohort proportions:\n")
print(round(sim_mn$assignments / sum(sim_mn$assignments), 3))
```

We can fit FETWFE on this panel exactly as before:

```{r}
fit_mn <- fetwfeWithSimulatedData(sim_mn, verbose = FALSE)
cat("Estimated overall ATT:", fit_mn$att_hat, "\n")
```

## Ordered-logit DGP

The proportional-odds variant uses a single shared $\gamma$ vector and $G$ cutpoints. It is appropriate when the cohorts have a natural ordering — earlier adopters being "more like" later adopters than never-treated units, for instance. The ordinal scale runs in temporal order with never-treated at the top: slot 1 = cohort 1 (first treated), ..., slot $G$ = cohort $G$ (last treated), slot $G + 1$ = never-treated. As $\gamma^\top x$ increases, the modal cohort moves up the scale from cohort 1 toward never-treated.

```{r}
sim_coefs_ord <- genCoefs(
  G                   = 3,
  T                   = 4,
  d                   = 2,
  density             = 0.1,
  eff_size            = 2,
  assignment_type     = "ordered",
  assignment_strength = 1.0,
  seed                = 101
)
```

```{r}
str(sim_coefs_ord$assignment_coefs)
```

Here `assignment_coefs$coefs` is a length-$d$ vector (the shared $\gamma$) and `cutpoints` is the length-$G$ vector of cutpoints $\alpha_g$. The cutpoints are chosen by root-finding on the marginal-uniform condition: for each $g \in 1, \ldots, G$, the package solves for $\alpha_g$ such that $E_X[\mathrm{plogis}(\alpha_g - \gamma^\top X)] = g / (G + 1)$. This guarantees that the marginal cohort proportions are approximately uniform $1/(G + 1)$ for any choice of `assignment_strength` — the strength controls the *conditional* coupling between $X$ and $W$, not the marginal cohort proportions.

```{r}
sim_ord <- simulateData(
  sim_coefs_ord,
  N            = 200,
  sig_eps_sq   = 1,
  sig_eps_c_sq = 1,
  seed         = 101
)
cat("Empirical cohort proportions (ordered):\n")
print(round(sim_ord$assignments / sum(sim_ord$assignments), 3))
```

## Effect of assignment strength

The `assignment_strength` parameter controls how strongly the covariates predict cohort membership. Below is a side-by-side comparison of the cohort distribution at three strength levels:

```{r}
for (s in c(0.0, 1.0, 5.0)) {
  coefs_s <- genCoefs(
    G                   = 3,
    T                   = 4,
    d                   = 2,
    density             = 0.1,
    eff_size            = 2,
    assignment_type     = "multinomial",
    assignment_strength = s,
    seed                = 101
  )
  sim_s <- simulateData(
    coefs_s,
    N            = 200,
    sig_eps_sq   = 1,
    sig_eps_c_sq = 1,
    seed         = 101
  )
  cat(sprintf(
    "strength = %.1f -> proportions: %s\n",
    s,
    paste(
      round(sim_s$assignments / sum(sim_s$assignments), 3),
      collapse = ", "
    )
  ))
}
```

At `assignment_strength = 0` the cohort proportions are approximately uniform $1/(G + 1) = 0.25$. At higher strength, the proportions deviate from uniform because the covariate distribution drives the assignment.

## Nonlinear propensity via interactions

Sometimes the propensity model needs to be nonlinear in the covariates — for example, to stress-test estimators that assume a linear propensity. The `assignment_interactions` argument injects multiplicative interaction terms into the propensity model only; the outcome model continues to use the original covariates. The argument accepts a list of integer pairs naming the columns of $X$ whose product enters the propensity:

```{r}
sim_coefs_int <- genCoefs(
  G                               = 3,
  T                               = 4,
  d                               = 2,
  density                         = 0.1,
  eff_size                        = 2,
  assignment_type                 = "multinomial",
  assignment_strength             = 1.0,
  assignment_interactions         = list(c(1, 2)),
  assignment_interaction_strength = 1.0,
  seed                            = 101
)
str(sim_coefs_int$assignment_coefs$interactions)
str(sim_coefs_int$assignment_coefs$delta)
```

Self-interactions are allowed and produce quadratic terms $x_j^2$:

```{r}
sim_coefs_quad <- genCoefs(
  G                       = 3,
  T                       = 4,
  d                       = 2,
  density                 = 0.1,
  eff_size                = 2,
  assignment_type         = "multinomial",
  assignment_interactions = list(c(1, 1)),
  seed                    = 101
)
```

The interaction columns are constructed internally inside the propensity model each time it is evaluated; the simulated `pdata` keeps just the original $d$ covariate columns (the outcome design matrix is unchanged). Pairs are canonicalized to `c(min, max)` and duplicates are silently deduplicated.

## Truth derivation under non-marginal DGPs

Under the default marginal DGP, the overall true ATT is the equal-weighted average of the cohort-specific effects:
$$\tau_\mathrm{ATT} = \frac{1}{G} \sum_{g = 1}^{G} \tau_\mathrm{ATT}(g).$$

Under covariate-dependent DGPs this aggregation must use cohort-share weights instead, matching Faletto (2025) Eq. `att.estimator.weighted` (line 837). `getTes()` automatically computes the propensity-weighted aggregation when the input `FETWFE_coefs` object was built with a non-marginal `assignment_type`:

```{r}
te_mn <- getTes(sim_coefs_mn)
cat("Multinomial-DGP cohort weights:\n")
print(round(te_mn$cohort_weights, 3))
cat("Propensity-weighted overall ATT:", round(te_mn$att_true, 4), "\n")
cat("(Uniform-weighted ATT for comparison:",
    round(mean(te_mn$actual_cohort_tes), 4), ")\n")
```

The `cohort_weights` slot is new in 1.14.0. Under the marginal DGP it is exactly uniform $1/G$; under the multinomial or ordered DGPs it reflects $E[\pi_g(X)] / \sum_{g' \text{ treated}} E[\pi_{g'}(X)]$, estimated by Monte Carlo integration over the covariate distribution.

The per-cohort CATT vector `actual_cohort_tes` is unchanged across DGPs — it is intrinsic to the coefficient vector $\beta$ and does not depend on how cohorts are assigned.

# Conclusion

In this vignette, we walked through how to use the simulation functions in the `{fetwfe}` package to simulate data and run simulations similar to the ones in the simulation studies section of [the FETWFE paper](https://arxiv.org/abs/2312.05985). Version 1.14.0 adds covariate-dependent cohort-assignment DGPs (multinomial-logit and ordered-logit) so Monte Carlo studies can stress-test FETWFE against the realistic regime where cohort selection depends on observable characteristics.

This pipeline streamlines simulation experiments so you can rapidly evaluate FETWFE’s performance under varying scenarios. For more details, consult the package documentation or reach out to the author.