
Sample-Level Estimands: Definitions, Variance, and Inference

Updated 24 August 2025
  • Sample-Level Estimands are defined as functions of unit-level potential outcomes that capture treatment effects within the observed sample.
  • They enable refined variance estimation and prediction intervals, improving inferential accuracy in settings with heterogeneity and complex designs.
  • Applications span cluster trials, handling of intercurrent events, and Bayesian computation, with software tools facilitating practical implementation.

Sample-level estimands are parameters defined at the scale of the observed sample, typically in randomized trials or observational studies. They quantify causal contrasts, most commonly treatment effects, at either the individual or the aggregate sample level. Unlike population-level estimands, which target averages over a superpopulation (such as PATE or CATE), sample-level estimands are functions of the potential outcomes (some of them counterfactual) of the sampled units. Their definition, identification, inference, and interpretation vary across study designs and analytic frameworks. The following sections cover their formal definitions, variance properties, handling in complex designs, relation to regression estimands, implications for inference, and computational practice.

1. Formal Definitions and Classes

Sample-level estimands are typically defined as functions of unit-level potential outcomes for the $N$ units observed in a study. The canonical forms include:

  • Sample Average Treatment Effect (SATE):

$$\mathrm{SATE} = \frac{1}{N} \sum_{i=1}^N [Y_i(1) - Y_i(0)]$$

where $Y_i(1)$ and $Y_i(0)$ are the potential outcomes for unit $i$ under treatment and control, respectively.

  • Sample Average Treatment Effect on the Treated (SATT):

$$\mathrm{SATT} = \frac{1}{m} \sum_{i=1}^N [Y_i(1) - Y_i(0)] \cdot T_i$$

with $T_i$ the treatment indicator; $m = \sum_{i=1}^N T_i$ is the total number of treated units.

  • Sample Average Treatment Effect on the Controls (SATC):

$$\mathrm{SATC} = \frac{1}{N-m} \sum_{i=1}^N [Y_i(1) - Y_i(0)] \cdot (1-T_i)$$

In a randomized trial, SATT and SATC depend on the realized treatment assignment and are therefore random over assignment, whereas SATE is fixed for the sample.

Generalizations include linear mixtures:

$$\omega\,\mathrm{SATT} + (1-\omega)\,\mathrm{SATC}$$

where SATE is obtained with $\omega = p$, with $p = m/N$ the treatment proportion.
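The definitions above can be made concrete with a small sketch. It assumes, purely for illustration, that both potential outcomes are known for every unit (never true in practice); the function name `sample_estimands` and the six-unit data set are hypothetical.

```python
import numpy as np

def sample_estimands(y1, y0, t):
    """Return (SATE, SATT, SATC) given full potential outcomes and assignment t."""
    y1, y0, t = np.asarray(y1, float), np.asarray(y0, float), np.asarray(t, int)
    tau = y1 - y0                 # unit-level effects Y_i(1) - Y_i(0)
    sate = tau.mean()             # average over all N units
    satt = tau[t == 1].mean()     # average over the m treated units
    satc = tau[t == 0].mean()     # average over the N - m control units
    return sate, satt, satc

# Hypothetical sample of N = 6 units (illustrative numbers only)
y1 = np.array([5.0, 3.0, 4.0, 6.0, 2.0, 7.0])
y0 = np.array([4.0, 3.0, 1.0, 2.0, 2.0, 5.0])
t = np.array([1, 0, 1, 0, 0, 1])

sate, satt, satc = sample_estimands(y1, y0, t)
p = t.mean()                          # treatment proportion m / N
mixture = p * satt + (1 - p) * satc   # the mixture with omega = p
print(sate, satt, satc, mixture)
```

Re-running with a different assignment vector `t` changes SATT and SATC but leaves SATE unchanged, and the mixture with weight `p` always reproduces SATE.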

In observational studies, sample-level estimands such as ATT, ATU, and ATO are defined in terms of observed treatment group membership and overlap weights (Greifer et al., 2021).

2. Variance Properties and Inference

A critical advance (Sekhon et al., 2017) is the derivation of non-conservative variance formulas for sample-level estimands. For the difference-in-means estimator $t = \overline{Y}_1 - \overline{Y}_0$:

  • When centered on SATE:

$$\mathrm{Var}(t - \mathrm{SATE}) = \frac{1}{N p (1-p)} \left[ p^2 \sigma_0^2 + (1-p)^2 \sigma_1^2 + 2 p (1-p) \rho \sigma_0 \sigma_1 \right]$$

with $p = m/N$, $\sigma_j^2$ the finite-sample variance of the potential outcomes $Y_i(j)$, and $\rho$ the correlation between $Y_i(0)$ and $Y_i(1)$.

  • Centered on SATT:

$$\mathrm{Var}(t - \mathrm{SATT}) = \frac{1}{N p (1-p)} \sigma_0^2$$

  • Centered on SATC:

$$\mathrm{Var}(t - \mathrm{SATC}) = \frac{1}{N p (1-p)} \sigma_1^2$$

Prediction intervals for SATT and SATC avoid conservatism, since they do not require bounding the unobservable $\rho$. The optimal mixture (SATO) minimizes variance with:

$$\omega^* = \frac{(\sigma_1/\sigma_0)^2 - \rho(\sigma_1/\sigma_0)}{(\sigma_1/\sigma_0)^2 + 1 - 2\rho(\sigma_1/\sigma_0)}$$

These refined variance expressions show that interval width, and hence inferential precision, is sensitive to finite-sample heterogeneity, group proportions, and the variance structure of the outcomes.
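The three variance formulas and the optimal weight can be tied together in a short sketch. It assumes the mixture variance has the form $[\omega^2\sigma_0^2 + (1-\omega)^2\sigma_1^2 + 2\omega(1-\omega)\rho\sigma_0\sigma_1] / (Np(1-p))$, which reduces to the SATE, SATT, and SATC expressions above at $\omega = p$, $1$, and $0$. The function names and input values are hypothetical, and because $\rho$ is unobservable, $\omega^*$ is computable only under an assumed value of it.

```python
def var_mixture(omega, sigma0, sigma1, rho, n, p):
    """Variance of t minus the mixture omega*SATT + (1 - omega)*SATC."""
    v = (omega**2 * sigma0**2
         + (1 - omega)**2 * sigma1**2
         + 2 * omega * (1 - omega) * rho * sigma0 * sigma1)
    return v / (n * p * (1 - p))

def optimal_omega(sigma0, sigma1, rho):
    """Variance-minimizing weight; undefined when sigma0 == sigma1 and rho == 1."""
    r = sigma1 / sigma0
    return (r**2 - rho * r) / (r**2 + 1 - 2 * rho * r)

# Illustrative values only; rho is assumed, not estimated
n, p = 100, 0.5
sigma0, sigma1, rho = 1.0, 2.0, 0.2

w_star = optimal_omega(sigma0, sigma1, rho)
for label, w in [("SATE", p), ("SATT", 1.0), ("SATC", 0.0), ("SATO", w_star)]:
    print(label, round(w, 3), var_mixture(w, sigma0, sigma1, rho, n, p))
```

With the control-arm outcomes less variable than the treated-arm outcomes, as in these inputs, the optimal weight sits close to the SATT end of the mixture.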

3. Sample-Level Estimands in Complex Designs

Cluster Randomized Trials and Selection Bias

In cluster trials with post-randomization selection, distinct sample-level estimands exist for the overall population and the recruited subpopulation (Li et al., 2021). Given principal stratification for recruitment, one has:

  • Overall population ATE:

$$\tau = \sum_s p_s \tau_s$$

  • Recruited population ATE:

$$\tau_R = \frac{p_a}{p_a + p_c} \tau_a + \frac{p_c}{p_a + p_c} \tau_c$$

Here $p_s$ denotes the membership proportion of principal stratum $s$ and $\tau_s$ its average treatment effect; strata $a$ and $c$ are the ones that make up the recruited population. Inferences using recruited samples may be biased if treatment effects differ across principal strata or if recruitment probabilities are not balanced.
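A small numerical sketch shows how the two estimands can diverge. The stratum proportions and effects are invented for illustration, and the third stratum labeled `n` (units outside the recruited population) is a hypothetical addition so that the overall and recruited averages differ.

```python
def overall_ate(props, effects):
    """tau = sum over strata of p_s * tau_s."""
    return sum(props[s] * effects[s] for s in props)

def recruited_ate(props, effects, recruited=("a", "c")):
    """tau_R: weighted average over the strata in the recruited population."""
    total = sum(props[s] for s in recruited)
    return sum(props[s] * effects[s] for s in recruited) / total

# Hypothetical stratum proportions and stratum-specific effects
props = {"a": 0.5, "c": 0.3, "n": 0.2}
effects = {"a": 1.0, "c": 2.0, "n": 0.5}

tau = overall_ate(props, effects)
tau_r = recruited_ate(props, effects)
print(tau, tau_r)
```

When every stratum has the same effect, the two functions return the same number, which is the case in which recruited-sample inference is not distorted by effect heterogeneity across strata.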

Interference and Spillovers

In settings with strong interference (e.g. social networks, vaccine studies), new sample-level estimands contrast direct and indirect exposure effects via "attributable effects" (Choi, 2021):

  • For unit ii:

$$A_i = Y_i - \theta_i$$

($\theta_i$ the counterfactual under uniformity.) Contrasts such as

$$\tau_1 = \frac{1}{N_1} \sum_{X_i=1} A_i - \frac{1}{N_0} \sum_{X_i=0} A_i$$

provide lower bounds on the number of units affected. These estimands are identified under randomization alone, though at the cost of wider prediction intervals.

4. Estimands and Regression Frameworks

Linear regression estimands in the presence of treatment effect heterogeneity are sample-level contrasts but can be misinterpreted (Słoczyński, 2018). The OLS coefficient for treatment, $\tau$, is:

$$\tau = w_1 \cdot \tau_{\text{APLE},1} + w_0 \cdot \tau_{\text{APLE},0}$$

where $\tau_{\text{APLE},g}$ are group-specific linear projection parameters, $w_0 = 1 - w_1$, and

$$w_1 = \frac{(1-\rho)\, \mathrm{Var}(p(X) \mid d=0)}{\rho\, \mathrm{Var}(p(X) \mid d=1) + (1-\rho)\, \mathrm{Var}(p(X) \mid d=0)}$$

In this section $\rho$ denotes the proportion of treated units, $P(d = 1)$, not the correlation of Section 2, and $p(X)$ is the propensity score.

OLS weights can be "reversed" compared to natural group proportions, strongly skewing the effective estimand toward the less common group in the sample. Diagnostics for weight alignment (e.g. $\delta = \rho - w_1$) are essential for interpreting regression-based treatment effects.
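The weight formula and the diagnostic can be computed directly once propensity score values are in hand. The sketch below assumes those values are supplied by the user, takes the treated share as the sample mean of the treatment indicator, and uses population variances (`ddof=0`); the function name `ols_weights` and the eight-unit data set are hypothetical.

```python
import numpy as np

def ols_weights(pscore, d):
    """Return (w1, w0, delta) for the OLS weighting of group-specific effects."""
    pscore, d = np.asarray(pscore, float), np.asarray(d, int)
    rho = d.mean()                    # share of treated units, P(d = 1)
    v1 = pscore[d == 1].var()         # Var(p(X) | d = 1), population variance
    v0 = pscore[d == 0].var()         # Var(p(X) | d = 0), population variance
    w1 = (1 - rho) * v0 / (rho * v1 + (1 - rho) * v0)
    w0 = 1 - w1                       # the two weights sum to one
    delta = rho - w1                  # alignment diagnostic from the text
    return w1, w0, delta

# Hypothetical data: 2 treated and 6 control units
d = np.array([1, 1, 0, 0, 0, 0, 0, 0])
pscore = np.array([0.5, 0.7, 0.1, 0.2, 0.3, 0.1, 0.4, 0.1])

w1, w0, delta = ols_weights(pscore, d)
print(w1, w0, delta)
```

In this example only a quarter of the units are treated, yet the weight on the treated-group parameter is far above one quarter, so `delta` is strongly negative. That is the reversal pattern the diagnostic is meant to flag.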

5. Handling Intercurrent Events and Real-World Evidence

In clinical trials, intercurrent events (ICEs) call for estimands that reflect policy-relevant contrasts. The ICH E9 (R1) classification (treatment policy, hypothetical, composite, while-on-treatment, principal stratum) is often insufficient on its own. The hybrid estimand (Qu et al., 2020):

$$\mu_h = E\{ [Y_1 (1 - S_1) + (Y_0 + \delta) S_1 ] - Y_0 \}$$

mixes null effects for patients with safety-related ICEs and hypothetical effects for others. This approach is parameterized by the ICE indicator $S_1$ and adapts both imputation and analytic strategies.
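Expanding the bracket makes the mixing explicit. The rearrangement below uses only the symbols already in the formula and is a worked identity, not an additional result from the cited paper:

$$
\begin{aligned}
\mu_h &= E\{ Y_1 (1 - S_1) + Y_0 S_1 + \delta S_1 - Y_0 \} \\
      &= E\{ (Y_1 - Y_0)(1 - S_1) \} + \delta\, E\{ S_1 \}
\end{aligned}
$$

Patients without the ICE ($S_1 = 0$) contribute their individual effect $Y_1 - Y_0$, while patients with the ICE ($S_1 = 1$) contribute the constant $\delta$, which is a null effect when $\delta = 0$.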

In real-world evidence (RWE) studies, sample-level estimands must be constructed with reference to the observed population, accommodating heterogeneity of sample composition, complex treatment regimes, and multiple ICEs (Chen et al., 2023). The selection of attributes—population, treatment, endpoint, ICE strategy, summary measure—is crucial for valid inference.

6. Bayesian Computation and Identification

In Bayesian frameworks, sample-level estimands (ITE, SATE) are functions of both observed data and missing counterfactuals (Oganisian, 20 Aug 2025). Identification requires explicit cross-world assumptions (e.g., $f(Y(0), Y(1) \mid L) = f(Y(0) \mid L)\, f(Y(1) \mid L)$). Computation proceeds by posterior imputation of missing potential outcomes, typically via MCMC. The marginal posterior for sample-level estimands is derived from the full posterior over parameters and missing data:

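$$f(\mathbf{y}^M, \omega \mid D^o) \propto f(D^c; \omega)\, f(\omega)$$

where $\mathbf{y}^M$ collects the missing potential outcomes, $\omega$ the model parameters, $D^o$ the observed data, and $D^c$ the complete data.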

Stan code must declare missing counterfactuals as parameters. Errors commonly stem from conflating sample-level with population-level estimands, leading to faulty inference and misinterpretation.
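The imputation logic can be sketched without Stan in a deliberately simple setting. The sketch assumes a normal outcome model with known standard deviation `sigma`, flat priors on the two arm means, no covariates, and cross-world independence of $Y(0)$ and $Y(1)$. Under those assumptions the posterior is conjugate, so it uses direct Monte Carlo draws in place of MCMC. The function name `posterior_sate` and the simulated data are hypothetical.

```python
import numpy as np

def posterior_sate(y, t, sigma=1.0, n_draws=2000, seed=0):
    """Posterior draws of SATE obtained by imputing missing potential outcomes."""
    rng = np.random.default_rng(seed)
    y, t = np.asarray(y, float), np.asarray(t, int)
    treated, control = t == 1, t == 0
    n, n1, n0 = len(y), int(treated.sum()), int(control.sum())
    draws = np.empty(n_draws)
    for s in range(n_draws):
        # Draw arm means from their posteriors (flat prior, known sigma)
        mu1 = rng.normal(y[treated].mean(), sigma / np.sqrt(n1))
        mu0 = rng.normal(y[control].mean(), sigma / np.sqrt(n0))
        # Keep observed outcomes; impute only the missing counterfactuals
        y1 = np.where(treated, y, rng.normal(mu1, sigma, n))
        y0 = np.where(control, y, rng.normal(mu0, sigma, n))
        draws[s] = np.mean(y1 - y0)   # SATE for this completed data set
    return draws

# Simulated trial: 20 treated and 20 control units (illustrative only)
rng = np.random.default_rng(1)
t = np.repeat([1, 0], 20)
y = rng.normal(loc=1.0 * t, scale=1.0)

draws = posterior_sate(y, t, sigma=1.0, n_draws=2000, seed=0)
diff_means = y[t == 1].mean() - y[t == 0].mean()
print(draws.mean(), diff_means, np.percentile(draws, [2.5, 97.5]))
```

Because observed outcomes are held fixed and only the counterfactuals are imputed, the spread of `draws` reflects uncertainty about this sample's effect. A population-level estimand would instead be summarized by the draws of `mu1 - mu0`, which is the distinction the text warns against conflating.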

7. Empirical Performance and Software

Monte Carlo simulations (Sekhon et al., 2017, Wang et al., 19 Feb 2025) and large-scale online experiments indicate that interval estimation for SATT/SATC (or analogous overlap-population estimands, e.g., ATO) delivers better coverage and efficiency in scenarios with large group variances or substantial heterogeneity. Software routines (e.g., the R package estCI) implement analytic formulas for standard errors and prediction intervals, facilitating application at scale. In integrated RCT–external control designs, balancing weights derived from propensity scores for data source membership (Wang et al., 19 Feb 2025) define estimands over target populations (ATI, ATT, ATO), with simulations showing that the ATO estimator has the lowest bias and variance under poor overlap.

Summary Table: Core Sample-Level Estimands

| Estimand | Definition / Formula | Key Setting |
|---|---|---|
| SATE | $(1/N) \sum_{i=1}^N [Y_i(1) - Y_i(0)]$ | RCT or finite sample |
| SATT | $(1/m)\sum_{i=1}^{N}[Y_i(1) - Y_i(0)]\, T_i,\ m = \sum T_i$ | RCT, observed treated |
| SATC | $(1/(N-m))\sum_{i=1}^N [Y_i(1) - Y_i(0)](1-T_i)$ | RCT, observed controls |
| ATO (Overlap) | Weighted average in overlap region | RCT + EC, observational |
| Hybrid estimand (ICEs) | $\mu_h = E\{[Y_1(1 - S_1) + (Y_0 + \delta)S_1] - Y_0\}$ | Clinical trials, ICEs |
| ITE | $Y_i(1) - Y_i(0)$ | Bayesian, individual level |

The definition, inference, and computation of sample-level estimands require explicit attention to assignment mechanism, population structure, heterogeneity, ICEs, and potential outcomes modeling. Simulation and empirical studies highlight the efficiency and coverage benefits of choosing an estimand tailored to the sample characteristics and study design. Analytical routines and diagnostic tools are now broadly available to correctly specify, estimate, and interpret these parameters in applied research.