Quasi-Optimal Adjustment Set in Causal Inference
- Quasi-optimal adjustment sets are collections of covariates designed to nearly minimize variance in causal effect estimation while ensuring valid identifiability.
- The methodology uses d-separation, enumerates valid adjustment sets, and applies ε or δ tolerance parameters to assess trade-offs in high-dimensional settings.
- Empirical studies demonstrate that quasi-optimal sets achieve up to 20% variance reduction and lower sample complexity compared to complete valid adjustment sets.
A quasi-optimal adjustment set is a collection of covariates employed in covariate adjustment for causal effect estimation that, by construction or design, minimizes (or nearly minimizes) the variance of the estimator among all valid adjustment sets, subject to identifiability and practical considerations. This concept enables practitioners to balance statistical efficiency against measurement or computational cost while maintaining identification of the causal estimand in finite-dimensional, high-dimensional, and longitudinal settings. The quasi-optimality notion generalizes full optimality by permitting tolerance parameters (additive or multiplicative variance gaps) and approximate adjustment conditions when finite-sample, computational, or model-complexity considerations preclude oracle optimality.
1. Adjustment Sets and Optimality: Foundational Concepts
A valid adjustment set $\mathbf{Z}$ for estimating the causal effect of a treatment $X$ on an outcome $Y$ in a causal DAG (or more generally, a structured graphical model) must satisfy:
- No element of $\mathbf{Z}$ is a descendant of $X$ along a causal path from $X$ to $Y$;
- $\mathbf{Z}$ blocks (d-separates) all non-causal (back-door) paths from $X$ to $Y$.
The classic adjustment formula asserts that, for any valid adjustment set $\mathbf{Z}$,
$$P\big(Y = y \mid \operatorname{do}(X = x)\big) = \sum_{\mathbf{z}} P\big(Y = y \mid X = x, \mathbf{Z} = \mathbf{z}\big)\, P\big(\mathbf{Z} = \mathbf{z}\big).$$
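To make the formula concrete, the following is a minimal plug-in sketch for discrete covariates. The column names and the two-covariate adjustment set are hypothetical placeholders, and this is not an implementation from any of the cited papers.

```python
import pandas as pd

def backdoor_mean(df: pd.DataFrame, x_value, treatment="X", outcome="Y",
                  adjustment=("Z1", "Z2")) -> float:
    """Plug-in estimate of E[Y | do(X = x_value)] via the adjustment formula."""
    adjustment = list(adjustment)
    # Marginal distribution of the adjustment strata, P(Z = z).
    strata_probs = df.groupby(adjustment).size() / len(df)
    # Stratum-specific outcome means among units with X = x_value, i.e. E[Y | X = x, Z = z].
    cond_means = df[df[treatment] == x_value].groupby(adjustment)[outcome].mean()
    # Weight each conditional mean by its stratum probability; strata with no
    # units at X = x_value (positivity violations) are silently dropped here.
    return float((cond_means.reindex(strata_probs.index) * strata_probs).sum())
```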
Optimal adjustment seeks a valid set $\mathbf{Z}$ such that the variance of the causal effect estimator is minimized among all valid adjustment sets. In both linear models and semiparametric settings this optimal set is unique and minimizes the asymptotic or PAC (Probably Approximately Correct) error, and, in high-dimensional settings, it obviates the exponential sample complexity imposed by unnecessarily high-cardinality adjustments (Henckel et al., 2019, Choo et al., 2024, Rotnitzky et al., 2019, Adenyo et al., 2024).
2. Formal Definition and Characterization of Quasi-Optimal Adjustment Sets
Quasi-optimality formalizes near-minimal variance subject to either a multiplicative or an additive tolerance. Given $\mathbf{O}$, the (oracle) optimal adjustment set, a set $\mathbf{Z}$ is $\varepsilon$-quasi-optimal if
$$\operatorname{avar}\big(\hat{\tau}_{\mathbf{Z}}\big) \le (1 + \varepsilon)\, \operatorname{avar}\big(\hat{\tau}_{\mathbf{O}}\big),$$
or $\delta$-quasi-optimal if
$$\operatorname{avar}\big(\hat{\tau}_{\mathbf{Z}}\big) \le \operatorname{avar}\big(\hat{\tau}_{\mathbf{O}}\big) + \delta,$$
where $\hat{\tau}_{\mathbf{Z}}$ is the estimator of the mean under intervention adjusting for $\mathbf{Z}$ (Rotnitzky et al., 2019).
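Expressed procedurally, the two tolerances amount to a simple comparison of variance estimates. The sketch below assumes such estimates (e.g., plug-in or bootstrap values for the candidate and oracle sets) are supplied from elsewhere; it is illustrative only.

```python
def is_quasi_optimal(var_candidate: float, var_oracle: float,
                     eps: float = 0.0, delta: float = 0.0) -> bool:
    """Check the multiplicative (eps) and additive (delta) tolerances above.

    The arguments are assumed to be asymptotic-variance estimates for the
    candidate and oracle-optimal adjustment sets, obtained elsewhere.
    """
    return (var_candidate <= (1.0 + eps) * var_oracle
            or var_candidate <= var_oracle + delta)
```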
In summary causal graphs (SCGs), or in high-dimensional settings where the structure is only partially known or the data are discrete, the "quasi-optimal" set is the smallest valid adjustment set (according to a generalized back-door criterion) that attains the optimal variance in some compatible fine-timescale DAG and that, in every compatible DAG, contains the non-descendant part of the DAG-optimal set (Belciug et al., 20 Dec 2025).
3. Algorithms and Graphical Criteria for Construction
DAG and CPDAG Models
The optimal adjustment set in a DAG is characterized graphically by
$$\mathbf{O}(X, Y) = \operatorname{pa}\big(\operatorname{cn}(X, Y)\big) \setminus \big(\operatorname{de}\big(\operatorname{cn}(X, Y)\big) \cup \{X\}\big),$$
where $\operatorname{cn}(X, Y)$ contains the mediators on proper causal paths from $X$ to $Y$ together with $Y$ itself, and $\operatorname{de}(\operatorname{cn}(X, Y)) \cup \{X\}$ (the forbidden set) contains their descendants and $X$ itself (Henckel et al., 2019, Rotnitzky et al., 2019).
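A minimal sketch of this graphical construction, assuming the DAG is fully known and encoded as an acyclic networkx DiGraph, is given below; it mirrors the definition of $\mathbf{O}(X, Y)$ but is not the reference implementation of the cited papers.

```python
import networkx as nx

def optimal_adjustment_set(G: nx.DiGraph, x, y) -> set:
    """Return O(x, y) = pa(cn(x, y)) \\ (de(cn(x, y)) ∪ {x})."""
    # cn(x, y): nodes other than x lying on a directed (proper causal) path from x to y.
    cn = nx.descendants(G, x) & (nx.ancestors(G, y) | {y})
    # Forbidden nodes: the causal nodes, their descendants, and x itself.
    forbidden = set(cn) | {x}
    for node in cn:
        forbidden |= nx.descendants(G, node)
    # Parents of the causal nodes, minus the forbidden nodes.
    parents = set()
    for node in cn:
        parents |= set(G.predecessors(node))
    return parents - forbidden
```

On the toy DAG $C \to X \to M \to Y$ with $C \to Y$, for example, this returns $\{C\}$: $M$ and its descendants are forbidden, and $C$ is the only remaining parent of a causal node.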
To construct (quasi-)optimal sets:
- Enumerate all valid sets using back-door and d-separation criteria.
- Evaluate the asymptotic variance for regression, IPW, or AIPW estimators.
- Select a minimal set or those satisfying user-specified quasi-optimality tolerances (Rotnitzky et al., 2019, Luo et al., 2024).
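A brute-force sketch of the enumeration step over a small pool of candidate covariates is given below. It uses Pearl's back-door criterion for validity and assumes networkx >= 2.8 (where `nx.d_separated` is available; newer releases rename it `is_d_separator`); it is illustrative only, not the algorithm of the cited papers.

```python
from itertools import combinations
import networkx as nx

def valid_backdoor_sets(G: nx.DiGraph, x, y, candidates):
    """Yield every subset of `candidates` satisfying the back-door criterion."""
    # Removing the edges out of x leaves exactly the back-door paths between x and y.
    G_backdoor = G.copy()
    G_backdoor.remove_edges_from(list(G.out_edges(x)))
    # Condition (i): no element of the set may be a descendant of the treatment.
    allowed = [c for c in candidates
               if c not in nx.descendants(G, x) and c not in (x, y)]
    for r in range(len(allowed) + 1):
        for z in combinations(allowed, r):
            # Condition (ii): the set blocks every back-door path, i.e. it
            # d-separates x and y once the causal edges out of x are removed.
            if nx.d_separated(G_backdoor, {x}, {y}, set(z)):
                yield set(z)
```

In practice one would cap the subset size or pre-screen the candidates, since this enumeration is exponential in the number of candidate covariates.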
Constraint-Based and PAC Approaches
For high-dimensional and discrete distributions, the ε-Markov blanket approach approximates the adjustment set by identifying a set $\mathbf{Z}$ that renders the treatment approximately conditionally independent of the remaining covariates given $\mathbf{Z}$, and bounds the corresponding bias in terms of ε and a positivity constant (Choo et al., 2024). Two algorithmic approaches are proposed:
- AMBA: Exhaustively enumerates candidate subsets and selects the smallest one that passes an approximate conditional-independence test between the treatment and the excluded covariates, given the candidate set.
- BAMBA: Further shrinks this to a set satisfying screening-set conditions involving approximate independence for both the treatment and the outcome, subject to error bounds.
The combined PAC bound asserts that, with enough samples and careful misspecification control, adjusting on a quasi-optimal subset achieves accuracy close to direct adjustment on the large (oracle) valid set, with drastically reduced sample complexity (Choo et al., 2024).
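The exhaustive "smallest approximate Markov blanket" idea can be sketched schematically as below. Here `ci_dependence` is a hypothetical user-supplied measure of conditional dependence (e.g., an estimated conditional mutual information), and nothing in this snippet reproduces the actual AMBA/BAMBA procedures of Choo et al. (2024).

```python
from itertools import combinations

def smallest_approx_blanket(data, treatment, covariates, ci_dependence, eps):
    """Smallest Z ⊆ covariates with the treatment eps-independent of the rest given Z."""
    covariates = list(covariates)
    for size in range(len(covariates) + 1):        # try smaller blankets first
        for z in combinations(covariates, size):
            rest = [c for c in covariates if c not in z]
            # An empty remainder trivially qualifies; otherwise test approximate
            # conditional independence of the treatment from the remainder given z.
            if not rest or ci_dependence(data, treatment, rest, list(z)) <= eps:
                return set(z)
    return set(covariates)                          # fallback: the full set always qualifies
```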
Time-Dependent/Longitudinal and Abstracted Graphs
For time-dependent interventions or SCGs,
- One exploits new inclusion/exclusion lemmas leveraging d-separation on expanded blocks of variables (“Q-nodes”), constructing a sufficient set via greedy forward/backward moves.
- The quasi-optimal set is the terminal set of such an expansion, provably attaining minimum achievable variance relative to all reachable sufficient sets by such graphical moves (Adenyo et al., 2024).
In SCGs, the quasi-optimal adjustment set is computed by constructing parent sets of extended causal nodes, excluding possible descendants of the treatment, tailored to the specific identifiability scenario (Belciug et al., 20 Dec 2025).
4. Variance Comparison, Theoretical Guarantees, and PAC Bounds
Comparison criteria are graphically characterized. For two valid sets $\mathbf{Z}_1$ and $\mathbf{Z}_2$, the asymptotic variances satisfy $\operatorname{avar}(\hat{\tau}_{\mathbf{Z}_1}) \le \operatorname{avar}(\hat{\tau}_{\mathbf{Z}_2})$ whenever, in terms of d-separation in the graph,
$$\mathbf{Z}_2 \setminus \mathbf{Z}_1 \;\perp\; Y \mid \mathbf{Z}_1 \cup \{X\} \quad\text{and}\quad \mathbf{Z}_1 \setminus \mathbf{Z}_2 \;\perp\; X \mid \mathbf{Z}_2.$$
This yields both constructive pruning procedures and certificates of (quasi-)optimality (Rotnitzky et al., 2019, Henckel et al., 2019).
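A greedy backward-pruning sketch built on the deletion part of this criterion is shown below: a covariate that is d-separated from the outcome given the rest of the set and the treatment can be removed without harming validity or asymptotic efficiency. It assumes the DAG is fully known as a networkx DiGraph and networkx >= 2.8; it is a sketch, not the procedure of any cited paper.

```python
import networkx as nx

def prune_adjustment_set(G: nx.DiGraph, x, y, adjustment: set) -> set:
    """Greedily drop any w with Y ⟂ w | (Z \\ {w}) ∪ {X} from a valid set Z."""
    Z = set(adjustment)
    changed = True
    while changed:
        changed = False
        for w in sorted(Z, key=str):               # deterministic sweep order
            conditioning = (Z - {w}) | {x}
            if nx.d_separated(G, {y}, {w}, conditioning):
                Z.discard(w)
                changed = True
    return Z
```

Each deletion step preserves validity and never increases the asymptotic variance, so the returned set is at least as efficient as the input set.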
In finite/high-dimensional settings, PAC bounds quantify estimation error, showing that the sample complexity is exponential in the size of the adjustment used, but can be dramatically reduced by using (quasi-)optimal Markov blankets or screened subsets, at the cost of a controlled bias (Choo et al., 2024).
5. Connections to Minimal, Sufficient, and Exhaustive Sets
Enumeration of all sufficient adjustment sets enables principled selection of quasi-optimal sets via explicit trade-offs:
- Minimal cardinality or sparsity
- Asymptotic variance (or plug-in variance estimate)
- Practical constraints (measurement or computational resource demands)
By enumerating all sufficient sets (e.g., all $\mathbf{Z}$ satisfying the back-door/adjustment criterion), one can extract every valid adjustment set and then select among them those that are quasi-optimal under the stated variance criterion (Luo et al., 2024).
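A schematic selection step over an already-enumerated family of sufficient sets, combining the variance, cardinality, and cost trade-offs above, might look as follows. Here `variance_of` and `cost_of` are assumed callables (e.g., a plug-in variance estimate and a total measurement cost); they are placeholders, not part of any cited package.

```python
def choose_among_sufficient(sufficient_sets, variance_of, cost_of, eps=0.1):
    """Among sets within (1 + eps) of the best variance, return the cheapest."""
    sets = list(sufficient_sets)
    best_var = min(variance_of(z) for z in sets)
    # Keep every set meeting the multiplicative quasi-optimality tolerance.
    eligible = [z for z in sets if variance_of(z) <= (1 + eps) * best_var]
    # Break ties by cost first, then by cardinality.
    return min(eligible, key=lambda z: (cost_of(z), len(z)))
```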
6. Practical Implications and Applied Examples
Empirical results from simulation and real-data studies confirm that quasi-optimal adjustment sets systematically yield lower or comparable standard errors compared to naively larger valid sets, without incurring bias or relying on perfect causal structure recovery. In high-dimensional situations, this may mean using just a sparse subset (e.g., an approximate Markov blanket rather than the full valid set), leading to substantial reductions in sample complexity and estimator variance (Choo et al., 2024). In time-varying or SCG settings, quasi-optimal sets can reduce variance by 10–20% over previous graph-based criteria (Belciug et al., 20 Dec 2025, Adenyo et al., 2024).
| Model Class | Adjustment Set Criterion | Quasi-Optimality Features |
|---|---|---|
| Classical DAGs | Parents of mediators, excluding forbidden nodes | Minimal variance, proven optimality |
| High-dimensional discrete | ε-Markov blankets, screening sets | Controlled bias, reduced dimensionality |
| Time-dependent/SCGs | Back-door in summary graph, parent sets | Compatible with all fine-scale DAGs |
7. Theoretical and Methodological Extensions
Quasi-optimal adjustment set theory continues to extend to:
- CPDAGs, maximal PDAGs, and summary graphs, allowing for ambiguity and latent summary structure (Belciug et al., 20 Dec 2025, Henckel et al., 2019);
- Nonparametric models: all key results generalize from linear to nonparametric settings using influence function calculus and plug-in variance estimates (Rotnitzky et al., 2019);
- Exhaustive enumeration and selection-based practical workflows, including open-source code, for problems with a large but feasible number of candidate covariates (Luo et al., 2024);
- Constraint-based and data-driven selection under finite-sample PAC error bounds and misspecification robustness (Choo et al., 2024).
Altogether, the notion of quasi-optimal adjustment sets enables rigorous, sample- and graph-aware selection of adjustment covariates with guarantees for identifiability, efficiency, and scalability in modern causal inference (Choo et al., 2024, Belciug et al., 20 Dec 2025, Rotnitzky et al., 2019, Adenyo et al., 2024, Henckel et al., 2019, Luo et al., 2024).