Causal Fairness Evaluation

Updated 15 April 2026

Causal fairness evaluation is a method that employs structural causal models and counterfactual reasoning to distinguish direct, indirect, and path-specific effects in biased decision-making systems.
It leverages techniques like back-door adjustment, inverse probability weighting, and doubly robust estimation to quantify fairness criteria under various causal assumptions.
Applications across healthcare, hiring, and policy-making enable legal auditing and targeted algorithmic interventions by precisely decomposing causal effects.

Causal fairness evaluation is the systematic analysis of fairness notions in decision-making and machine learning systems using the formal language and tools of causal inference. Rather than relying solely on associative, group-level parity metrics, causal fairness draws on structural causal models (SCMs), interventions (do-operator), and counterfactual reasoning to isolate, quantify, and decompose the precise mechanisms by which protected attributes (e.g., gender, race) influence outcomes. This approach enables robust identification of direct, indirect, and path-specific effects, supports actionable algorithmic interventions, and underpins both technical and legal standards for fair decision-making.

1. Causal Fairness Notions and Formal Definitions

Causal fairness evaluation establishes fairness criteria using structural causal models or the potential outcomes framework. Key definitions include:

Total Effect / Average Causal Effect (ACE): Quantifies the difference in outcome under hypothetical interventions on the sensitive attribute:

$ACE(Y,A) = P(Y=1 \mid do(A=1)) - P(Y=1 \mid do(A=0))$

This captures the overall causal impact of the protected attribute on the decision, separating spurious association from the true causal effect (Binkyte et al., 2022, Makhlouf et al., 2020).

Natural Direct Effect (NDE) and Natural Indirect Effect (NIE): Partition the total effect into a direct path (not mediated by specified variables) and an indirect path (fully mediated by a set of mediators, e.g., qualifications):

$NDE = P(Y_{A \leftarrow 1,\,\mathbf{Z} \leftarrow \mathbf{Z}_{A \leftarrow 0}}=1) - P(Y_{A \leftarrow 0}=1)$

$NIE = P(Y_{A \leftarrow 0,\,\mathbf{Z} \leftarrow \mathbf{Z}_{A \leftarrow 1}}=1) - P(Y_{A \leftarrow 0}=1)$

These decompositions ground disparate treatment vs. disparate impact assessments (Binkyte et al., 2022, Plecko et al., 2022, Makhlouf et al., 2020, Yu et al., 24 Mar 2026).
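As a minimal sketch of the NDE/NIE decomposition, the snippet below evaluates both quantities exactly on a small binary model. It assumes a hypothetical SCM with edges A -> Z -> Y and A -> Y and no confounding, so that the nested counterfactual reduces to the mediation formula $\sum_z P(Y=1 \mid A=a, Z=z)\,P(Z=z \mid A=a')$. All probabilities and function names are illustrative assumptions, not values from the cited works.

```python
# Hypothetical SCM: A -> Z -> Y plus a direct edge A -> Y, all binary,
# with no unmeasured confounding (an assumption of this sketch).
P_Z1_GIVEN_A = {0: 0.3, 1: 0.6}                 # P(Z=1 | A=a)
P_Y1_GIVEN_AZ = {(0, 0): 0.2, (0, 1): 0.5,      # P(Y=1 | A=a, Z=z), keyed (a, z)
                 (1, 0): 0.3, (1, 1): 0.6}


def p_y1_nested(a_direct, a_mediator):
    """P(Y_{A<-a_direct, Z<-Z_{A<-a_mediator}} = 1) via the mediation formula."""
    total = 0.0
    for z in (0, 1):
        p_z1 = P_Z1_GIVEN_A[a_mediator]
        p_z = p_z1 if z == 1 else 1.0 - p_z1
        total += P_Y1_GIVEN_AZ[(a_direct, z)] * p_z
    return total


baseline = p_y1_nested(0, 0)                    # P(Y_{A<-0} = 1)
nde = p_y1_nested(1, 0) - baseline              # direct path only
nie = p_y1_nested(0, 1) - baseline              # mediated path only
te = p_y1_nested(1, 1) - baseline               # total effect
# NIE of the reverse transition (A: 1 -> 0), used in the general identity.
nie_reverse = p_y1_nested(1, 0) - p_y1_nested(1, 1)

print(round(nde, 3), round(nie, 3), round(te, 3))
```

In this toy model the direct and indirect effects happen to add up to the total effect because the outcome probabilities contain no A-by-Z interaction. In general only the identity TE = NDE minus the NIE of the reverse transition is guaranteed, which the sketch computes as `nie_reverse`.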

Path-Specific Effect (PSE): Isolates the effect along a given subset $\pi$ of causal paths:

$PSE(Y,A,\pi) = P(Y_{A \leftarrow 1 \text{ on } \pi,\,A \leftarrow 0 \text{ on } \overline{\pi}}=1) - P(Y_{A \leftarrow 0}=1)$

This decomposition permits fine-grained separation of justifiable (business-necessity) effects from illegitimate (proxy/redlining) effects (Makhlouf et al., 2020, Chiappa et al., 2019, Yu et al., 24 Mar 2026).
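The path-specific idea can be illustrated by letting each outgoing edge of $A$ carry its own value of the attribute. The sketch below assumes a hypothetical model with three paths (A -> Y, A -> M -> Y, A -> W -> Y), mediators that do not affect each other, and no confounding. The labels "business necessity" for M and "proxy" for W, and all numbers, are assumptions made for illustration.

```python
from itertools import product

# Hypothetical binary SCM with three causal paths from A to Y:
#   A -> Y (direct), A -> M -> Y (e.g. a qualification), A -> W -> Y (e.g. a proxy).
P_M1_GIVEN_A = {0: 0.3, 1: 0.6}   # P(M=1 | A=a)
P_W1_GIVEN_A = {0: 0.4, 1: 0.6}   # P(W=1 | A=a)


def p_y1_given(a, m, w):
    """P(Y=1 | A=a, M=m, W=w); an illustrative linear probability model."""
    return 0.2 + 0.1 * a + 0.3 * m + 0.2 * w


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def p_y1_edges(a_to_y, a_to_m, a_to_w):
    """P(Y=1) when each outgoing edge of A transmits its own value of A."""
    total = 0.0
    for m, w in product((0, 1), repeat=2):
        total += (p_y1_given(a_to_y, m, w)
                  * bern(P_M1_GIVEN_A[a_to_m], m)
                  * bern(P_W1_GIVEN_A[a_to_w], w))
    return total


baseline = p_y1_edges(0, 0, 0)                    # P(Y_{A<-0} = 1)
pse_proxy = p_y1_edges(0, 0, 1) - baseline        # pi = {A -> W -> Y}
pse_business = p_y1_edges(0, 1, 0) - baseline     # pi = {A -> M -> Y}
pse_direct = p_y1_edges(1, 0, 0) - baseline       # pi = {A -> Y}
total_effect = p_y1_edges(1, 1, 1) - baseline

print(round(pse_proxy, 3), round(pse_business, 3), round(pse_direct, 3))
```

An auditor would then compare `pse_proxy` and `pse_direct` against a tolerance while treating `pse_business` as permissible. That classification of paths is a policy decision outside the code, and the three effects sum to the total effect here only because the toy model is additive.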

Counterfactual/Individual Fairness: A decision is counterfactually fair if, for each individual, the predicted outcome is invariant to a counterfactual change in the sensitive attribute:

$P(\hat{Y}_{A \leftarrow a}(U) = y \mid X=x, A=a) = P(\hat{Y}_{A \leftarrow a'}(U) = y \mid X=x, A=a)$

(Makhlouf et al., 2020, Madras et al., 2018, Ehyaei et al., 2023).
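A counterfactual fairness check can be sketched with the three-step abduction, action, prediction procedure. The example assumes a hypothetical deterministic linear SCM, X = 2A + U, in which the latent U is recoverable exactly from an observation, so the distributional equality above reduces to comparing two numbers. The SCM, both predictors, and all names are illustrative assumptions.

```python
# Hypothetical SCM: X = EFFECT_A_ON_X * A + U, with U independent of A.
EFFECT_A_ON_X = 2.0


def abduct_u(a, x):
    """Abduction: recover the latent background variable U from (a, x)."""
    return x - EFFECT_A_ON_X * a


def counterfactual_x(a_obs, x_obs, a_new):
    """Action + prediction: the value X would take had A been a_new."""
    u = abduct_u(a_obs, x_obs)
    return EFFECT_A_ON_X * a_new + u


def predictor_on_x(a, x):
    """Uses X, a descendant of A, so it inherits A's influence."""
    return 0.5 * x


def predictor_on_u(a, x):
    """Uses only the abducted U, which is not a descendant of A."""
    return 0.5 * abduct_u(a, x)


def counterfactual_gap(predictor, a_obs, x_obs, a_new):
    """Prediction in the counterfactual world minus prediction in the factual one."""
    x_cf = counterfactual_x(a_obs, x_obs, a_new)
    return predictor(a_new, x_cf) - predictor(a_obs, x_obs)


# One individual observed with A=0, X=1.5; counterfactual A=1.
gap_x = counterfactual_gap(predictor_on_x, 0, 1.5, 1)
gap_u = counterfactual_gap(predictor_on_u, 0, 1.5, 1)
print(gap_x, gap_u)
```

A gap of zero for every individual is what counterfactual fairness requires in this deterministic setting. With noisy or non-invertible mechanisms, abduction yields a posterior over U rather than a point value, and the comparison has to be made between distributions.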

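Interventional Fairness: A predictor or dataset is interventionally fair with respect to $(S, A)$, where in this definition only $S$ denotes the sensitive attributes and $A$ a set of admissible attributes (elsewhere in this article $A$ is the sensitive attribute), if for all $(s, s', a, y)$,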

$P[Y=y\mid do(S=s), do(A=a)] = P[Y=y\mid do(S=s'), do(A=a)]$

(Galhotra et al., 2020).

These criteria map directly onto technical and legal mandates for auditing discriminatory effects.

2. Identification and Estimation of Causal Fairness Criteria

Identifiability is a prerequisite for causal fairness evaluation. Under the Markovian (no hidden confounders) and faithfulness conditions, total, direct, indirect, and path-specific effects are typically identifiable, for example by back-door or front-door adjustment, so the causal estimands can be rewritten in terms of observable quantities (Makhlouf et al., 2020, Binkyte et al., 2022, Binkytė-Sadauskienė et al., 2022):

Back-door adjustment:

$P(Y=y\mid do(A=a)) = \sum_{z} P(Y=y\mid A=a, Z=z) P(Z=z)$
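To make the back-door formula concrete, the following sketch evaluates it exactly on a small discrete distribution and contrasts the result with the naive associational gap $P(Y=1 \mid A=1) - P(Y=1 \mid A=0)$. It assumes a single binary confounder $Z$ that satisfies the back-door criterion. The probability tables and helper names are hypothetical.

```python
from itertools import product

# Hypothetical observational distribution over binary (Z, A, Y);
# every number is illustrative, not taken from any cited study.
P_Z1 = 0.5                                    # P(Z=1)
P_A1_GIVEN_Z = {0: 0.2, 1: 0.8}               # P(A=1 | Z=z)
P_Y1_GIVEN_AZ = {(0, 0): 0.1, (1, 0): 0.3,    # P(Y=1 | A=a, Z=z), keyed (a, z)
                 (0, 1): 0.5, (1, 1): 0.7}


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


# Joint P(Z=z, A=a, Y=y): the only object the estimators below read.
joint = {
    (z, a, y): bern(P_Z1, z) * bern(P_A1_GIVEN_Z[z], a) * bern(P_Y1_GIVEN_AZ[(a, z)], y)
    for z, a, y in product((0, 1), repeat=3)
}


def prob(joint, z=None, a=None, y=None):
    """Marginal probability of the event fixed by the non-None arguments."""
    return sum(p for (zz, aa, yy), p in joint.items()
               if z in (None, zz) and a in (None, aa) and y in (None, yy))


def p_y1_do(joint, a):
    """Back-door adjustment: sum_z P(Y=1 | A=a, Z=z) P(Z=z)."""
    total = 0.0
    for z in (0, 1):
        p_y1_given_az = prob(joint, z=z, a=a, y=1) / prob(joint, z=z, a=a)
        total += p_y1_given_az * prob(joint, z=z)
    return total


def p_y1_given(joint, a):
    """Plain conditional P(Y=1 | A=a), with no adjustment."""
    return prob(joint, a=a, y=1) / prob(joint, a=a)


ace = p_y1_do(joint, 1) - p_y1_do(joint, 0)
naive_gap = p_y1_given(joint, 1) - p_y1_given(joint, 0)
print(round(ace, 3), round(naive_gap, 3))
```

In this toy distribution the adjusted effect is 0.20 while the unadjusted gap is 0.44, because $Z$ raises both the chance of $A=1$ and the chance of $Y=1$. The adjustment is only valid if $Z$ really blocks all back-door paths in the assumed graph.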

Estimation methods:
- Inverse Probability Weighting (IPW): Weights samples by group assignment probabilities (Ogura et al., 2020, Makhlouf et al., 2020).
- Doubly Robust (DR) Estimation: Combines modeling the outcome with propensity score-based reweighting for improved robustness (Ogura et al., 2020, Plecko et al., 2022).
- Confounded Settings: Identification may require more advanced methods (ID algorithms, do-calculus) or sensitivity analysis if hidden confounders cannot be ruled out (Binkyte et al., 2022, Fawkes et al., 2024, Nagesh et al., 16 Mar 2026).

Empirical workflow: Fit estimators for nuisance parameters (e.g., propensity scores, regression models for covariates, mediators, outcomes) prior to counterfactual or path-specific estimation. These steps are implemented in doubly robust cross-fitting pipelines for high-dimensional or flexible-function settings (Plecko et al., 2022, Yu et al., 24 Mar 2026, Ogura et al., 2020).
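The estimators listed above can be sketched on a tiny hypothetical dataset. The code fits the two nuisance functions by empirical frequencies, then computes the ACE with IPW and with the augmented IPW (AIPW) form of doubly robust estimation. It omits cross-fitting, which the cited pipelines use, and the data, function names, and the deliberately misspecified models are assumptions for illustration.

```python
def make_rows():
    """Hypothetical audit data as (z, a, y) rows; cell counts are illustrative."""
    cells = [  # (z, a, number of units, units with y = 1)
        (0, 0, 400, 40), (0, 1, 100, 30),
        (1, 0, 100, 50), (1, 1, 400, 280),
    ]
    rows = []
    for z, a, n, k in cells:
        rows += [(z, a, 1)] * k + [(z, a, 0)] * (n - k)
    return rows


def fit_propensity(rows):
    """Nuisance 1: empirical P(A=1 | Z=z)."""
    out = {}
    for z in (0, 1):
        a_vals = [a for zz, a, _ in rows if zz == z]
        out[z] = sum(a_vals) / len(a_vals)
    return out


def fit_outcome(rows):
    """Nuisance 2: empirical E[Y | A=a, Z=z], keyed (a, z)."""
    out = {}
    for a in (0, 1):
        for z in (0, 1):
            ys = [y for zz, aa, y in rows if zz == z and aa == a]
            out[(a, z)] = sum(ys) / len(ys)
    return out


def ipw_ace(rows, e):
    """Inverse probability weighting estimate of the ACE."""
    n = len(rows)
    treated = sum(a * y / e[z] for z, a, y in rows) / n
    control = sum((1 - a) * y / (1.0 - e[z]) for z, a, y in rows) / n
    return treated - control


def aipw_ace(rows, e, mu):
    """Augmented IPW: outcome-model contrast plus weighted residual correction."""
    total = 0.0
    for z, a, y in rows:
        m1, m0 = mu[(1, z)], mu[(0, z)]
        total += (m1 - m0) + a * (y - m1) / e[z] - (1 - a) * (y - m0) / (1.0 - e[z])
    return total / len(rows)


rows = make_rows()
e_hat, mu_hat = fit_propensity(rows), fit_outcome(rows)
bad_mu = {key: 0.4 for key in mu_hat}   # deliberately wrong outcome model
bad_e = {0: 0.5, 1: 0.5}                # deliberately wrong propensity model

ace_ipw = ipw_ace(rows, e_hat)
ace_dr = aipw_ace(rows, e_hat, mu_hat)
ace_dr_bad_mu = aipw_ace(rows, e_hat, bad_mu)
ace_dr_bad_e = aipw_ace(rows, bad_e, mu_hat)
print(round(ace_ipw, 3), round(ace_dr_bad_mu, 3), round(ace_dr_bad_e, 3))
```

The last two calls illustrate the "doubly robust" property: the AIPW estimate stays at the same value when either the outcome model or the propensity model is wrong, as long as the other one is correct. With flexible learners, the nuisance fits would be computed on held-out folds rather than on the same rows.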

3. Algorithmic and Practical Frameworks

Numerous algorithmic pipelines have been formulated for practical causal fairness evaluation:

SeqSel/GrpSel algorithms select maximally informative non-admissible features by conditional independence (CI) tests alone, targeting interventional fairness. Features are retained only if they do not mediate unfair causal flow from the sensitive attribute to the outcome, or if any such flow is blocked by admissible covariate adjustment.

Causal Testing and Auditing

Distributional Closeness Testing (CF-CLOT): Instead of scalar causal effects, tests closeness of full factual and interventional potential-outcome distributions using kernel methods (e.g., normalized MMD) with theoretical consistency guarantees (Fu et al., 18 Feb 2025).
Causal Fair Metric Learning: Causal distance functions, trained via deep metric learning, provide robust metrics for counterfactual fairness and for adversarial robustness, coordinating individual fairness, causality, and adversarial notions (Ehyaei et al., 2023).
Long-term/Sequential Fairness Decomposition: In dynamic policies, causal analysis partitions group-qualification gain into direct, delayed, and spurious components, guiding both short-term and long-term fairness interventions (Lear et al., 12 Jun 2025).
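The distributional comparison behind kernel-based closeness testing can be sketched with a plain (unnormalized) squared maximum mean discrepancy. The code below is not the CF-CLOT procedure itself: it computes only the biased MMD statistic with a Gaussian kernel on one-dimensional samples, and omits the normalization and the calibration step needed to turn the statistic into a test. The sample values and the bandwidth are assumptions.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    return sum(rbf(x, y, bandwidth) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


# Hypothetical samples of factual outcomes and of outcomes under an intervention.
factual = [i / 10 for i in range(20)]
interventional_same = list(factual)                    # no distributional change
interventional_shifted = [x + 1.0 for x in factual]    # outcomes shifted by 1.0

mmd_same = mmd2_biased(factual, interventional_same)
mmd_shifted = mmd2_biased(factual, interventional_shifted)
print(mmd_same, mmd_shifted > mmd_same)
```

A statistic near zero indicates that the two outcome distributions are close under the chosen kernel. Deciding whether a nonzero value is significant requires a reference distribution, for example from permutations, which this sketch leaves out.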

Path-Specific Auditing and Policy Implications

Path-Specific Effect Auditing: Applied to large, structured datasets (e.g., PopResume), PSE audits separate permissible business-necessity effects from impermissible proxy/redlining effects (Yu et al., 24 Mar 2026, Nagesh et al., 16 Mar 2026).
Dataset Reweighting: Causal DAGs inform adversarial data reweighting mechanisms to achieve targeted fairness constraints (total, path-specific, or counterfactual), balancing data utility and fairness objectives (Zhao et al., 2023).

4. Evaluation Metrics and Empirical Case Studies

Metrics for empirical causal fairness evaluation include:

Notion	Formula	Scope
Average Causal Effect	$P(Y=1 \mid do(A=1)) - P(Y=1 \mid do(A=0))$	Population
Natural Direct Effect	$P(Y_{A \leftarrow 1,\,\mathbf{Z} \leftarrow \mathbf{Z}_{A \leftarrow 0}}=1) - P(Y_{A \leftarrow 0}=1)$	Path-specific
Path-Specific Effect	$P(Y_{A \leftarrow 1 \text{ on } \pi,\,A \leftarrow 0 \text{ on } \overline{\pi}}=1) - P(Y_{A \leftarrow 0}=1)$	Arbitrary subset of paths
Counterfactual Fairness	$P(\hat{Y}_{A \leftarrow a}(U) = y \mid X=x, A=a) = P(\hat{Y}_{A \leftarrow a'}(U) = y \mid X=x, A=a)$	Individual

Applications include:

Healthcare: Evaluating causal discovery algorithms using path-specific decomposition, defining clinical utility-fairness ratios to prioritize actionable pathways (Nagesh et al., 16 Mar 2026).
Hiring: Path-specific effect-based audits in LLM/VLM screening reveal discrimination not visible via outcome-level metrics; audit pipelines provide granular accountability for legal compliance (Yu et al., 24 Mar 2026).
Text Models: Causal and statistical debiasing are non-interchangeable; simultaneous application achieves joint mitigation of group and counterfactual bias (Chen et al., 2024).
Dynamic Policies: Decomposition of qualification gains into direct, delayed, and spurious effects guides the selection of optimal policy regularizers (Lear et al., 12 Jun 2025).

5. Challenges, Limitations, and Sensitivity Analysis

Several fundamental challenges affect causal fairness evaluation:

Causal Graph Discovery: SCM structure significantly impacts fairness analysis; choice of discovery algorithm (PC, FCI, GES, LiNGAM) and domain knowledge critically influence all downstream conclusions. Reporting fairness uncertainty across equivalence classes is essential (Binkytė-Sadauskienė et al., 2022, Nagesh et al., 16 Mar 2026).
Assumptions: Faithfulness, no unmeasured confounding, and positivity (support overlap) are required for identification. Violations necessitate bounds or robust sensitivity analysis (Fawkes et al., 2024, Binkyte et al., 2022).
Sensitivity to Measurement and Selection Bias: Parity metrics can become fragile even with mild proxy-label or selection bias. Causal sensitivity analysis computes worst-case bounds for metrics under plausible violation parameterizations, exposing the limits of fairness claims (Fawkes et al., 2024).
Intervenability Constraints: For immutable attributes, fairness evaluation is most conceptually coherent when interventions are defined on perception or observable proxies, not biological categories (Rahmattalabi et al., 2022).
Computational Complexity: High-dimensional CI testing, kernel-based closeness tests, and adversarial reweighting can be computationally intensive and require parameter tuning (Galhotra et al., 2020, Fu et al., 18 Feb 2025, Zhao et al., 2023).

6. Institutional, Legal, and Policy Integration

Causal fairness evaluation offers a direct framework for regulatory, legal, and compliance action:

Legal Doctrines: Causal path-specific decomposition aligns with legal standards distinguishing business necessity (lawful) from proxy/redlining (unlawful) under statutes such as Title VII (Binkyte et al., 2022, Yu et al., 24 Mar 2026, Chiappa et al., 2019).
Standardized Causal Audit Reports: Regulatory reporting should include explicit causal diagrams, effect estimates (with uncertainty), and path-classifications (Binkyte et al., 2022).
Remediation and Recourse: When path-specific effects violate policy, options include targeted classifier adjustment (causal constraints), data interventions (pre-processing/reweighting), or societal action shifting the data-generating mechanisms themselves (Kügelgen et al., 2020).
Limitations: Remaining issues include legal access to model internals; standardization of causal audit workflows; and scalability to high-dimensional, dynamic, or feedback-prone systems (Binkyte et al., 2022).

7. Future Directions and Open Problems

Open research questions in causal fairness evaluation include:

Graph-robust Fairness Estimation: Reliable fairness reporting over graph uncertainty and systematic integration of domain expertise in causal structure learning (Binkytė-Sadauskienė et al., 2022, Nagesh et al., 16 Mar 2026).
Finite-sample Guarantees: Statistical rates or generalization bounds for kernel-based distributional closeness tests and causal metric learning frameworks (Ehyaei et al., 2023, Fu et al., 18 Feb 2025).
Extension to Sequential and Dynamic Decision-Making: General frameworks for long-term, policy-dependent fairness in reinforcement learning and control settings (Lear et al., 12 Jun 2025).
Automated, Scalable Pipelines: Efficient, scalable algorithms for joint feature selection, fairness estimation, and sensitivity analysis suitable for industry-scale deployments (Galhotra et al., 2020, Zhao et al., 2023, Fawkes et al., 2024).
Legal and Interdisciplinary Training: Developing shared technical-legal fluency for model auditors, regulators, and courts as causal audit requirements become statutory (Binkyte et al., 2022).

Causal fairness evaluation, by deploying tools of do-calculus, counterfactual inference, and path-specific decomposition, offers a principled, robust, and actionable paradigm for ensuring nondiscriminatory AI and automated decisions across domains.