Counterfactual Utilities in Decision Theory

Updated 4 July 2026

Counterfactual utilities are functions defined on the full vector of potential outcomes, allowing decision evaluation beyond only realized results.
They distinguish between additive and non-additive loss structures, with identifiability and policy ranking notably affected in multi-treatment settings.
Applications span causal decision theory, welfare economics, random utility models, and fairness, guiding strategic policy learning and risk assessment.

Counterfactual utilities are utility or loss criteria that evaluate a decision by reference to the full vector of potential outcomes under feasible alternatives, rather than only the realized outcome under the chosen action. In causal decision theory this means replacing standard utilities of the form $u(d;Y(d),X)$ or losses of the form $\ell^{Std}(d;Y(d),X)$ with functions defined on $D \times Y^{D} \times X$, so that regret, opportunity cost, overtreatment, and “do no harm” asymmetries can enter directly into the objective (Koch et al., 13 May 2025). Across adjacent literatures, the same phrase also appears in revealed-preference welfare analysis, random utility demand, fairness, strategic explanation design, and axiomatic decision theory, where the common theme is that counterfactual reasoning enlarges the outcome domain on which utilities are defined or inferred (Chambers et al., 2021).

1. Conceptual domain and principal formulations

The modern causal formulation begins from the Neyman–Rubin model with treatments $D \in \mathcal{D}=\{0,\dots,K-1\}$, potential outcomes $Y(d)$, and a decision rule $\pi(X)$. Classical statistical decision theory evaluates a rule through the risk of a standard loss,

$R(\pi;\ell^{Std}) = E\!\big[\ell^{Std}(\pi(X);Y(\pi(X)),X)\big],$

where the loss depends only on the chosen treatment and its realized outcome. Counterfactual utilities replace this with a loss or utility defined on the entire potential-outcome vector,

$\ell(d;y,x), \qquad y=(y_0,\dots,y_{K-1})\in \mathcal{Y}^K,$

and the associated counterfactual risk

$R(D^\ast;\ell)=E\big[\ell(D^\ast;Y(0),\dots,Y(K-1),X)\big].$

This evaluates a decision “in light of” all feasible outcomes, not only the realized one (Koch et al., 13 May 2025).

A canonical example is regret, where loss compares the realized outcome under the chosen action to the best attainable outcome among all actions. The same idea appears in overtreatment penalties, principal-strata utilities, and asymmetric harm criteria. The literature therefore uses counterfactual utilities to encode distinctions that standard outcome-based utility suppresses, such as whether a more invasive treatment was unnecessary, whether an intervention harmed someone who would otherwise have done well, or whether a missed treatment opportunity was less serious than active harm (Koch et al., 13 May 2025).
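The contrast between a standard loss and a regret-type counterfactual loss can be made concrete with a small sketch. The function names, the convention that a higher outcome is better, and the example outcome vector are illustrative assumptions, not taken from the cited paper.

```python
from typing import Sequence


def standard_loss(d: int, y: Sequence[float]) -> float:
    """Standard loss: uses only the realized outcome y[d] (higher outcome is better)."""
    return -y[d]


def regret_loss(d: int, y: Sequence[float]) -> float:
    """Counterfactual regret: shortfall of the realized outcome from the best attainable one."""
    return max(y) - y[d]


# Hypothetical potential-outcome vector for K = 3 treatments.
y = (2.0, 5.0, 3.0)
for d in range(len(y)):
    print(d, standard_loss(d, y), regret_loss(d, y))
```

In this particular form the regret equals the standard loss plus `max(y)`, a term that involves all potential outcomes jointly but does not depend on the decision `d`.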

A related but not identical usage appears in empirical welfare economics. There, the object is not a known causal utility over potential outcomes, but a set of utilities consistent with observed choice data. Counterfactual bundle comparisons, welfare tests, and possible Pareto efficiency are then conducted by working with the entire feasible utility class $\mathcal{U}_i(D_i)$ or with the empirical partial orders implied by it. In that sense, “counterfactual utilities” can refer either to utilities defined on all potential outcomes or to utilities that are not observed directly but are constrained by past behavior and then used to evaluate unobserved bundles or allocations (Chambers et al., 2021).

Strand	Core object	Representative paper
Causal decision theory	Loss $\ell(d;y,x)$ on the full potential-outcome vector	(Koch et al., 13 May 2025)
Axiomatic expected utility	Preferences on distributions over the extended potential-outcome space	(Koch et al., 6 May 2026)
Revealed-preference welfare	Utility sets consistent with data	(Chambers et al., 2021)
Random utility demand	Bounds on counterfactual demand/welfare	(Kitamura et al., 2019)

This breadth suggests that the central unifying feature is not a single mathematical form, but an enlargement of the evaluative domain beyond realized outcomes.

2. Identification, additivity, and treatment choice

The fundamental difficulty is identification. Only one potential outcome is observed for each unit, so counterfactual risk generally depends on the unobserved joint distribution of $(Y(0),\dots,Y(K-1))$. Under i.i.d. sampling, consistency, and strong ignorability for the observed treatment, the decisive result is that risk differences between decision systems are identifiable if and only if the loss is additive in the potential outcomes: $\ell(d;y,x)=\sum_{k=0}^{K-1}\omega_k(d,y_k,x)+\varpi(y,x)$, where each $\omega_k$ depends only on the $k$-th potential outcome and $\varpi$ is independent of the decision $d$. In this case, the unknown interaction term contributes only a decision-independent constant to risk, so risk differences remain identified; exact identification of risk additionally requires the $\varpi$ term to vanish or otherwise be identified (Koch et al., 13 May 2025).
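A small numerical sketch may help show why additivity matters. It compares two joint distributions of binary potential outcomes that share the same marginals, which is all that strong ignorability identifies. The two losses, their weights, and the coding of outcome 1 as a bad event are hypothetical choices made for illustration only.

```python
from itertools import product


def risk(joint, loss, d):
    """Expected loss of always choosing d under a joint pmf over (y0, y1)."""
    return sum(p * loss(d, y0, y1) for (y0, y1), p in joint.items())


def additive_loss(d, y0, y1):
    # Sum of terms that each involve a single potential outcome.
    # Hypothetical weights: a bad realized outcome costs 1; treating a unit
    # that would have been fine untreated (y0 = 0) costs an extra 0.5.
    realized = y1 if d == 1 else y0
    return 1.0 * realized + 0.5 * d * (1 - y0)


def harm_loss(d, y0, y1):
    # Non-additive: penalizes treatment only in the stratum (y0, y1) = (0, 1),
    # i.e. the unit is fine untreated but has the bad outcome when treated.
    return 2.0 * d * (1 - y0) * y1


# Two couplings with identical marginals P(Y(0)=1) = P(Y(1)=1) = 0.5.
independent = {(y0, y1): 0.25 for y0, y1 in product((0, 1), repeat=2)}
comonotone = {(0, 0): 0.5, (1, 1): 0.5}

for name, loss in [("additive", additive_loss), ("harm", harm_loss)]:
    print(name, risk(independent, loss, 1), risk(comonotone, loss, 1))
```

The additive loss gives the same risk under both couplings, while the harm-type loss does not, so the latter cannot be pinned down from the marginals alone.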

The same paper proves a sharp boundary between binary and multi-valued treatment. When $K=2$, every additive counterfactual loss is equivalent, up to a decision-independent constant, to a standard loss depending only on the realized outcome $Y(d)$. Consequently, additive counterfactual utilities do not change the ranking of binary-treatment policies. For $K\geq 3$, if at least one genuinely counterfactual weight $\omega_k(d,y_k,x)$ depends on a potential outcome for $k\neq d$, no standard loss can reproduce the same policy ranking. This is precisely where counterfactual utilities become decision-theoretically distinct rather than interpretive relabelings (Koch et al., 13 May 2025).
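The binary-treatment equivalence can be checked by hand. The following is a sketch under assumed notation: an additive loss is written with hypothetical component functions $\omega_0$ and $\omega_1$, covariates are suppressed, and any decision-independent term is dropped. It is one way to see the result, not necessarily the argument used in the paper.

```latex
% Assumed notation: additive loss for K = 2, covariates suppressed
\ell(d; y_0, y_1) = \omega_0(d, y_0) + \omega_1(d, y_1), \qquad d \in \{0, 1\}.

% Candidate standard loss, depending only on the realized outcome y_d
\ell^{Std}(0; y_0) = \omega_0(0, y_0) - \omega_0(1, y_0), \qquad
\ell^{Std}(1; y_1) = \omega_1(1, y_1) - \omega_1(0, y_1).

% The difference is the same for both decisions
\ell(d; y_0, y_1) - \ell^{Std}(d; y_d) = \omega_0(1, y_0) + \omega_1(0, y_1)
\quad \text{for } d = 0, 1.
```

Because the remainder does not depend on $d$, the two losses assign every pair of policies the same risk difference. The cancellation relies on each $\omega_k$ having only one decision $d \neq k$ when there are two treatments; with three or more treatments, $\omega_k(d, y_k)$ can vary across the decisions $d \neq k$, and the same construction is no longer available in general.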

Several concrete losses illustrate the distinction. A baseline-potential-outcome loss penalizing false negatives and false positives relative to $Y(0)$, and a two-potential-outcome loss built as a sum of terms that each involve a single potential outcome, are additive and therefore identifiable under strong ignorability. By contrast, asymmetric principal-strata losses intended to operationalize the Hippocratic idea that harming someone is worse than failing to save someone are generally non-additive and hence not point identified from standard observational data alone (Koch et al., 13 May 2025).

This identification logic is extended directly in policy learning with asymmetric counterfactual utilities. There the utility of treatment depends on principal strata $(Y(0),Y(1))$, and the expected utility loss of a policy relative to another involves the unidentified principal score $\Pr(Y(0)=y_0,Y(1)=y_1\mid X)$. The resulting expected utility is therefore only partially identified. The proposed response is minimax decision-making: derive sharp Fréchet-type bounds on the unidentified principal-strata terms and choose policies that minimize worst-case expected utility loss or worst-case regret relative to a benchmark policy (Ben-Michael et al., 2022).
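The bounding step can be sketched for binary treatment and a binary outcome. The code below applies the classical Fréchet inequalities to the "harmed" stratum and evaluates a worst-case expected gain from treating. The utility weights `gain` and `harm_cost`, the function names, and the coding of outcome 1 as good are hypothetical; this is a simplified illustration, not the estimator or learning procedure of the cited paper.

```python
from typing import Tuple


def frechet_bounds(p_a: float, p_b: float) -> Tuple[float, float]:
    """Sharp bounds on P(A and B) given only the marginals P(A) and P(B)."""
    return max(0.0, p_a + p_b - 1.0), min(p_a, p_b)


def harm_bounds(p0: float, p1: float) -> Tuple[float, float]:
    """Bounds on the harmed stratum P(Y(0)=1, Y(1)=0), with outcome 1 = good.

    p0 = P(Y(0)=1) and p1 = P(Y(1)=1) are the identified marginals.
    """
    return frechet_bounds(p0, 1.0 - p1)


def worst_case_treat_gain(p0: float, p1: float,
                          gain: float = 1.0, harm_cost: float = 2.0) -> float:
    """Worst-case expected utility of treating relative to not treating.

    Uses P(benefit) - P(harm) = p1 - p0, so the expected gain is linear in the
    unidentified P(harm) and its minimum is attained at an endpoint of the bounds.
    """
    lo, hi = harm_bounds(p0, p1)

    def value(harm: float) -> float:
        benefit = p1 - p0 + harm
        return gain * benefit - harm_cost * harm

    return min(value(lo), value(hi))


# Hypothetical marginals: treatment raises the success rate from 0.3 to 0.5.
print(harm_bounds(0.3, 0.5))
print(worst_case_treat_gain(0.3, 0.5))
```

With a harm cost larger than the gain, the worst case places as much mass as the marginals allow on the harmed stratum, so a minimax rule may decline to treat even when the average effect is positive.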

The same additive structure underlies triage scores. A triage score is defined as a function of the covariates that summarizes the conditional distribution of joint potential outcomes, and the expected utility of each decision $d$ is evaluated from that summary.

Risk scores emerge as a degenerate case that depends only on a baseline potential outcome and ignores the rest of the joint vector. Under additivity, expected counterfactual utility is identifiable and estimable; without additivity, it is not (Imai et al., 19 Jun 2026).

3. Axiomatic foundations and the relation to standard utility

An explicit axiomatic foundation is given by extending the von Neumann–Morgenstern framework from realized outcomes to the full potential-outcome space

$D \times Y^{D} \times X.$

A counterfactual utility is then a function $u: D \times Y^{D} \times X \to \mathbb{R}$, and policies induce distributions in $\Delta(D \times Y^{D} \times X)$, the set of probability distributions on that space. On this extended domain, the standard vNM axioms (completeness, transitivity, independence, and continuity) are imposed directly on preferences over $\Delta(D \times Y^{D} \times X)$. The central representation theorem states that these axioms are equivalent to expected counterfactual utility representation, with uniqueness up to positive affine transformation (Koch et al., 6 May 2026).
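The representation can be illustrated with a toy computation: lotteries over (decision, potential-outcome vector) pairs are ranked by expected counterfactual utility, and the ranking is unchanged by a positive affine transformation of the utility. The utility function and the two lotteries are hypothetical examples with two treatments and no covariates.

```python
def expected_utility(lottery, u):
    """Expected utility of a lottery over (d, y) pairs, y a potential-outcome tuple."""
    return sum(p * u(d, y) for (d, y), p in lottery.items())


def u(d, y):
    # Hypothetical counterfactual utility: realized outcome minus half the regret.
    return y[d] - 0.5 * (max(y) - y[d])


def affine(utility, a, b):
    """Positive affine transformation a * utility + b, with a > 0."""
    return lambda d, y: a * utility(d, y) + b


# Two lotteries on the extended space; each maps (d, (y0, y1)) to a probability.
lottery_p = {(0, (1.0, 3.0)): 0.5, (1, (2.0, 0.0)): 0.5}
lottery_q = {(1, (1.0, 3.0)): 0.5, (0, (2.0, 0.0)): 0.5}

v = affine(u, 3.0, 7.0)
print(expected_utility(lottery_p, u), expected_utility(lottery_q, u))
print(expected_utility(lottery_p, v), expected_utility(lottery_q, v))
```

Both utilities rank `lottery_q` above `lottery_p`, which is the sense in which the representation is unique only up to positive affine transformation.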

This result is designed to answer the claim that counterfactual utilities are inherently incoherent or intransitive. The paper argues that apparent violations arise when utilities defined on the extended potential-outcome space are projected back onto the realized-outcome space $D \times Y \times X$. Two kinds of projection are analyzed. Under a menu-dependent projection, different menus can induce different pairwise comparisons and even cycles on realized lotteries. Under a context-dependent projection with a fixed global set of acts, revealed preference on the finite act set remains complete and transitive, although menu or context dependence may still appear at the realized-outcome level (Koch et al., 6 May 2026).

The framework is then used to reinterpret two canonical examples. First, the Russian roulette example in the statistics literature is treated as an instance of an asymmetric counterfactual utility that heavily weights avoiding active harm. Second, Bell’s regret-theoretic utility, in which the value of the realized outcome is adjusted by a term comparing it with the forgone outcome, is embedded in the counterfactual-utility framework, and the Allais ranking is reproduced without violating vNM on the extended potential-outcome space. What fails on the realized-outcome space is not coherence on the extended domain, but the attempt to interpret projected, menu-sensitive behavior as if it were generated by a single standard utility on realized outcomes (Koch et al., 6 May 2026).

Two additional axioms characterize reductions of counterfactual utility. “Irrelevance of Counterfactual Outcomes” yields standard utilities that depend only on $(d,Y(d),X)$, while “Irrelevance of Counterfactual Correlation” yields additive counterfactual utilities of the form

$u(d;y,x)=\sum_{k=0}^{K-1}u_k(d,y_k,x).$

The latter result is especially important because additive counterfactual utilities are exactly the class for which expectations depend only on marginal potential-outcome distributions and hence satisfy the necessary and sufficient condition for point identification established in the causal identification literature (Koch et al., 6 May 2026).

4. Counterfactual utilities in economics, demand, and choice

In empirical welfare economics, observed choice data replace known utility functions. A dataset $D_i$ is rationalizable if there exists an increasing utility $u$ such that each observed bundle solves a utility-maximization problem on the corresponding budget, and by Afriat’s theorem GARP implies rationalizability by a monotone concave utility. The paper constructs “besting” and “strict besting” relations by convexifying revealed preference, and proves that $x$ strictly bests $y$ if and only if $u(x)>u(y)$ for all concave, increasing rationalizations $u$. At the social level, an allocation is possibly Pareto efficient if and only if it is not empirically dominated by any feasible alternative (Chambers et al., 2021).

This revealed-preference perspective turns counterfactual utility into a set-valued object. Counterfactual bundle comparisons are point-identified only when the empirical partial order is strong enough; otherwise rankings remain set-valued across feasible utilities. The same logic supports Kaldor comparisons and bounds on Debreu’s coefficient of resource utilization: with only choice data, one obtains bounds over all rationalizing utilities rather than a single welfare number (Chambers et al., 2021).

In nonparametric random utility demand, counterfactual budgets are analyzed by representing observed stochastic demand as a mixture over finitely many rational choice types. Adding a new budget yields an augmented rational demand matrix, and feasible counterfactual patch-choice probabilities form a convex polytope. Sharp bounds on counterfactual discrete-choice probabilities, on expectations of bounded functionals of counterfactual demand, and on the c.d.f.s of such functionals are obtained by linear programming. This gives a nonparametric revealed-preference account of counterfactual demand and welfare, including money-metric functionals, without imposing parametric structure on utility heterogeneity (Kitamura et al., 2019).

A further extension studies aggregate combinatorial choice over binary polytopes. There, observed aggregate choice probabilities across environments are rationalized by a representative-agent model with a separable concave utility.

A complete characterization of rationalizable datasets reduces to a polynomial-size linear program, while counterfactual prediction for new combinatorial environments is performed by robust optimization over the set of separable representative-agent models consistent with the data. The resulting worst-case and best-case predictions can be interpreted as bounds on counterfactual utilities or outcomes under new feasible sets (Ruan et al., 29 May 2025).

In models with exogenous consideration, a different boundary appears. The paper shows that classic ARUM, extended ARUM with $-\infty$ utilities for infeasible alternatives, and consideration-set ARUM are observationally equivalent when utility differences are bounded. They therefore share the same counterfactual bounds and welfare formulas for utility-index interventions such as price changes. By contrast, attention interventions are not point identified in this nonparametric framework: if an option may fail to be considered, the welfare effect of forcing attention to it can only be bounded, and it collapses to zero only when the data imply the option is always considered (Allen, 2024).

5. Policy learning, fairness, triage, and strategic response

Policy learning with asymmetric counterfactual utilities makes principal strata the basic evaluative unit. With binary treatment and binary potential outcomes, the decision-maker can assign different gains and losses to useful treatment, harmful treatment, harmless treatment, and useless treatment. When the loss from harmful treatment exceeds the gain from useful treatment, the model operationalizes “do no harm.” Because the expected utility depends on unidentified principal scores, the paper develops minimax loss and minimax regret rules, shows how they can be learned by solving intermediate weighted classification problems, and derives finite-sample excess expected utility loss bounds in terms of the regret of those classifiers (Ben-Michael et al., 2022).

Triage scores generalize this idea into a practical decision instrument. A conventional risk score predicts the probability of an undesirable outcome under a reference decision, typically no intervention. A triage score instead summarizes the conditional distribution of principal strata and evaluates decisions using additive counterfactual utilities that include both realized and counterfactual components. In a pretrial setting, the framework distinguishes safe, backlash, preventable, and hopeless strata, incorporates costs of detention and crime, and shows that triage-based policies can be substantively different from risk-score policies. The paper also gives an identification theorem for expected additive counterfactual utility and proposes AIPW-style estimators and tree-based policy learning (Imai et al., 19 Jun 2026).
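The difference between a risk-score rule and a triage-style rule can be sketched with two hypothetical individuals in a pretrial setting. Decision 1 is detention, outcome 1 is an offense, and the cost parameters, threshold, and probabilities are all invented for illustration. The utility is a hypothetical additive counterfactual utility, so only the two marginal probabilities enter; the paper's estimators and tree-based learner are not reproduced here.

```python
def risk_score_policy(p0: float, threshold: float = 0.5) -> int:
    """Detain (1) when the baseline risk P(Y(0)=1 | x) exceeds a threshold."""
    return int(p0 > threshold)


def expected_additive_utility(d: int, p0: float, p1: float,
                              c_crime: float = 10.0,
                              c_detain: float = 1.0,
                              c_unneeded: float = 2.0) -> float:
    """Expected utility of decision d under a hypothetical additive utility.

    u(d; y0, y1) = -c_crime * y_d - c_detain * d - c_unneeded * d * (1 - y0)
    The last term is counterfactual: it penalizes detaining someone who would
    not have offended if released. Each term involves a single potential
    outcome, so only p0 = P(Y(0)=1 | x) and p1 = P(Y(1)=1 | x) are needed.
    """
    if d == 1:
        return -c_crime * p1 - c_detain - c_unneeded * (1.0 - p0)
    return -c_crime * p0


def triage_policy(p0: float, p1: float) -> int:
    """Detain (1) only when detention has higher expected utility than release."""
    return int(expected_additive_utility(1, p0, p1) > expected_additive_utility(0, p0, p1))


# Hypothetical individuals: (P(offense if released), P(offense if detained)).
people = {"A": (0.60, 0.55), "B": (0.40, 0.05)}
for name, (p0, p1) in people.items():
    print(name, risk_score_policy(p0), triage_policy(p0, p1))
```

In this toy example the risk score detains A, whose baseline risk is high but barely affected by detention, while the triage rule detains B, for whom detention prevents most of the risk.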

Fairness through counterfactual utilities extends the same logic to group fairness. In the Fairness Decision-Making Problem, each algorithm induces an individual welfare random variable and a decision-maker cost. Welfare Demographic Parity requires individual welfare to be balanced across groups, while Counterfactual Utility Equal Opportunity imposes the corresponding requirement conditional on a qualification indicator.

By defining qualification through counterfactual attainability rather than observed outcomes, the framework is designed to block self-fulfilling prophecy loopholes in settings such as recidivism prediction, where the decision itself changes the outcome (Blandin et al., 2021).

Counterfactual explanations introduce yet another use of the term. In a strategic classification setting, a decision-maker chooses both a policy and a set of counterfactual explanations, anticipating that individuals may change their features to improve their prospects. Utility is computed under the induced post-adaptation distribution of features. The explanation-selection problem is NP-hard, but the resulting set function is nonnegative, monotone, and submodular for fixed policy, so a greedy algorithm yields approximation guarantees. Joint optimization over policies and explanations reduces to non-monotone submodular maximization, and partition matroid constraints can be used to enforce diversity of explanations across groups (Tsirtsis et al., 2020).
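The greedy step for a monotone submodular objective can be sketched as follows. This is the classical greedy scheme under a cardinality constraint; the coverage function and the explanation labels `e1` to `e4` are hypothetical stand-ins for the decision-maker's utility, not the objective used in the cited paper.

```python
def greedy_max(ground_set, f, k):
    """Greedy maximization of a set function f subject to |S| <= k."""
    selected = set()
    for _ in range(k):
        best, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for e in sorted(ground_set - selected):
            gain = f(selected | {e}) - f(selected)
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:  # no remaining element adds positive value
            break
        selected.add(best)
    return selected


# Hypothetical data: each explanation "reaches" a set of individuals.
covers = {
    "e1": {1, 2, 3},
    "e2": {3, 4},
    "e3": {4, 5, 6, 7},
    "e4": {1, 7},
}


def coverage(subset):
    """Number of individuals reached by at least one chosen explanation."""
    reached = set()
    for e in subset:
        reached |= covers[e]
    return len(reached)


chosen = greedy_max(set(covers), coverage, k=2)
print(sorted(chosen), coverage(chosen))
```

Coverage functions are a standard example of a nonnegative, monotone, submodular set function, which is why the toy objective is a reasonable placeholder for illustrating the greedy rule.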

6. Counterfactual worlds, uncertainty, and open directions

One unresolved issue is the semantics of the counterfactual itself. “Counterfactuals for the Future” distinguishes interventional treatment choice from forward-looking counterfactual treatment choice. In the forward-looking version, the sample is the actual population of interest and unit-specific exogenous variables are assumed to exhibit stability and structure over time. Welfare is then evaluated under a distribution of future outcomes for the same units conditional on their observed data, rather than under the interventional distribution for a fresh population. The paper shows that mismatches between these semantics can change the welfare ranking of policies and can reverse conclusions about variance and inequality effects (Bynum et al., 2022).

A complementary line emphasizes model uncertainty. In uncertain structural causal models, counterfactual distributions are not identified by observational and interventional data alone. A hierarchical Bayesian approach based on Bayesian Warped Gaussian Processes therefore represents uncertainty over both structural functions and the noise-effect mapping, yielding full posterior counterfactual distributions rather than point predictions. This makes it possible to define expected-utility, variance-penalized, chance-constrained, or model-robust counterfactual utilities for interventions such as algorithmic recourse (Weilbach et al., 2023).

At the most abstract level, counterfactual spaces separate counterfactuality from intervention. A counterfactual probability space is a probability space on a product of world-specific measurable spaces; a counterfactual causal space adds causal kernels and a no-cross-world-causal-effect axiom. Shared information between worlds is encoded in the joint probability measure and causal kernels, with independence and synchronisation as two extremes. This framework permits utilities to be defined directly on tuples of worlds, not only on realized outcomes or intervention-indexed potential outcomes (Park et al., 1 Jan 2026).

In empirical games, uncertainty is amplified by partially identified parameters, multiple equilibria, and randomized strategies. The counterfactual predictive distribution set formalizes this by producing a set of distributions over sets of outcomes, where outcomes may be behavioral or welfare quantities. The paper proves sharpness of the population CPDS and posterior consistency under continuity and posterior consistency of the underlying identified set. This extends counterfactual utility analysis to strategic environments where the counterfactual itself is set-valued even before sampling uncertainty is added (Kline et al., 2024).

Several limitations recur across the literature. Non-additive counterfactual utilities are generally not point identified from standard observational data under strong ignorability, because they depend on the joint distribution of multiple potential outcomes (Koch et al., 13 May 2025). Utility specification is explicitly normative: choosing overtreatment penalties, harm asymmetries, or detention costs is not a statistical question (Imai et al., 19 Jun 2026). In revealed-preference and random utility settings, finite data produce large identified sets, so welfare comparisons remain partial rather than pointwise (Chambers et al., 2021). Support conditions also matter: in limited-consideration models, identification of consideration probabilities and attention-intervention welfare depends sharply on bounded versus unbounded variation in utility indices (Allen, 2024). These constraints suggest that future work will continue to move along two margins already visible in the literature: stronger structural assumptions for sharper identification, and richer decision criteria that preserve the expressive power of genuinely counterfactual utility.