Additive Counterfactual Utilities Overview

Updated 4 July 2026

Additive counterfactual utilities decompose a decision’s overall utility into separate additive components, which makes counterfactual analysis tractable across different models.
They underpin diverse methodologies, from additive random utility models and counterfactual loss in statistical decision theory to submodular strategic explanation frameworks, and in each case additivity is what supports identification and practical inference.
The framework reduces complex counterfactual settings to estimable functions of observable marginals or gradients, which supports robust predictions, welfare bounds, and policy evaluation.

Additive counterfactual utilities are utility specifications in which the evaluation of a decision, intervention, or counterfactual environment decomposes additively across counterfactual components. In recent arXiv work, the phrase appears in several technically distinct literatures: additive random utility models for discrete choice, counterfactual-loss formulations in statistical decision theory, strategic-response models for explanations and recourse, separable representative-agent models for combinatorial choice, and counterfactual feature-attribution methods. Across these settings, additivity is used to make counterfactual choice, welfare, or policy comparison tractable from observables that typically identify only marginals rather than full joint structures (Allen, 2024, Koch et al., 13 May 2025, Koch et al., 6 May 2026, Imai et al., 19 Jun 2026).

1. Domain-specific meanings of additivity

The term does not denote a single formalism. Instead, it names a recurring structural restriction: the utility-relevant contribution of each counterfactual component enters as a sum rather than through unrestricted interactions. In discrete choice, the additive object is the utility index plus an additive shock, and counterfactual analysis concerns changes in observable shifters such as prices. In statistical decision theory, the additive object is a counterfactual loss or utility over the full vector of potential outcomes. In strategic explanation models, additivity refers to aggregation across individuals’ post-adaptation utilities. In separable combinatorial choice, it refers to item-wise utility terms under polyhedral feasibility constraints. In CF-SHAP, it is an additive decomposition of model output relative to a counterfactual background (Allen, 2024, Koch et al., 13 May 2025, Tsirtsis et al., 2020, Ruan et al., 29 May 2025, Albini et al., 2021).

Setting	Additive object	Principal implication
Discrete choice	$U_i'=v_i(x')+\epsilon_i$	Price/shifter counterfactuals and welfare can coincide across ARUM variants on bounded support
Potential outcomes	$\ell^{Add}(d;y,x)=\sum_{k\in A}\omega_k(d,y_k,x)+\varpi(y,x)$	Counterfactual risk differences are identifiable iff the loss is additive
Strategic explanations	$U(\pi,E)=\sum_{i=1}^n u_i(\pi,E)$	Explanation design becomes submodular optimization
Combinatorial choice	$u(x)=\sum_{j=1}^n u_j(x_j)$	Counterfactual prediction can be bounded by LP/MILP/MICQP formulations
Counterfactual attribution	$f(x)=\phi_0^{CF}(x)+\sum_i \phi_i^{CF}(f,x)$	Feature attributions become local “gap-to-threshold” decompositions

A common misconception is that all of these uses refer to the same identification problem. They do not. What is shared is a technical discipline: additivity suppresses decision-dependent interactions that would otherwise require unobservable joint counterfactual structure.

2. Additive counterfactual utilities in random utility models

In the additive random utility framework, counterfactual utilities are the utilities that obtain after a change in observable shifters $x \to x'$:

$U_i' = v_i(x') + \epsilon_i,$

with the qualification that under the extended model $U_i'=-\infty$ for infeasible alternatives. The relevant comparison in "Exogenous Consideration and Extended Random Utility" is among classic ARUM, extended ARUM with infeasible alternatives (ARUM-E), and consideration-set ARUM (ARUM-CS) (Allen, 2024).

The central observational result is that ARUM-E and ARUM-CS are observationally equivalent on any domain of utility indices $U$, and that ARUM, ARUM-E, and ARUM-CS are observationally equivalent when $U$ has bounded utility differences. The constructive mappings are explicit: infeasible alternatives in ARUM-E can be encoded as unconsidered alternatives in ARUM-CS, and limited consideration in ARUM-CS can be mimicked in ARUM by assigning sufficiently low finite shocks to excluded alternatives when utility differences are bounded. This equivalence is not merely observational. The paper states that counterfactual choice bounds and welfare formulas for changes in utility shifters such as price are identical across the three models on bounded support (Allen, 2024).

The welfare object is average indirect utility. Under the paper’s normalization,

$W(v)=\mathbb{E}\left[\max_i\,(v_i+\epsilon_i)\right].$

On the identified domain, the envelope theorem yields

$\nabla W(v)=p(v),$

where $p(v)$ is the vector of choice probabilities, and welfare changes satisfy the path-integral formula

$W(v^1)-W(v^0)=\int_0^1 p(v(t))\cdot(v^1-v^0)\,dt$

for the straight-line path $v(t)=(1-t)v^0+tv^1$. This is the formal basis for the claim that price/shifter counterfactuals are robust to exogenous limited consideration on bounded support (Allen, 2024).
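The path-integral welfare formula can be checked numerically in a special case. The sketch below assumes i.i.d. Gumbel shocks (the multinomial logit model), which is an assumption made here for illustration and not a restriction stated in the source. Under that assumption average indirect utility is the log-sum-exp of the utility indices up to an additive constant, and its gradient is the vector of logit choice probabilities. The function names and the utility indices are hypothetical.

```python
import math

def surplus(v):
    """Average indirect utility under i.i.d. Gumbel shocks (assumed),
    up to an additive constant: the log-sum-exp of the utility indices."""
    m = max(v)
    return m + math.log(sum(math.exp(vi - m) for vi in v))

def choice_probs(v):
    """Logit choice probabilities, which equal the gradient of surplus(v)."""
    m = max(v)
    weights = [math.exp(vi - m) for vi in v]
    total = sum(weights)
    return [w / total for w in weights]

def path_integral(v0, v1, steps=2000):
    """Midpoint-rule integral of p(v(t)) . (v1 - v0) along v(t) = (1 - t) v0 + t v1."""
    dv = [b - a for a, b in zip(v0, v1)]
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) / steps
        vt = [a + t * d for a, d in zip(v0, dv)]
        total += sum(p * d for p, d in zip(choice_probs(vt), dv)) / steps
    return total

# Hypothetical indices: a price increase lowers the index of the second alternative.
v_before = [1.0, 0.5, 0.0]
v_after = [1.0, -0.5, 0.0]

direct = surplus(v_after) - surplus(v_before)
integral = path_integral(v_before, v_after)
print(direct, integral)
```

The two printed numbers agree up to quadrature error, which illustrates why choice probabilities along a path of shifters are enough to recover the welfare change in this special case.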

The same paper sharply separates price interventions from attention interventions. Under full consideration, attention interventions cannot change welfare. Under limited consideration, the welfare effect of an attention intervention on an alternative $k$ has an identified set that is unbounded above. The intuition given is that when $k$ is not considered, its latent shock can be arbitrarily high, so adding $k$ to the consideration set can generate arbitrarily large gains in the maximized utility. Identification of marginal consideration probabilities shows a related discontinuity: on bounded support only bounds are available, whereas at truly unbounded support the identified set collapses to a point. The paper emphasizes that identification “towards” infinity does not resemble identification “at” infinity (Allen, 2024).

3. Potential outcomes, identifiability, and axiomatic foundations

In statistical decision theory, additive counterfactual utilities are the negatives of additive counterfactual losses defined on the entire vector of potential outcomes. For treatments $k\in A$, a decision $d$, covariates $x$, and potential outcomes $y=(y_k)_{k\in A}$, the additive loss class is

$\ell^{Add}(d;y,x)=\sum_{k\in A}\omega_k(d,y_k,x)+\varpi(y,x),$

and the corresponding additive counterfactual utility is

$u^{Add}(d;y,x)=-\ell^{Add}(d;y,x).$

Under IID sampling, consistency, and strong ignorability, "Statistical Decision Theory with Counterfactual Loss" proves that the difference in counterfactual risk between any pair of decision-making systems is identifiable if and only if the loss is additive in this sense, and it gives a separate necessary and sufficient condition for exact identification of the risk itself (Koch et al., 13 May 2025).

The identified component depends only on observable marginals: each $\omega_k$ involves a single potential outcome $y_k$, whose conditional distribution given $x$ is identified under strong ignorability, while the decision-independent term $\varpi(y,x)$ cancels from any risk difference.

This is why additivity is necessary. Once decision-dependent interactions across multiple potential outcomes appear, the risk depends on the joint law of unobserved potential outcomes and becomes unidentifiable from standard observed data. The same paper also distinguishes the binary from the multi-treatment case. With two treatments, any additive counterfactual loss has an equivalent standard loss that yields the same treatment recommendations up to a term that does not depend on the decision. With more than two treatments, additive counterfactual losses can produce treatment recommendations that standard observed-outcome losses cannot replicate (Koch et al., 13 May 2025).
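The role of additivity in identification can be illustrated with two joint laws of binary potential outcomes that share the same marginals. The sketch below is a toy example with hypothetical numbers and loss functions, not the paper's construction: an additive loss yields the same risk difference under both joint laws, whereas a loss with a decision-dependent interaction between the two potential outcomes does not.

```python
def risk(loss, joint, d):
    """Expected loss of decision d under a joint law of (y0, y1)."""
    return sum(p * loss(d, y0, y1) for (y0, y1), p in joint.items())

# Two joint laws with identical marginals P(y0 = 1) = P(y1 = 1) = 0.5.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
comonotone = {(0, 0): 0.50, (0, 1): 0.00, (1, 0): 0.00, (1, 1): 0.50}

def additive_loss(d, y0, y1):
    # omega_0(d, y0) + omega_1(d, y1): each term sees one potential outcome only.
    w0 = 1.0 if d == 0 else 0.2
    w1 = 0.3 if d == 0 else 1.0
    return w0 * (1 - y0) + w1 * (1 - y1)

def interaction_loss(d, y0, y1):
    # Penalizes treating (d = 1) a unit that does well untreated but badly treated.
    return float(d == 1 and y0 == 1 and y1 == 0)

def risk_difference(loss, joint):
    return risk(loss, joint, 1) - risk(loss, joint, 0)

add_gap = risk_difference(additive_loss, independent) - risk_difference(additive_loss, comonotone)
int_gap = risk_difference(interaction_loss, independent) - risk_difference(interaction_loss, comonotone)
print(add_gap, int_gap)
```

Because observed data identify at most the marginals, the two joint laws are indistinguishable, so only the additive loss delivers a risk difference that the data pin down.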

"An Axiomatic Foundation for Decisions with Counterfactual Utility" places this identification result inside an extended von Neumann–Morgenstern framework. The extended state space is

built on the full vector of potential outcomes rather than on realized outcomes alone. A policy induces a lottery over this extended space, and a counterfactual utility is a real-valued function on it. The paper proves that expected counterfactual utility satisfies the vNM axioms on the extended space and represents preferences through the expectation of the counterfactual utility under the policy-induced lottery.

Additivity is then characterized axiomatically by “Irrelevance of Counterfactual Correlation,” yielding a representation that is additive across potential outcomes.

The same paper states that this additive class is exactly the one for which expected counterfactual utility is point-identified for non-oracle policies for every state of nature (Koch et al., 6 May 2026).

The triage-score framework takes this potential-outcome view into policy evaluation. In "Triage Score: A Counterfactual Risk Assessment Instrument," a triage score is a summary of the conditional distribution over principal strata, and under additive utilities the unrestricted pointwise optimal rule selects, at each covariate value, the decision with the highest conditional expected utility.

Risk scores appear as a special case in which utilities depend only on a baseline potential outcome, so the sufficient statistic collapses to the conditional distribution of that single outcome. The paper states that additive counterfactual utilities are necessary and sufficient for point identification of expected utility under unconfoundedness, and it uses this structure to compare human-alone, human-with-AI, AI-alone, and learned policies in a pretrial RCT (Imai et al., 19 Jun 2026).
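A pointwise rule of this kind can be sketched for a binary decision and binary potential outcomes. Everything in the block is hypothetical (the utility weights `HARM`, `COST`, `WASTE`, the strata probabilities, and the function names) and is meant only to show that, with an additive utility, the conditional expected utility computed from principal strata coincides with the one computed from the two marginals.

```python
# Hypothetical additive counterfactual utility for a decision d in {0, 1}.
# y0 and y1 are potential outcomes under d = 0 and d = 1; outcome 1 is adverse.
HARM, COST, WASTE = 1.0, 0.1, 0.3

def omega0(d, y0):
    # Depends only on the untreated potential outcome.
    return -HARM * y0 if d == 0 else -WASTE * (1 - y0)

def omega1(d, y1):
    # Depends only on the treated potential outcome.
    return 0.0 if d == 0 else -COST - HARM * y1

def expected_utility(d, strata):
    """Expected additive utility given P(Y(0) = y0, Y(1) = y1 | x)."""
    return sum(p * (omega0(d, y0) + omega1(d, y1)) for (y0, y1), p in strata.items())

def marginals(strata):
    m0 = sum(p for (y0, _), p in strata.items() if y0 == 1)
    m1 = sum(p for (_, y1), p in strata.items() if y1 == 1)
    return m0, m1

def optimal_decision(strata):
    """Pointwise rule: pick the decision with the highest conditional expected utility."""
    return max((0, 1), key=lambda d: expected_utility(d, strata))

# Hypothetical principal-strata probabilities at two covariate values.
high_risk = {(0, 0): 0.15, (0, 1): 0.05, (1, 0): 0.65, (1, 1): 0.15}
low_risk = {(0, 0): 0.75, (0, 1): 0.05, (1, 0): 0.15, (1, 1): 0.05}

m0, m1 = marginals(high_risk)
from_marginals = -COST - HARM * m1 - WASTE * (1 - m0)
print(optimal_decision(high_risk), optimal_decision(low_risk))
```

In this toy specification the intervention is chosen at the high-risk covariate value and withheld at the low-risk one, and the term `WASTE * (1 - y0)` is what makes the utility counterfactual rather than a function of the realized outcome alone.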

4. Strategic behavior, explanations, and algorithmic decompositions

A distinct use of additive counterfactual utility appears in strategic-response models for explanations. In "Decisions, Counterfactual Explanations and Strategic Behavior," individuals can move from an initial feature value to a new feature value guided by a published explanation set $E$, incurring a cost. The decision maker’s utility $U(\pi,E)$ depends on the policy $\pi$ and the explanation set $E$, and in the finite-population representation it decomposes additively as

$U(\pi,E)=\sum_{i=1}^n u_i(\pi,E).$

This additivity is the mechanism behind the set-function structure of explanation selection (Tsirtsis et al., 2020).

For a fixed outcome-monotone policy $\pi$, the paper proves that selecting at most $k$ explanations to maximize utility is NP-hard, but the objective $E\mapsto U(\pi,E)$ is nonnegative, monotone, and submodular. Consequently, the standard greedy algorithm attains a $1-1/e$ approximation. When policy and explanations are optimized jointly, the induced objective remains nonnegative and submodular but is generally non-monotone; a randomized algorithm for non-monotone submodular maximization then gives a $1/e$ approximation. The paper further introduces partition matroid constraints to enforce diversity across subpopulations, with continuous-greedy giving $1-1/e$ under general matroids and simple greedy giving $1/2$ under a partition matroid (Tsirtsis et al., 2020).
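The greedy step for a fixed policy can be sketched on a stylized objective. The block below does not reproduce the paper's best-response model: it assumes a facility-location form in which each individual contributes the best benefit among the selected explanations, which is a standard monotone submodular function chosen here for illustration. The `benefit` matrix and all function names are hypothetical.

```python
import math
from itertools import combinations

# benefit[i][e]: hypothetical contribution of individual i to the decision maker
# when explanation e is the best one available to that individual.
benefit = [
    [0.9, 0.1, 0.0, 0.2],
    [0.0, 0.8, 0.1, 0.3],
    [0.1, 0.0, 0.7, 0.3],
    [0.2, 0.7, 0.0, 0.3],
    [0.8, 0.0, 0.1, 0.3],
]
N_EXPLANATIONS = len(benefit[0])

def utility(selected):
    """Additive across individuals: sum of each individual's best selected benefit."""
    return sum(max((row[e] for e in selected), default=0.0) for row in benefit)

def greedy(k):
    """Add, k times, the explanation with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        remaining = [e for e in range(N_EXPLANATIONS) if e not in chosen]
        best = max(remaining, key=lambda e: utility(chosen + [e]) - utility(chosen))
        chosen.append(best)
    return chosen

def brute_force(k):
    """Exhaustive optimum, feasible only for tiny instances."""
    return max(combinations(range(N_EXPLANATIONS), k), key=utility)

picked = greedy(2)
optimum = brute_force(2)
print(picked, utility(picked), utility(optimum))
```

On this instance greedy happens to find the exhaustive optimum; in general only the $1-1/e$ guarantee for monotone submodular objectives under a cardinality constraint applies.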

Counterfactual SHAP uses additivity in yet another sense. "Counterfactual Shapley Additive Explanations" defines a user-specific counterfactual background distribution $\mathcal{D}^{CF}(x)$ and then applies interventional Shapley values with the value function

$v_x(S)=\mathbb{E}_{x'\sim\mathcal{D}^{CF}(x)}\left[f(x_S,x'_{\bar S})\right].$

The resulting decomposition is

$f(x)=\phi_0^{CF}(x)+\sum_i \phi_i^{CF}(f,x),$

where

$\phi_0^{CF}(x)=\mathbb{E}_{x'\sim\mathcal{D}^{CF}(x)}\left[f(x')\right].$

The paper interprets this as an additive counterfactual utility decomposition of the model output relative to a counterfactual baseline, and introduces “derived trends” to indicate directions of change (Albini et al., 2021).
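The additive decomposition can be verified by exact enumeration on a tiny model. The sketch below computes interventional Shapley values against a small set of counterfactual points that stands in for the counterfactual background. The model, the query point, and the counterfactual points are hypothetical, and exact enumeration replaces whatever estimator the paper uses with tree ensembles.

```python
from itertools import combinations
from math import factorial

def model(z):
    # Hypothetical scoring function with one interaction term.
    return 0.5 * z[0] + 0.3 * z[1] - 0.2 * z[2] + 0.4 * z[0] * z[1]

def value(subset, x, background):
    """Interventional value: features in `subset` fixed at x, the rest taken
    from each background point, averaged over the background."""
    total = 0.0
    for b in background:
        z = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += model(z)
    return total / len(background)

def shapley(x, background):
    """Exact Shapley values by enumerating all coalitions (tiny n only)."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for r in range(n):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for coalition in combinations(others, r):
                with_i = value(set(coalition) | {i}, x, background)
                without_i = value(set(coalition), x, background)
                contribution += weight * (with_i - without_i)
        phi.append(contribution)
    return phi

x = [0.2, 0.1, 0.9]
# Hypothetical counterfactual points that receive the desired decision.
counterfactuals = [[0.8, 0.6, 0.9], [0.7, 0.9, 0.4]]

phi0 = value(set(), x, counterfactuals)   # mean model output on the background
phi = shapley(x, counterfactuals)
print(model(x), phi0 + sum(phi))
```

The two printed values coincide, which is the efficiency property behind the decomposition: the attributions sum to the gap between the query point's score and the mean score of its counterfactual background.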

This SHAP-based usage is not a welfare criterion in the econometric sense. It is instead a local decomposition of a predictive score. The paper nonetheless connects it to recourse through the “counterfactual-ability” metric, defined as the negative of the minimal cost needed to reach a decision-flipping point along an action set implied by the attributions. Empirically, CF-SHAP is reported to outperform input-invariant backgrounds in counterfactual-ability and plausibility on HELOC, Lending Club, and Wine Quality using tree ensembles (Albini et al., 2021).

5. Separable utilities in combinatorial choice, revealed preference, and approximate welfare

In combinatorial environments, additivity typically means separability across items rather than across treatments or alternatives. "Going from a Representative Agent to Counterfactuals in Combinatorial Choice" models aggregate marginal inclusion probabilities $x=(x_1,\dots,x_n)$ as optimal choices of a representative agent maximizing

$u(x)=\sum_{j=1}^n u_j(x_j)$

over a binary polytope of feasible choices.

The paper gives an exact characterization of representable cross-environment probabilities in terms of consistent ordering of dual scores, shows that consistency can be checked by a polynomial-size LP, and derives nonparametric counterfactual prediction bounds in new environments through robust optimization and a mixed-integer linear reformulation. When observed data are inconsistent with the separable model, a compact MICQP yields a best-fitting approximation (Ruan et al., 29 May 2025).

Revealed-preference welfare analysis uses additivity more cautiously. "Empirical Welfare Economics" does not directly treat additive separability across goods in its core theorem, but it does characterize when a candidate allocation is Pareto efficient for some rationalizing utilities and shows that, for rationalizable datasets, a candidate allocation is efficient for some monotone increasing, explicitly quasiconcave rationalizations if and only if it is not empirically dominated. Within the synthesis accompanying that paper, additive/separable constraints can be imposed on Afriat multipliers through equalities at identical quantities and monotonicity across quantities, thereby shrinking the feasible set of rationalizations and tightening counterfactual utility and welfare comparisons (Chambers et al., 2021).

A further variant arises in quasilinear welfare analysis. "Counterfactual and Welfare Analysis with an Approximate Model" adopts the additive form

$u(x)-p\cdot x$

and allows small, explicitly bounded deviations from exact optimization. The minimal rationalization wedge $\varepsilon^*$ is defined through Afriat-type inequalities with slack,

$u(x^t)-p^t\cdot x^t\ \ge\ u(x^s)-p^t\cdot x^s-\varepsilon\quad\text{for all observations } s,t,$

and the paper assumes that approximation error at counterfactual choices is of the same magnitude as in the observed data. Under this discipline, counterfactual quantity sets, utility differences between bundles, and welfare differences between price vectors are all bounded by LPs. The counterfactual quantity set is closed and convex, upper bounds are finite only when the counterfactual price lies in the interior of the upper comprehensive convex hull of observed prices, and the welfare bound is convex, weakly decreasing, and lower semicontinuous in price (Allen et al., 2020).
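The idea of a minimal slack can be made concrete on a tiny dataset. The sketch below assumes that quasilinear rationalization with slack means finding utility numbers $u_t$ with $u_t-p^t\cdot x^t\ge u_s-p^t\cdot x^s-\varepsilon$ for all observation pairs; this reading and the function names are assumptions made here, and the paper's exact definition may differ. Under that reading the inequalities are difference constraints, which are feasible exactly when no cycle of observations has negative total weight, so the smallest feasible $\varepsilon$ is minus the smallest cycle mean, floored at zero. The block brute-forces cycles instead of solving the LP.

```python
from itertools import permutations

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def minimal_slack(prices, bundles):
    """Smallest eps >= 0 such that numbers u_t exist with
    u_t - p_t.x_t >= u_s - p_t.x_s - eps for all s, t (assumed definition).
    Each cycle t_1 -> ... -> t_L -> t_1 requires
    sum_i p_{t_i}.(x_{t_{i+1}} - x_{t_i}) + L * eps >= 0.
    Brute force over cycles; intended only for a handful of observations."""
    n_obs = len(prices)
    worst = 0.0
    for length in range(2, n_obs + 1):
        for cycle in permutations(range(n_obs), length):
            total = 0.0
            for i, t in enumerate(cycle):
                s = cycle[(i + 1) % length]
                step = [xs - xt for xs, xt in zip(bundles[s], bundles[t])]
                total += dot(prices[t], step)
            worst = max(worst, -total / length)
    return worst

# One good, two observations (hypothetical data).
downward = minimal_slack([[1.0], [2.0]], [[2.0], [1.0]])  # demand falls as price rises
upward = minimal_slack([[1.0], [2.0]], [[1.0], [2.0]])    # demand rises as price rises
print(downward, upward)
```

Downward-sloping demand is exactly rationalizable, so no slack is needed, while the upward-sloping pair violates the law of demand and requires a strictly positive wedge.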

These literatures share a structural idea: separability or quasilinearity makes counterfactual inference portable across environments because feasibility or budgets do the coupling, while the utility itself is item-wise or money-additive.

6. Limits, controversies, and open distinctions

Additivity is powerful, but its meaning and legitimacy depend on the domain. A central warning comes from "Remarks on Utility in Repeated Bets," which argues that utility cannot be additive across multiple bets in the sense of summing single-stage vNM utilities, except in the special case of linear utility. In simultaneous repeated bets, vNM requires evaluating the joint compound lottery over the total payoff using a utility defined on the total prize range; in sequential bets, the correct object is conditional utility, which depends on previously realized rewards unless the utility function is linear. The paper therefore rejects additive summation across stages or across unrealized branches of a decision tree as a general principle (Megiddo, 2023).
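The gap between summing single-stage utilities and evaluating the compound lottery is easy to see numerically. The sketch below uses a hypothetical even-odds bet and compares a concave utility (square root) with a linear one; the bet and both utility functions are illustrative choices, not taken from the paper.

```python
import math
from itertools import product

# Hypothetical single bet: (probability, prize).
bet = [(0.5, 0.0), (0.5, 100.0)]

def expected_single(u):
    """Expected utility of one play of the bet."""
    return sum(p * u(x) for p, x in bet)

def expected_total(u, rounds=2):
    """Expected utility of the total payoff over independent repeated plays."""
    total = 0.0
    for outcomes in product(bet, repeat=rounds):
        prob = math.prod(p for p, _ in outcomes)
        payoff = sum(x for _, x in outcomes)
        total += prob * u(payoff)
    return total

def linear(x):
    return x / 10.0

summed_sqrt = 2 * expected_single(math.sqrt)
joint_sqrt = expected_total(math.sqrt)
summed_linear = 2 * expected_single(linear)
joint_linear = expected_total(linear)
print(summed_sqrt, joint_sqrt, summed_linear, joint_linear)
```

With the concave utility the summed single-stage value overstates the utility of the compound lottery, while the two coincide under linear utility, which is the exception noted above.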

This objection does not contradict the additive counterfactual utility results in potential-outcome decision theory. It marks a different domain. Additivity across treatments or potential outcomes is a condition for identification under partial observation; additivity across stages in repeated monetary gambles is a claim about dynamic aggregation of realized wealth. Conflating the two is a common source of confusion. This suggests that “additivity” should always be read relative to the primitive space on which utility is defined (Megiddo, 2023, Koch et al., 13 May 2025).

A second source of confusion concerns coherence and transitivity. The axiomatic literature shows that expected counterfactual utility is coherent on the extended potential-outcome space, but that projections to realized-outcome space can behave differently. Under menu-dependent projection, revealed preferences over lotteries can violate WARP and can be intransitive; under context-dependent projection, revealed preferences over the displayed lotteries are complete and transitive but can depend on the ambient menu and need not satisfy vNM independence. The Russian roulette example and the Allais paradox are presented as cases in which apparent inconsistency disappears once preferences are defined on the potential-outcome space rather than solely on realized outcomes (Koch et al., 6 May 2026).

Empirically, the strongest limitations are support and ignorability conditions. In the random-utility setting, equivalence across ARUM, ARUM-E, and ARUM-CS requires bounded utility differences, and endogenous consideration would break the exogeneity condition that consideration probabilities do not depend on utility indices or shifters (Allen, 2024). In potential-outcome settings, strong ignorability, overlap, or decision unconfoundedness are substantive assumptions; without them, additive structure alone does not rescue identification (Koch et al., 13 May 2025, Imai et al., 19 Jun 2026).

The cumulative lesson is not that additivity is universally correct. It is that additivity is the minimal structure that, in several otherwise unrelated literatures, converts counterfactual objects from unobservable joint constructions into estimable or computable functions of observable marginals, gradients, or per-unit contributions. Where that reduction is substantively defensible, additive counterfactual utilities support sharp welfare formulas, exact identification results, submodular optimization, and tractable counterfactual prediction. Where it is not, the resulting counterfactual utilities can be misleading or dynamically incoherent.