
Empirical Welfare Maximization (EWM)

Updated 22 May 2026
  • Empirical Welfare Maximization (EWM) is a statistical policy learning paradigm that selects treatment rules by maximizing an estimate of social welfare over a restricted policy class, with guarantees derived from empirical process theory.
  • It integrates methods from experimental design, machine learning, and econometrics while leveraging doubly robust, IPW, and plug-in estimators for precise welfare estimation.
  • Its performance is supported by uniform deviation bounds and regret guarantees, which extend to constrained and time-series settings; an equivalence with least-squares prediction enables scalable policy fitting.

Empirical Welfare Maximization (EWM) is a statistical policy learning paradigm designed to select among treatment assignment rules with the explicit objective of maximizing population social welfare, usually defined as mean potential outcome under a given policy. EWM formalizes policy learning as a combinatorial optimization problem over a structured class of rules, grounded in principles of empirical process theory and robust causal inference. EWM unifies aspects of experimental design, machine learning classification, and econometric treatment choice; it has become central in the literature on optimal treatment assignment, welfare targeting, and data-driven resource allocation.

1. Formalization of the EWM Principle

Let $\{(X_i, D_i, Y_i)\}_{i=1}^n$ be an observed i.i.d. or sequential sample, where $X_i$ are individual covariates, $D_i \in \{0,1\}$ is the treatment indicator, and $Y_i$ is the realized outcome, corresponding to potential outcomes $(Y_i(0), Y_i(1))$. A policy $\pi:\mathcal{X} \to \{0,1\}$, or equivalently a measurable "decision set" $G \subseteq \mathcal{X}$, prescribes treatment assignment based on $X$. The population social welfare under $\pi$ is

$$W(\pi) = \mathbb{E}\left[Y(1)\,\pi(X) + Y(0)\,(1 - \pi(X))\right]$$

where the expectation is over the joint distribution of $(X, Y(0), Y(1))$. Under unconfoundedness, $W(\pi)$ is identified from the observed data and can be estimated with doubly robust or inverse-probability-weighted (IPW) estimators.

EWM proceeds by constructing an empirical analogue $\hat{W}_n(\pi)$ (via plug-in, IPW, or doubly robust scoring) and selecting

$$\hat{\pi} \in \arg\max_{\pi \in \Pi} \hat{W}_n(\pi)$$

where $\Pi$ is a user-defined policy class, often of finite VC-dimension (Kato, 30 Oct 2025, Cerulli, 2020, Sun, 2021).
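As a minimal sketch of the selection step, the snippet below estimates welfare by IPW in a randomized experiment and searches a small class of threshold rules. The function names, the toy data, and the assumption of a known constant propensity `e` are illustrative choices, not taken from the cited papers.

```python
def ipw_welfare(policy, X, D, Y, e):
    """IPW estimate of W(pi), assuming a known constant propensity e = P(D = 1)."""
    total = 0.0
    for x, d, y in zip(X, D, Y):
        p = policy(x)
        # An observation contributes only when its observed arm matches the policy's arm.
        if p == 1 and d == 1:
            total += y / e
        elif p == 0 and d == 0:
            total += y / (1 - e)
    return total / len(X)


def ewm_threshold(X, D, Y, e, candidates):
    """Return (threshold, welfare) maximizing IPW welfare over rules 1{x >= t}."""
    best = None
    for t in candidates:
        w = ipw_welfare(lambda x, t=t: int(x >= t), X, D, Y, e)
        if best is None or w > best[1]:
            best = (t, w)
    return best


# Toy experiment: treatment helps only when x >= 5 (hypothetical data).
X = [1, 2, 3, 4, 5, 6, 7, 8]
D = [0, 1, 0, 1, 0, 1, 0, 1]
Y = [1.0, 0.0, 1.0, 0.0, 1.0, 3.0, 1.0, 3.0]
print(ewm_threshold(X, D, Y, e=0.5, candidates=[1, 3, 5, 7, 9]))  # (5, 2.0)
```

The candidate list plays the role of the policy class: restricting it is what keeps the search cheap and the rule interpretable.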

2. Estimation and Computational Methodology

Central to EWM is the empirical welfare criterion. For randomized or unconfounded settings, the canonical doubly robust estimator is

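$$\hat{W}_n^{\mathrm{DR}}(\pi) = \frac{1}{n}\sum_{i=1}^n \left[\pi(X_i)\left(\hat{\mu}_1(X_i) + \frac{D_i\,(Y_i - \hat{\mu}_1(X_i))}{\hat{e}(X_i)}\right) + (1-\pi(X_i))\left(\hat{\mu}_0(X_i) + \frac{(1-D_i)\,(Y_i - \hat{\mu}_0(X_i))}{1-\hat{e}(X_i)}\right)\right]$$

where $\hat{e}(x)$ and $\hat{\mu}_d(x)$ are the estimated propensity score and outcome regression, respectively (Kato, 30 Oct 2025). Other variants include IPW or plug-in estimators depending on the sampling design and identification strategy (Sun, 2021, Cerulli, 2020).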
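The doubly robust criterion can be sketched in a few lines, assuming the standard augmented IPW (AIPW) form of the scores. The nuisance estimates `e_hat`, `mu0_hat`, and `mu1_hat` are taken as given inputs here (in practice they would typically be cross-fitted), and all names are hypothetical.

```python
def dr_scores(D, Y, e_hat, mu0_hat, mu1_hat):
    """Per-unit AIPW scores for the untreated (gamma0) and treated (gamma1) arms."""
    gamma0, gamma1 = [], []
    for d, y, e, m0, m1 in zip(D, Y, e_hat, mu0_hat, mu1_hat):
        # Outcome-model prediction plus an inverse-propensity-weighted residual correction.
        gamma1.append(m1 + d / e * (y - m1))
        gamma0.append(m0 + (1 - d) / (1 - e) * (y - m0))
    return gamma0, gamma1


def dr_welfare(assignments, gamma0, gamma1):
    """Average score under the policy's assignments (1 = treat, 0 = do not treat)."""
    n = len(assignments)
    return sum(p * g1 + (1 - p) * g0
               for p, g0, g1 in zip(assignments, gamma0, gamma1)) / n


# Two units with hypothetical nuisance estimates.
g0, g1 = dr_scores(D=[1, 0], Y=[3.0, 1.0], e_hat=[0.5, 0.5],
                   mu0_hat=[1.0, 1.0], mu1_hat=[2.0, 2.0])
print(g0, g1)                       # [1.0, 1.0] [4.0, 2.0]
print(dr_welfare([1, 1], g0, g1))   # 3.0 (treat everyone)
print(dr_welfare([0, 0], g0, g1))   # 1.0 (treat no one)
```

Because the scores do not depend on the policy, they can be computed once and reused across every candidate rule in the class.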

The EWM optimization is combinatorial: it maximizes a sample mean of the form $\frac{1}{n}\sum_{i=1}^n \hat{\tau}_i\,\mathbf{1}\{X_i \in G\}$ over $G \in \mathcal{G}$, where $\hat{\tau}_i$ is an estimated individual-level welfare gain from treatment and $\mathcal{G}$ is the feasible class (e.g., threshold sets, decision trees). This is NP-hard for general classes but tractable for low-complexity or one-dimensional threshold rules via grid search (Cerulli, 2020, Crippa, 2024). Recent work establishes an exact equivalence between EWM and least-squares prediction over pseudo-outcomes in a suitable function class, enabling convex relaxation and scalable, regularized policy fitting (Kato, 30 Oct 2025).
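For one-dimensional threshold rules the search is cheap: only thresholds at observed covariate values matter, so a single sorted pass with a running sum covers the whole class. The sketch below assumes per-unit gain scores `tau_hat` have already been computed (for example, as a difference of doubly robust scores); the function name and data are hypothetical.

```python
def best_threshold(X, tau_hat):
    """Maximize (1/n) * sum_i tau_hat[i] * 1{X[i] >= t} over thresholds t."""
    n = len(X)
    order = sorted(range(n), key=lambda i: X[i], reverse=True)
    best_t, best_gain = float("inf"), 0.0  # t = +inf treats nobody, with zero gain
    running = 0.0
    for rank, i in enumerate(order):
        running += tau_hat[i]
        # Evaluate only at the end of a tie group so tied units are treated together.
        last_of_ties = rank == n - 1 or X[order[rank + 1]] != X[i]
        if last_of_ties and running / n > best_gain:
            best_t, best_gain = X[i], running / n
    return best_t, best_gain


print(best_threshold([1, 2, 3, 4], [-1.0, -2.0, 3.0, 1.0]))  # (3, 1.0)
```

The pass costs one sort plus a linear scan, in contrast with the exhaustive enumeration that richer classes such as trees can require.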

3. Statistical Properties and Regret Guarantees

EWM's statistical validity relies on uniform deviation bounds for empirical processes indexed by the policy class. For $\Pi$ of VC-dimension $v$, the expected regret satisfies

$$\mathbb{E}\left[W(\pi^*) - W(\hat{\pi})\right] = O\left(\sqrt{v/n}\right)$$

where $\pi^*$ is the welfare-maximizing (oracle) policy in $\Pi$ (Kato, 30 Oct 2025, Cerulli, 2020). For threshold rules in regular nonparametric models, regret sharpens to $O(n^{-2/3})$ under additional smoothness and margin conditions (cube-root asymptotics) (Crippa, 2024). In dynamic or time-series settings with mixing or martingale conditions, regret bounds of similar $O(n^{-1/2})$ form hold under appropriate invariance and exogeneity properties (Kitagawa et al., 2022).
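The link between uniform deviation bounds and regret follows from a standard three-term decomposition, sketched here for reference. Writing $\hat{W}_n$ for the empirical welfare criterion, $\hat{\pi}$ for its maximizer over the policy class $\Pi$, and $\pi^*$ for the oracle policy in $\Pi$:

```latex
\begin{align*}
W(\pi^*) - W(\hat{\pi})
  &= \bigl[W(\pi^*) - \hat{W}_n(\pi^*)\bigr]
   + \bigl[\hat{W}_n(\pi^*) - \hat{W}_n(\hat{\pi})\bigr]
   + \bigl[\hat{W}_n(\hat{\pi}) - W(\hat{\pi})\bigr] \\
  &\le 2 \sup_{\pi \in \Pi} \bigl|\hat{W}_n(\pi) - W(\pi)\bigr|
\end{align*}
```

The middle bracket is nonpositive because $\hat{\pi}$ maximizes $\hat{W}_n$ over $\Pi$, and each of the outer brackets is bounded by the supremum. Controlling that supremum over a class of finite VC-dimension is what delivers the rate in the sample size.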

In instrumental variable (IV) models with endogeneity, social welfare is represented as a function of the marginal treatment effect (MTE) kernel: $W(\pi) = \mathbb{E}[Y(0)] + \mathbb{E}\left[\pi(X)\int_0^1 \mathrm{MTE}(X, u)\,du\right]$, where $\mathrm{MTE}(x, u) = \mathbb{E}[Y(1) - Y(0) \mid X = x, U = u]$ and $U$ is the normalized latent resistance to treatment. EWM is applied by maximizing the empirical analogue of this integral, with the regret rate governed by both the complexity $v$ of $\Pi$ and the uniform estimation rate of $\widehat{\mathrm{MTE}}$ (Sasaki et al., 2020, Liu, 2022). When the MTE is estimated at the $\sqrt{n}$-rate (parametric or low-dimensional IV), EWM recovers the $n^{-1/2}$ regret rate; otherwise, the convergence is limited by the slower MTE estimation rate.

4. Extensions: Constraints, Robustness, and Alternative Welfare Criteria

EWM can incorporate explicit constraints, such as budget or fairness restrictions. With a per-unit cost $c(X)$ and budget $k$, the population-constrained problem is

$$\max_{\pi \in \Pi} W(\pi) \quad \text{subject to} \quad C(\pi) \le k$$

where $C(\pi) = \mathbb{E}[c(X)\,\pi(X)]$. Empirical analogues directly replace $W$ and $C$ with sample estimates. However, the naive sample-analogue constrained EWM fails on uniform feasibility and efficiency: no rule can achieve both asymptotically across all data-generating processes (DGPs). Remedies include tightening constraints with critical values (size control) or penalizing constraint violation (trade-off rules) (Sun, 2021, Liu, 2022).
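A sample-analogue version of the budget-constrained problem can be sketched for threshold rules as follows. The `slack` argument is a simplified stand-in for tightening the constraint; it is not the critical-value construction of Sun (2021), and the function name, the nonnegative-cost assumption, and the data are hypothetical.

```python
def constrained_threshold(X, tau_hat, cost, budget, slack=0.0):
    """Maximize mean gain over rules 1{x >= t} subject to mean cost <= budget - slack.

    Assumes nonnegative costs and budget - slack >= 0, so treating nobody is feasible.
    """
    n = len(X)
    best_t, best_gain = float("inf"), 0.0  # fallback: treat nobody
    for t in sorted(set(X)):
        treated = [i for i in range(n) if X[i] >= t]
        mean_cost = sum(cost[i] for i in treated) / n
        mean_gain = sum(tau_hat[i] for i in treated) / n
        if mean_cost <= budget - slack and mean_gain > best_gain:
            best_t, best_gain = t, mean_gain
    return best_t, best_gain


X = [1, 2, 3, 4]
tau_hat = [1.0, 1.0, 1.0, 1.0]
cost = [1.0, 1.0, 1.0, 1.0]
print(constrained_threshold(X, tau_hat, cost, budget=0.5))             # (3, 0.5)
print(constrained_threshold(X, tau_hat, cost, budget=0.5, slack=0.1))  # (4, 0.25)
```

With `slack=0.0` the rule spends exactly the estimated budget, which is where sampling error in the cost estimate can cause violations in the population; a positive slack trades welfare for a safety margin.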

EWM generalizes to alternative social welfare functionals, notably α-Expected Welfare Maximization (α-EWM), which targets the lower-tail mean (CVaR) of the post-treatment outcome distribution over the worst-off α-fraction: $W_\alpha(\pi) = \frac{1}{\alpha}\int_0^\alpha F_\pi^{-1}(t)\,dt$, where $F_\pi$ is the distribution function of $Y(\pi) = Y(1)\,\pi(X) + Y(0)\,(1 - \pi(X))$. Estimation leverages dual representations and cross-fitted, doubly robust scores. Regret analysis reveals an $n^{-1/2}$ rate, with constants inflating as $\alpha \to 0$ (Fan et al., 1 May 2025). This framework covers Rawlsian and distributionally robust welfare optimization.
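The dual representation mentioned above can be illustrated on a plain sample of outcomes. The sketch uses the standard dual form of the lower-tail mean, $\sup_\eta \{\eta - \alpha^{-1}\,\mathbb{E}[(\eta - Y)_+]\}$, with the expectation replaced by a sample average; in α-EWM that average would instead be built from doubly robust scores under a candidate policy, which is omitted here. The function name is hypothetical.

```python
def lower_tail_mean(y, alpha):
    """Empirical dual form: max over eta of eta - mean(max(eta - y_i, 0)) / alpha."""
    n = len(y)

    def objective(eta):
        return eta - sum(max(eta - v, 0.0) for v in y) / (alpha * n)

    # The objective is concave and piecewise linear in eta with kinks at the
    # sample points, so a maximizer can be found among the sample values.
    return max(objective(eta) for eta in y)


outcomes = [1.0, 2.0, 3.0, 4.0]
print(lower_tail_mean(outcomes, alpha=0.5))   # 1.5, the mean of the worst-off half
print(lower_tail_mean(outcomes, alpha=1.0))   # 2.5, the ordinary mean
print(lower_tail_mean(outcomes, alpha=0.25))  # 1.0, the single worst outcome
```

Setting `alpha=1.0` recovers the usual mean-welfare criterion, while small values of `alpha` approach a Rawlsian focus on the worst-off units.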

Time-series EWM (T-EWM) adapts the machinery to sequential or nonstationary data. It defines welfare objectives as conditional expectations over policy-induced paths and maximizes empirical IPS-weighted welfare along observed trajectories. Theoretical guarantees extend to martingale and Markov-type processes (Kitagawa et al., 2022).

5. Policy Classes, Threshold Rules, and Implementation Protocols

The choice of policy class $\Pi$ fundamentally impacts EWM's empirical behavior and feasibility. Common classes include

  • Threshold rules: Scalar or multivariate policies of the form $\pi(x) = \mathbf{1}\{x \ge t\}$ or Cartesian products of indicator thresholds over selected coordinates. Regret rates and asymptotics are well understood for this class (Crippa, 2024, Cerulli, 2020).
  • Linear scores and finite-depth decision trees: Used for interpretability and tractability.
  • Set-indicator policies over VC-classes: General framework covering most practical applications.

Implementation is feasible with grid search (for low-dimensional threshold rules), mixed-integer programming (for more complex policies), or, via the equivalence with least squares, convex optimization for large-scale settings (Kato, 30 Oct 2025). The standard protocol entails (i) estimating individual-level causal effects (e.g., via regression adjustment, doubly robust estimation, or IV), (ii) evaluating empirical policy-specific welfare over a defined grid or function class, and (iii) selecting the maximizer and reporting welfare and treatment group trade-offs (Cerulli, 2020).
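Steps (ii) and (iii) of the protocol can be sketched for a two-covariate threshold class as follows. The sketch assumes step (i) has already produced per-unit effect estimates `tau_hat`; the function name, the grid, and the data are hypothetical.

```python
from itertools import product


def two_threshold_menu(X1, X2, tau_hat, grid1, grid2):
    """Evaluate every rule 1{x1 >= t1 and x2 >= t2} on the Cartesian grid."""
    n = len(tau_hat)
    menu = []
    for t1, t2 in product(grid1, grid2):
        treated = [i for i in range(n) if X1[i] >= t1 and X2[i] >= t2]
        menu.append({
            "t1": t1,
            "t2": t2,
            "share_treated": len(treated) / n,
            "welfare_gain": sum(tau_hat[i] for i in treated) / n,
        })
    return menu


X1 = [0, 0, 1, 1]
X2 = [0, 1, 0, 1]
tau_hat = [-1.0, -1.0, -1.0, 4.0]
menu = two_threshold_menu(X1, X2, tau_hat, grid1=[0, 1], grid2=[0, 1])
best = max(menu, key=lambda row: row["welfare_gain"])
print(best)  # {'t1': 1, 't2': 1, 'share_treated': 0.25, 'welfare_gain': 1.0}
```

Keeping the full menu, rather than only the maximizer, makes it possible to report the trade-off between welfare gain and the share of units treated across candidate rules.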

6. Empirical Applications and Illustrations

EWM has been validated in a range of empirical settings:

  • Threshold-based welfare program eligibility using job training (LaLonde) data (Cerulli, 2020), showing welfare gains over random assignment and enabling policy menus parameterized by interpretable thresholds.
  • Medicaid expansion eligibility under budget constraints, where trade-off rules outperform naive constrained EWM in terms of welfare-efficiency and controlled budget violation (Sun, 2021).
  • Optimal tuition subsidy assignment under endogeneity, using estimated MTE in the Indonesian Family Life Survey; EWM (FEWM/BEWM) rules target subpopulations with high predicted gains within budget (Liu, 2022).
  • Dynamic pandemic response policies, where T-EWM estimated adaptive COVID-19 restriction rules with empirical regret improvements confirmed in both simulation and real-world weekly data (Kitagawa et al., 2022).
  • Distributionally robust targeting (α-EWM), shifting treatment to disadvantaged subpopulations, with formal inference on lower-tail welfare (Fan et al., 1 May 2025).

7. Connections to Plug-in Policy Learning and Regularization

EWM and the plug-in approach (assigning treatment to those with a positive estimated conditional average treatment effect, CATE) are theoretically equivalent under suitable reparameterization (Kato, 30 Oct 2025). Specifically, EWM can be formulated as least squares regression of a pseudo-outcome on the class $\Pi$, yielding an exact correspondence between maximizing empirical welfare and minimizing squared error within the policy class. This equivalence enables the design of convex, regularized training algorithms, circumventing the NP-hardness of discrete optimization without loss of statistical guarantees. Regularization enhances stability, enables large-scale implementation, and accommodates additional convex constraints (budget, fairness) via joint convex optimization.
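The plug-in side of this connection can be illustrated with a deliberately simple sketch: regress a pseudo-outcome (for example, a difference of doubly robust scores) on a covariate by least squares and treat wherever the fitted effect is positive. The linear model, the function names, and the data are assumptions for illustration; this is not the exact construction in Kato (30 Oct 2025).

```python
def fit_line(x, z):
    """Ordinary least squares of z on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return z_bar - slope * x_bar, slope


def plug_in_policy(x, pseudo_outcome):
    """Treat when the fitted effect is positive."""
    intercept, slope = fit_line(x, pseudo_outcome)
    return lambda x_new: int(intercept + slope * x_new > 0)


x = [0.0, 1.0, 2.0, 3.0]
pseudo = [-3.0, -1.0, 1.0, 3.0]   # hypothetical pseudo-outcomes
policy = plug_in_policy(x, pseudo)
print(fit_line(x, pseudo))        # (-3.0, 2.0)
print(policy(1.0), policy(2.0))   # 0 1
```

The fitting step is a smooth convex problem, so penalties such as ridge regularization could be added without changing the final thresholding of the fitted effect at zero.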


References

  • "Welfare Analysis via Marginal Treatment Effects" (Sasaki et al., 2020)
  • "Empirical Welfare Maximization with Constraints" (Sun, 2021)
  • "Policy Learning under Endogeneity Using Instrumental Variables" (Liu, 2022)
  • "Policy Learning with α-Expected Welfare" (Fan et al., 1 May 2025)
  • "Policy Choice in Time Series by Empirical Welfare Maximization" (Kitagawa et al., 2022)
  • "Regret Analysis in Threshold Policy Design" (Crippa, 2024)
  • "Optimal Policy Learning: From Theory to Practice" (Cerulli, 2020)
  • "Bridging the Gap between Empirical Welfare Maximization and Conditional Average Treatment Effect Estimation in Policy Learning" (Kato, 30 Oct 2025)
