- The paper introduces a distributionally robust simulation framework that computes worst-case rare-event probabilities using a 2-Wasserstein ball formulation.
- It leverages duality to convert an intractable infinite-dimensional problem into estimating the probability of an inflated rare-event region via an adaptively determined scalar parameter.
- The proposed DRIS algorithm employs importance sampling with provable vanishing relative error, delivering significant variance reduction and computational efficiency.
Wasserstein Distributionally Robust Rare-Event Simulation: Theory and Algorithms
Rare-event probability estimation is fundamental across fields such as quantitative finance, risk management, and engineering. Classical rare-event simulation algorithms—e.g., importance sampling (IS) and splitting—typically assume full specification of the underlying probability model. However, in practical applications, the data-driven nature of risk assessment often introduces model ambiguity, where the true data-generating process deviates from the nominal model. The paper introduces a formulation for rare-event simulation under distributional uncertainty, specifically quantifying the worst-case rare-event probability over a 2-Wasserstein ball surrounding a nominal distribution and targeting convex rare-event sets. This addresses robustness against model misspecification, which standard simulation approaches inadequately capture.
Let P(Rn) denote the set of all probability measures on Rn. For a nominal distribution P0 and radius δ > 0, the 2-Wasserstein ball Uδ(P0) is defined as the set of distributions P such that W2(P0, P) ≤ δ, where W2 is the standard 2-Wasserstein metric. The quantity of interest is:
p^* = \sup_{P \in U_\delta(P_0)} P(E),
where E is a closed, convex, rare-event region (e.g., portfolios breaching loss thresholds under a Gaussian model). The work is primarily instantiated for P0 standard normal but extends to elliptical families.
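To fix ideas, consider the canonical convex instance of a linear portfolio-loss threshold: a half-space E = {z : w·z ≥ r} under a Gaussian nominal model, for which the distance to E is available in closed form. The sketch below is our own illustration; the function name halfspace_distance and all parameter values are our choices, not the paper's.

```python
import numpy as np

def halfspace_distance(x, w, r):
    """Euclidean distance from each row of x to the convex event set
    E = {z : w @ z >= r}.  For a half-space the projection is explicit:
    d(x, E) = max(r - w @ x, 0) / ||w||."""
    w = np.asarray(w, dtype=float)
    slack = r - x @ w                      # > 0 exactly when x lies outside E
    return np.maximum(slack, 0.0) / np.linalg.norm(w)

# Nominal model P0 = N(0, I); the event is rare for large r.
rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 5))      # samples from P0
w = np.ones(5) / np.sqrt(5.0)              # unit "loss direction"
d = halfspace_distance(X, w, r=4.0)
print("naive estimate of P0(E):", (d == 0.0).mean())   # true value ~3e-5: already hard for plain MC
```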
Direct computation of p∗ is intractable due to infinite-dimensionality. The authors invoke duality results in Wasserstein distributionally robust optimization (DRO), particularly those for inner problems as in Blanchet and Murthy (2019) [Blanchet2019-DRO], to obtain a tractable reformulation. Specifically, duality reveals:
p^* = P_0\big(d(X, E) \le u^*\big),
where u∗ solves the scalar equation h(u∗) = δ², with h(u) := E_{P0}[d(X,E)² ; d(X,E) ≤ u]. Here, d(⋅,E) denotes the Euclidean distance to E.
This converts robust rare-event estimation into computing, under the nominal model, the probability of an inflated event region (a "tube" around E), with the tube width u∗ determined adaptively through h(⋅). As the rarity parameter increases, the authors show that the worst-case probability decays more slowly than the nominal probability, yielding a conservative upper bound on tail risk under model ambiguity.
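A minimal sketch of the resulting two-step recipe, assuming plain Monte Carlo under P0 on a hypothetical one-dimensional event E = [r, ∞) (the paper's DRIS, described in the next section, replaces this naive sampling; solve_u_star and all parameters are our names):

```python
import numpy as np

def solve_u_star(d, delta):
    """Empirical root-finding for h(u*) = delta^2, where
    h(u) = E_P0[ d(X,E)^2 ; d(X,E) <= u ] is estimated from distances
    d_i = d(X_i, E) of samples X_i ~ P0.

    On the empirical measure, h is the running mean of sorted d_i^2, so the
    smallest u with h(u) >= delta^2 is an order statistic of d."""
    d = np.sort(np.asarray(d, dtype=float))
    h = np.cumsum(d ** 2) / d.size         # h evaluated at u = d_(1), d_(2), ...
    k = np.searchsorted(h, delta ** 2)     # first index where h >= delta^2
    if k == d.size:
        raise ValueError("delta exceeds the transport budget of the sample")
    return d[k]

# The same sample then estimates the worst-case probability.
rng = np.random.default_rng(1)
d = np.maximum(4.0 - rng.standard_normal(1_000_000), 0.0)  # 1-d toy: E = [4, inf)
u_star = solve_u_star(d, delta=0.1)
print("u* =", u_star, " worst-case p* ~", (d <= u_star).mean())
```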
Algorithmic Development: Distributionally Robust Importance Sampling (DRIS)
A critical technical challenge is efficient Monte Carlo estimation of both u∗ (via evaluating h(u)) and p∗ (via evaluating P0(d(X,E)≤u∗)), especially as E becomes increasingly rare. The paper proposes a two-stage IS framework, termed Distributionally Robust Importance Sampling (DRIS), with the following features:
- Sampling Design: Rather than naive Monte Carlo, DRIS exploits the geometry of E and its tube: a conditional IS scheme exponentially twists the primary rare-event dimension so that samples concentrate near the boundary of the inflated event.
- Likelihood Ratio Derivation: For convex E and Gaussian P0, the likelihood ratio under this drifted sampling is analytically tractable.
- Single-Sample Set Reuse: The same sample set is used to solve for u∗ (via empirical estimation of h(u) and root-finding) and to estimate p∗, reducing computation.
- Complexity: The method adds little sampling overhead: the root-finding step for u∗ is deterministic and fast, and no simulation is needed beyond evaluating distance functionals on the existing sample (see the sketch after this list).
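Below is a minimal, self-contained sketch of the two-stage design for the one-dimensional half-space E = {s : s ≥ r} under P0 = N(0, 1). The mean shift to the boundary of E is a crude stand-in for the paper's conditional twisting scheme, and every name and parameter here (dris_halfspace, the shift, n) is our assumption:

```python
import numpy as np

def dris_halfspace(r, delta, n=200_000, seed=0):
    """Sketch of the two-stage idea for the 1-d half-space E = {s : s >= r}
    under the nominal P0 = N(0, 1), so d(s, E) = max(r - s, 0).

    Stage 1: sample s from the mean-shifted proposal N(r, 1) -- a simple
    exponential twist toward the boundary of E (the paper's conditional
    scheme adapts the tilt to the inflated event instead) -- and record the
    analytic likelihood ratio L(s) = phi(s) / phi(s - r) = exp(-r*s + r^2/2).
    Stage 2: reuse the one weighted sample to (a) root-find u* from the
    empirical h(u) = E_P0[d^2; d <= u] and (b) estimate p* = P0(d <= u*)."""
    rng = np.random.default_rng(seed)
    s = r + rng.standard_normal(n)                 # proposal N(r, 1)
    lr = np.exp(-r * s + 0.5 * r * r)              # dP0/dQ likelihood ratio
    d = np.maximum(r - s, 0.0)                     # distance to E

    # (a) smallest u with weighted empirical h(u) >= delta^2
    order = np.argsort(d)
    h = np.cumsum(lr[order] * d[order] ** 2) / n   # nondecreasing in u
    k = np.searchsorted(h, delta ** 2)
    if k == n:
        raise ValueError("delta too large for this sample")
    u_star = d[order][k]

    # (b) worst-case probability and its relative error given u*
    hits = lr * (d <= u_star)
    p_hat = hits.mean()
    err = hits.std(ddof=1) / (np.sqrt(n) * p_hat)
    return u_star, p_hat, err

u, p, err = dris_halfspace(r=6.0, delta=0.1)
print(f"u* ~ {u:.3f}, worst-case p* ~ {p:.3e}, relative error ~ {err:.1%}")
```

Shifting the mean by r makes the likelihood ratio exp(−rs + r²/2) explicit, mirroring the analytic tractability noted above; a sharper choice would tilt toward the inflated boundary r − u∗, which is what the paper's adaptive scheme effectively achieves.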
Efficiency Analysis
A major contribution is the establishment of vanishing relative error for the DRIS estimator—an efficiency property rare in rare-event simulation, especially for light-tailed models. Notably, for the family of events Er defined as r-shifted versions of a base set (with P0(Er) decaying as r→∞), it is shown that:
- The DRIS estimator satisfies a central limit theorem with finite limiting variance.
- The relative error (standard deviation divided by the rare-event probability) decays to zero as events become rarer, i.e., as r→∞.
- This vanishing relative error holds at rates faster than O(r−1), outperforming conventional variance reduction approaches.
Mathematically, with σ_r² denoting the estimator variance and p_r the worst-case probability for the event Er, the result
\limsup_{r \to \infty} \; r^2 (r - u_r)^2 \, \frac{\sigma_r^2}{p_r^2} < \infty
is established, demonstrating asymptotic optimality under this robust formulation.
Empirical Validation
Simulation studies corroborate theoretical claims in both stylized and industry-motivated settings:
- In low-dimensional toy convex regions and portfolio loss examples (convex functionals of Gaussian vectors mimicking realistic risk models), DRIS consistently outperforms both naïve Monte Carlo and exponential twisting (ET).
- The variance reduction (VR) and efficiency ratio (ER) of DRIS relative to the baseline methods grow rapidly as the target event becomes rarer. At high rarity levels, DRIS delivers multiple orders of magnitude improvement in estimation error for a fixed computational budget.
- Notably, DRIS maintains low relative error where MC and ET become computationally prohibitive.
Theoretical and Practical Implications
The DRIS methodology rigorously bridges distributionally robust optimization and simulation, enabling computation of worst-case tail probabilities under explicit Wasserstein ambiguity. This is highly relevant for regulatory risk metrics (e.g., Value-at-Risk under model risk), robust performance guarantees, and safety certification under model uncertainty. The duality-based reduction renders such robust probabilities computationally tractable even in high dimensions.
On the theoretical side, achieving vanishing relative error for a DRO problem with light-tailed reference models is striking, given known efficiency barriers in this regime. The connection between robust rare-event sets and Minkowski sums provides a geometric handle on model risk.
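Concretely, the inflated region from the dual reformulation is exactly such a Minkowski sum: for closed E,
\{x : d(x, E) \le u\} \;=\; E \oplus u\,\mathbb{B} \;=\; \{\,e + b : e \in E,\ \|b\|_2 \le u\,\},
with 𝔹 the Euclidean unit ball, so the adversary's effect can be read as thickening E by the margin u∗ purchased with the Wasserstein budget δ.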
Future Directions
Several open directions are identified:
- Extension beyond Convexity and Gaussianity: Although the method naturally extends to other elliptical distributions via affine transformations, analyzing performance for non-convex sets and non-elliptical distributions (e.g., heavy-tailed or multimodal) remains an open problem.
- Alternative Ambiguity Sets: The focus is on 2-Wasserstein balls for analytical and computational tractability; generalization to p-Wasserstein balls (for p ≠ 2) or divergence-based ambiguity sets could further expand applicability, leveraging more general duality results.
- Algorithmic Scaling: Further speedup and parallelization for high-dimensional settings, possibly by exploiting proximal geometry of generic rare-event sets.
Conclusion
The paper provides a theoretically principled and numerically validated simulation methodology for distributionally robust rare-event probability estimation under Wasserstein ambiguity sets, establishing strong efficiency properties via vanishing relative error and highlighting practical dominance over classical methods. This work has substantial implications for robust risk management and model uncertainty quantification in simulation-based studies, with direct application to finance, engineering, and safety-critical AI evaluation. The duality-based approach and DRIS algorithm lay a foundation for broader classes of distributionally robust, high-confidence rare-event simulation (2601.01642).