
Distribution-Based Sensitivity Analysis

Updated 19 December 2025
  • Distribution-Based Sensitivity Analysis (DBSA) is a framework that evaluates the global impact of distributional perturbations on model outputs by integrating algebraic, information-theoretic, and robust optimization methods.
  • It employs closed-form sensitivity functions, optimal covariation schemes, and divergence metrics (e.g., f-divergence, CvM) to quantify and characterize changes beyond mean-based approaches.
  • DBSA finds applications in discrete Bayesian models, black-box simulations, and causal inference, offering actionable insights for robust decision-making under uncertainty.

Distribution-Based Sensitivity Analysis (DBSA) quantitatively characterizes how changes or uncertainties in probability distributions—whether model parameters, structural assumptions, or input distributions—affect probabilistic outputs, counterfactuals, or key performance indices. DBSA provides a rigorous framework for assessing not merely mean-based or variance-based sensitivity, but the global, distributional impact of perturbations, covariations, or adversarial misspecification, and unifies a diverse suite of tools from parametric models, information theory, robust optimization, and statistical estimation.

1. Algebraic and Polynomial Frameworks in Discrete Models

A foundational setting for DBSA is finite discrete models, such as Bayesian networks and their generalizations. Here, the joint law of a random vector $Y = (Y_1, \ldots, Y_n)$ with parameter vector $\theta \in \mathbb{R}^k$ admits a polynomial representation: for each atomic outcome $\omega$, the probability $p_\theta(\omega)$ is a sum of monomials in $\theta$, consolidated in an "interpolating polynomial" $c_P(\theta)$ (Leonelli et al., 2015). When the model is multilinear (all exponents in each monomial are $0$ or $1$), as in ordinary BNs, CSI-trees, or chain event graphs, DBSA exhibits several key properties:

  • Closed-form Sensitivity Functions: The probability of any event of interest under a perturbed parameter vector can be written as a multilinear polynomial in the perturbed CPT entries,

$$f_{Y\in\mathcal{Y}_T}(\tilde\theta_{1,i},\ldots,\tilde\theta_{n,i}) = \sum_{j=1}^n a_j \tilde\theta_{j,i} + b$$

where $a_j$ and $b$ depend only on original parameters and the covariation scheme.

  • Chan–Darwiche Distance Factorization: Distributional change is quantified by the Chan–Darwiche (CD) distance,

$$D_{CD}(p_\theta, p_{\tilde\theta}) = \log \frac{u}{\ell},$$

where $u$ and $\ell$ are, respectively, the maximum and minimum ratios of perturbed to original atomic probabilities. If no two CPT parameters appear in the same monomial, the distance simplifies to a maximum (resp. minimum) over the varied/covaried CPT entries.

  • Optimal Covariation Schemes: Proportional covariation—setting the non-varied entries to

$$\sigma_{pro}(\theta_{j,s}, \tilde\theta_{j,i}) = \frac{1-\tilde\theta_{j,i}}{1-\theta_{j,i}}\, \theta_{j,s}$$

—minimizes $D_{CD}$ among all valid covariations. Theorem 4.6 provides a general proof of this optimality across all multilinear models.

If the defining polynomial is not multilinear (e.g., dynamic BNs), sensitivity functions become higher-degree polynomials, and a closed-form optimal covariation need not exist.
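As a concrete illustration of these two ingredients, the sketch below applies proportional covariation and the CD distance to a hypothetical two-node Bayesian network (the network and its numbers are invented for illustration; only the two formulas above are taken from the text):

```python
import numpy as np

def proportional_covariation(theta, i, new_value):
    """Vary entry i of a CPT row to new_value and rescale the remaining
    entries by (1 - new_value)/(1 - theta[i]), as in sigma_pro above."""
    theta = np.asarray(theta, dtype=float)
    out = theta * (1.0 - new_value) / (1.0 - theta[i])
    out[i] = new_value
    return out

def cd_distance(p, q):
    """Chan-Darwiche distance: log of (max ratio)/(min ratio) of perturbed
    to original atomic probabilities."""
    r = np.asarray(q, dtype=float) / np.asarray(p, dtype=float)
    return np.log(r.max() / r.min())

# Hypothetical two-node BN A -> B; the atoms are the four (a, b) pairs.
pA = np.array([0.3, 0.7])
pB_a0 = np.array([0.2, 0.8])          # P(B | A = a0)
pB_a1 = np.array([0.6, 0.4])          # P(B | A = a1)

def joint(pB0):
    return np.concatenate([pA[0] * pB0, pA[1] * pB_a1])

# Vary P(B = b0 | A = a0) from 0.2 to 0.4, covarying the row proportionally.
pB_a0_new = proportional_covariation(pB_a0, 0, 0.4)     # -> [0.4, 0.6]
d = cd_distance(joint(pB_a0), joint(pB_a0_new))         # log(2 / 0.75)
```

Only the atoms involving $a_0$ change, so the CD distance is driven entirely by the varied and covaried entries of that single CPT row, as the factorization result above predicts.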

2. Distributional Distance and Divergence-Based Indices

DBSA extends to continuous and high-dimensional settings by substituting variance-based indices with metrics quantifying changes between entire distributions.

  • $f$-Divergence Indices: For a convex generator $f$, the $f$-sensitivity index is

$$H_{u,f} = \mathbb{E}_{X_u}\left[ D_f\left(P_Y \,\|\, P_{Y|X_u}\right) \right]$$

(Rahman, 2015). Classical Sobol' indices arise as special cases, e.g., for $f(t) = (t-1)^2$.

  • Cramér–von Mises Index (CvM): The CvM index measures the integrated squared difference between conditional and unconditional CDFs:

$$S_{2,CvM}^{(i)} = \frac{\int \mathbb{E}\left[\left(F^{(i)}(t) - F(t)\right)^2\right] dF(t)}{\int F(t)\left(1-F(t)\right) dF(t)}$$

(Gamboa et al., 2015). The CvM index is sensitive to global distributional changes and generalizes Sobol' by accounting for the entire output law, not just moments.

  • Discrepancy-based Indices: Discrepancy functions quantify non-uniformity in the empirical joint distribution of an input and output (e.g., star-discrepancy, symmetric or wrap-around discrepancy), often used as computationally efficient proxies for variance or information-based indices (2206.13470).

These measures are invariant to monotonic transformations, satisfy null independence postulates, and can be estimated via kernel density, polynomial surrogate, or finite-sample schemes.
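As a minimal illustration of the CvM index, the sketch below uses a crude plug-in estimator in which conditioning on $X_i$ is approximated by quantile-binning; the binning scheme and the toy model are illustrative choices, not estimators from the cited papers:

```python
import numpy as np

def cvm_index(x, y, n_bins=20):
    """Plug-in estimate of the CvM index for one input: conditioning on
    X_i is approximated by splitting x into quantile bins."""
    n = len(y)
    t = np.sort(y)                               # evaluation points ~ dF(t)
    F = np.arange(1, n + 1) / n                  # unconditional CDF at t
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    num = np.zeros(n)
    for k in range(n_bins):
        yk = np.sort(y[idx == k])
        if len(yk) == 0:
            continue
        Fk = np.searchsorted(yk, t, side="right") / len(yk)  # conditional CDF
        num += (len(yk) / n) * (Fk - F) ** 2     # E_x[(F_cond - F)^2] at t
    return num.mean() / (F * (1 - F)).mean()

rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(20000, 3))
Y = np.sin(X[:, 0]) + 0.1 * X[:, 1] ** 2         # X3 is inert
s1 = cvm_index(X[:, 0], Y)                       # large: X1 drives Y
s3 = cvm_index(X[:, 2], Y)                       # near zero
```

Because $X_3$ does not enter the model, its conditional CDFs coincide with the marginal up to sampling noise, and the estimated index is close to zero, while the dominant input $X_1$ scores high.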

3. DBSA in Stochastic Simulation, Composite Models, and Black-Box Systems

In settings where the underlying model or simulation is only sampled via a black-box (including LLMs, physical simulators, or MCMC-based estimation), DBSA quantifies how distributional outputs change as inputs (tokens, parameters, or stochastic seeds) are perturbed or replaced.

  • Black-box LLM Sensitivity: Perturb each token in a prompt to its nearest neighbors; for each, estimate the change in model output distribution via an energy distance between the embedding clouds of baseline and perturbed model outputs. The resulting per-token sensitivity profile reveals which tokens drive model decisions, without requiring internal gradient access (Rauba et al., 12 Dec 2025).
  • Differentiable Black-Box Sampling: Analytical formulae compute $\partial x / \partial \theta$ for a sample $x$ drawn from $p(x;\theta)$ by exploiting the local invertibility between $x$ and the vector of conditional CDFs $u_i = F_i(x_i \mid x_{-i}; \theta)$. Both full-matrix and diagonal Newton schemes enable black-box, sample-wise differentiation, facilitating automated gradient computation for complex sample-based inference (Chuang et al., 12 Aug 2025).
  • Bayesian Estimation: When only joint $(x, y)$ samples are available, Bayesian nonparametric or partition-based methods (e.g., Dirichlet process mixtures, Bayesian bootstrap) yield posterior distributions over probabilistic sensitivity indices (variance-based, density-based, CDF-based, etc.), enabling credible interval estimation even at moderate sample sizes (Antoniano-Villalobos et al., 2019).
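The energy-distance recipe for black-box sensitivity can be sketched as follows; `model_outputs` is a hypothetical stand-in for an actual LLM's output-embedding sampler, so only the per-token perturb-and-compare loop reflects the method described above:

```python
import numpy as np

def energy_distance(a, b):
    """Sample energy distance between point clouds a, b (rows = points):
    2*E||A-B|| - E||A-A'|| - E||B-B'||."""
    def mean_dist(u, v):
        return np.linalg.norm(u[:, None, :] - v[None, :, :], axis=-1).mean()
    return 2 * mean_dist(a, b) - mean_dist(a, a) - mean_dist(b, b)

rng = np.random.default_rng(0)

# Hypothetical stand-in for a black-box model: maps a token sequence to a
# cloud of output embeddings whose mean depends on one pivotal token.
def model_outputs(tokens, n_samples=200, dim=4):
    shift = 2.0 if "pivotal" in tokens else 0.0
    return rng.normal(shift, 1.0, size=(n_samples, dim))

baseline = ["the", "pivotal", "word"]
outputs = model_outputs(baseline)

# Per-token sensitivity profile: swap each token for a (hypothetical)
# nearest neighbor and measure the shift in the output distribution.
sensitivity = {}
for i, tok in enumerate(baseline):
    perturbed = baseline[:i] + ["neighbor"] + baseline[i + 1:]
    sensitivity[tok] = energy_distance(outputs, model_outputs(perturbed))
```

Tokens whose replacement leaves the output law unchanged score near zero, while the token that controls the output distribution dominates the profile, all without gradient access.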

4. Sensitivity Analysis under Input Distributional Uncertainty

DBSA is tightly connected to distributional robustness and parametric sensitivity of statistical models:

  • Perturbative Analysis: For a model with parameter vector $\theta$, parametrize output metrics $U_k(\theta)$ and seek the direction $\Delta\theta$ maximizing the relative sensitivity

$$\Omega = \sum_k \left( \frac{1}{U_k} \nabla_\theta U_k \cdot \Delta\theta \right)^2,$$

constrained by a budget on $\|\Delta\theta\|$, so that the principal eigenvectors of the moment matrix of score-based sensitivities capture simultaneous directions of maximal distributional perturbation (Yang, 2022). Written in terms of the Fisher information matrix $F(\theta)$, the objective becomes

$$\Omega = \Delta\theta^T F(\theta)\, \Delta\theta,$$

reflecting the Kullback–Leibler geometry of the underlying model.

  • Robust Bounds: In treatment effect or identification analyses, partial identification bounds derived from $L_\infty$-constrained likelihood ratios or $f$-divergence balls quantify global distributional uncertainty. Estimators of the sharp lower/upper bounds and their inferential properties are available for a broad array of causal estimands, including average treatment effects, regression discontinuity, and instrumental variables, often expressible in closed form (Dorn et al., 2023, Jin et al., 2022).
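Under the quadratic objective above, the unit-budget perturbation of maximal sensitivity is the top eigenvector of the (empirical) Fisher information. A sketch for a hypothetical univariate Gaussian model with parameters $(\mu, \sigma)$, whose exact Fisher matrix is $\mathrm{diag}(1/\sigma^2,\, 2/\sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.0, 2.0
x = rng.normal(mu, sigma, size=200000)

# Score vectors d/d(theta) log p(x; theta) for theta = (mu, sigma).
s = np.stack([(x - mu) / sigma**2,
              ((x - mu)**2 - sigma**2) / sigma**3], axis=1)

# Empirical moment matrix of the scores = Fisher information estimate.
F_hat = s.T @ s / len(x)

# Principal eigenvector = direction of maximal distributional perturbation
# per unit parameter budget (here the sigma direction, eigenvalue 2/sigma^2).
eigvals, eigvecs = np.linalg.eigh(F_hat)
delta = eigvecs[:, -1]
```

For this model the scale direction carries twice the Fisher eigenvalue of the location direction, so a fixed parameter budget perturbs the distribution (in KL) most when spent on $\sigma$.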

5. Specialized DBSA: Goal-Oriented and Dependent Input Structures

  • Goal-Oriented Sensitivity Analysis (GOSA): Define sensitivity in terms of a user-specified contrast $\psi(y;\theta)$ whose minimizer $\theta^*$ is the output feature of interest (mean, quantile, likelihood, etc.). The contrast-based index

$$S^k_\psi = \frac{\mathbb{E}\left[\psi(Y; \theta^*) - \psi\left(Y; \theta_k(X_k)\right)\right]}{\Psi(\theta^*)}$$

extends DBSA to arbitrary functional goals, subsuming classical, quantile, probability, and likelihood contrasts (Fort et al., 2013).

  • Dependent Input Structures: For models with dependent or copula-structured inputs, DBSA requires conditional sampling or representation of the output given any subset of inputs. Copula-based and empirical dependency models efficiently facilitate such conditional distributions. First-order and total-effect indices are built analogously, with consistent U-statistic estimators and central limit properties (Lamboni, 2021).
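For the mean contrast $\psi(y;\theta) = (y-\theta)^2$, the contrast-based index reduces to the first-order Sobol' index, which the following sketch checks on a toy linear model (the quantile-binned conditional minimization is an illustrative estimator, not one prescribed by the cited work):

```python
import numpy as np

def contrast_index(xk, y, psi_min, n_bins=50):
    """Goal-oriented index: compare the globally minimal expected contrast
    with its average over conditional minimizers (X_k quantile-binned)."""
    global_min = psi_min(y)
    edges = np.quantile(xk, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, xk, side="right") - 1, 0, n_bins - 1)
    cond = sum((idx == k).mean() * psi_min(y[idx == k]) for k in range(n_bins))
    return (global_min - cond) / global_min

rng = np.random.default_rng(2)
X = rng.normal(size=(200000, 2))
Y = 3 * X[:, 0] + X[:, 1]        # analytic first-order Sobol': 0.9 and 0.1

# Mean contrast psi(y; theta) = (y - theta)^2: the minimal expected
# contrast is the variance, so the index reduces to Sobol' first order.
mean_psi = lambda y: y.var()
S1 = contrast_index(X[:, 0], Y, mean_psi)
S2 = contrast_index(X[:, 1], Y, mean_psi)
```

Swapping in a pinball (quantile) contrast for `psi_min` would instead yield a quantile-oriented index, illustrating how the same machinery targets arbitrary output features.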

6. Applications in Engineering, Power Systems, and Causal Inference

  • Power Systems: For power distribution with uncertain injections, probabilistic voltage sensitivity uses DBSA to forecast the full distribution of node voltages. Linearized voltage-sensitivity coefficients propagate input uncertainty through the network, with the resulting voltage magnitude following Rician-type distributions that allow explicit computation of violation probabilities, outperforming deterministic methods in both efficiency and coverage (Abujubbeh et al., 2020).
  • Dynamic Structural Models: In dynamic structural economic models (e.g., demand estimation, labor supply), DBSA quantifies the impact of serial dependence misspecification through entropic optimal transport duals and KL-divergence constraints, yielding sharp and bootstrappable identified parameter bounds (Chen, 25 Oct 2025).
  • Causal and Partial Identification: Under unmeasured confounding, $f$-divergence-based sensitivity models and semiparametric efficient estimators provide inferentially valid, one-sided robust confidence intervals for counterfactual means or treatment effects (Jin et al., 2022, Zhang et al., 2019, Kline et al., 19 Apr 2025).
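A sketch of the Rician voltage computation, assuming `scipy` is available and using entirely hypothetical network numbers: the magnitude of a nominal complex voltage plus propagated Gaussian uncertainty follows a Rice distribution, so the violation probability has a closed form that can be checked against Monte Carlo:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical node: nominal complex voltage (p.u.) and the standard
# deviation of the propagated Gaussian uncertainty per component.
v0 = 1.0 + 0.02j
sigma = 0.02
v_max = 1.05                          # overvoltage threshold (p.u.)

# Monte Carlo through the linearized sensitivity model.
noise = rng.normal(0, sigma, 200000) + 1j * rng.normal(0, sigma, 200000)
p_mc = np.mean(np.abs(v0 + noise) > v_max)

# Rician closed form: |v0 + noise| ~ Rice(b = |v0|/sigma, scale = sigma).
p_rice = stats.rice(np.abs(v0) / sigma, scale=sigma).sf(v_max)
```

The closed form avoids re-sampling the network for every threshold, which is what makes the probabilistic approach cheaper than repeated deterministic load-flow runs.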

7. Practical Considerations, Limitations, and Extensions

  • Optimality and Robustness: Multilinearity and proportional covariation simplify computation and guarantee optimality in discrete graphical models, but non-multilinear models require customized analysis (Leonelli et al., 2015).
  • Computational Strategies: Methods based on discrepancy, kernel density estimation, or surrogate PDD models dominate in high-dimensional, surrogate, or black-box simulation regimes (2206.13470, Rahman, 2015).
  • Interpretability: The choice of sensitivity metric ($f$-divergence, CvM, discrepancy, etc.) should reflect the substantive aim—mean-based, tail, or global law sensitivity. Bayesian approaches enable uncertainty quantification without extra simulation cost (Antoniano-Villalobos et al., 2019).
  • Extension to Deep Learning: DBSA methods compatible with automatic differentiation and gradient-based learning close the loop for differentiable probabilistic inference even in black-box, simulation-based science (Chuang et al., 12 Aug 2025).
  • Limitations: Challenges remain for structural models with highly nonlinear or unidentified parameters, high-dimensional density estimation, and for settings where dependency structures or model features preclude direct analytic solution. Carefully chosen deterministic models or high-fidelity surrogates are necessary in such cases.

DBSA thus forms a unified, algebraically and statistically principled paradigm for quantifying, optimizing, and statistically inferring the distributional sensitivity of diverse models, linking algebraic, information-theoretic, and robust optimization perspectives across fields.
