Global Sensitivity Analysis Overview
- Global sensitivity analysis (GSA) is a framework that attributes uncertainty in a model's output to uncertainties in its inputs, most classically by decomposing output variance with techniques such as Sobol' indices.
- It utilizes surrogate models, dependence measures, and efficient estimators to rank input importance and simplify complex models.
- GSA methods improve model validation, guide experimental design, and enhance interpretability across engineering, biological, and machine learning applications.
Global sensitivity analysis (GSA) is a collection of quantitative methodologies for assessing how uncertainty or variability in a model's outputs can be attributed to uncertainties in its inputs, considered over the entire input domain. GSA stands in contrast to local sensitivity analysis, which typically studies single-point derivatives, by capturing both main effects and interactions among multiple uncertain inputs. The field encompasses a variety of frameworks: classical variance-based indices such as Sobol' measures, dependence-based indices like Hilbert–Schmidt Independence Criterion (HSIC), derivative and entropy-based proxies, surrogate and tensor methods for computational efficiency, and extensions to complex domains such as stochastic simulators, Bayesian networks, rare event probabilities, and mixed-variable design spaces. GSA methods provide insights critical to model validation, dimension reduction, prioritization of experimental effort, and improved interpretability in statistical, engineering, biological, and machine learning systems.
1. Classical Variance-Based Sensitivity Indices
The canonical approach to GSA is variance-based decomposition, most notably embodied in the Sobol' indices framework. For a model $Y = f(X_1, \dots, X_d)$ with independent inputs, the output variance is partitioned as

$$\operatorname{Var}(Y) = \sum_{i} V_i + \sum_{i<j} V_{ij} + \dots + V_{1,\dots,d},$$

where $V_{i_1 \dots i_s}$ is the partial variance attributable to simultaneous variation of the inputs $X_{i_1}, \dots, X_{i_s}$.
The first-order Sobol' index for $X_i$,

$$S_i = \frac{V_i}{\operatorname{Var}(Y)} = \frac{\operatorname{Var}\big(\mathbb{E}[Y \mid X_i]\big)}{\operatorname{Var}(Y)},$$

quantifies the main effect. The total effect index,

$$S_{T_i} = \frac{\mathbb{E}\big[\operatorname{Var}(Y \mid X_{\sim i})\big]}{\operatorname{Var}(Y)} = 1 - \frac{\operatorname{Var}\big(\mathbb{E}[Y \mid X_{\sim i}]\big)}{\operatorname{Var}(Y)},$$

captures both the main effect and all higher-order interaction effects involving $X_i$. These indices are usually estimated via Monte Carlo methods or surrogate models (e.g., polynomial chaos expansions), and provide a ranking of input importance, facilitating model simplification and prioritization of verification or measurement tasks (Becker et al., 2014, Janon et al., 2014, Boas et al., 2015, Palar et al., 2017, Etoré et al., 2018).
Specialized fast estimators (such as Jansen's and the pick-freeze formulas), together with variance decompositions based on the ANOVA (Hoeffding) expansion, further enable practical computation of Sobol' indices for high-dimensional and complex models, especially when combined with sparse grid integration and metamodels (Palar et al., 2017).
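As a concrete illustration (a sketch, not code from the cited papers), the pick-freeze first-order estimator and Jansen's total-effect estimator can be implemented with plain Monte Carlo; the Ishigami benchmark function, uniform input ranges, and sample size below are illustrative choices:

```python
import numpy as np

def sobol_pick_freeze(f, d, n, rng):
    """Estimate first-order (pick-freeze) and total-effect (Jansen) Sobol' indices.

    f : vectorized model, f(X) with X of shape (n, d)
    d : number of independent inputs (here uniform on [-pi, pi])
    n : Monte Carlo sample size per input matrix
    """
    A = rng.uniform(-np.pi, np.pi, size=(n, d))
    B = rng.uniform(-np.pi, np.pi, size=(n, d))
    yA, yB = f(A), f(B)
    var_y = np.var(np.concatenate([yA, yB]))
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]            # replace only column i ("pick-freeze")
        yABi = f(ABi)
        # pick-freeze estimator of Var(E[Y | X_i]) / Var(Y)
        S1[i] = np.mean(yB * (yABi - yA)) / var_y
        # Jansen's estimator of E[Var(Y | X_~i)] / Var(Y)
        ST[i] = 0.5 * np.mean((yA - yABi) ** 2) / var_y
    return S1, ST

# Ishigami function: a standard GSA benchmark with known analytic indices
def ishigami(X, a=7.0, b=0.1):
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2 \
        + b * X[:, 2] ** 4 * np.sin(X[:, 0])

rng = np.random.default_rng(0)
S1, ST = sobol_pick_freeze(ishigami, d=3, n=100_000, rng=rng)
```

For this benchmark the analytic values are known (roughly S1 = 0.31, S2 = 0.44, S3 = 0, and ST3 = 0.24 driven purely by the X1-X3 interaction), so the estimates can be checked directly.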
2. Advanced Frameworks: Surrogate Models, Dependence, and Rank Statistics
Recent GSA research expands beyond variance-based metrics:
- Surrogate models (e.g., polynomial chaos expansions, tensor decompositions, generalized lambda surrogates) enable cost-effective GSA for computationally intensive or high-dimensional models. These representations allow direct extraction of sensitivity indices from metamodel coefficients and tractable estimation when the model output is not analytically available (Palar et al., 2017, Lin et al., 2020, Zhu et al., 2020, Comlek et al., 2023).
- Dependence measures: Measures like the Hilbert–Schmidt Independence Criterion (HSIC) are used to quantify sensitivity via input–output statistical dependence, addressing situations where variance-based indices are less appropriate (e.g., non-Gaussian, non-monotone relationships, interactions, or uncertain input distributions). These methods are computationally efficient and robust to output properties, lending themselves to both primary and second-level GSA (sensitivity to the choice of input noise law) (Meynaoui et al., 2019).
- Rank-based indices: Motivated by the need for nonparametric, efficient estimators, recent frameworks leverage order (rank) statistics and empirical correlation coefficients (notably, Chatterjee's) to define consistent and sample-efficient estimators for a wide range of sensitivity indices (Cramér–von Mises, higher moments, Shapley, and even Sobol' indices), though detailed formulations require further reference to the full texts (Gamboa et al., 2020).
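As a minimal sketch of the rank-based idea, Chatterjee's coefficient can be computed from a single sorted pass over the sample; the no-ties formula and the test relationships below are illustrative assumptions:

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation coefficient (no-ties formula).

    Consistently estimates a dependence measure that is 0 iff Y is
    independent of X and 1 iff Y is a measurable function of X.
    """
    n = len(x)
    order = np.argsort(x, kind="stable")          # sort pairs by x
    ranks = np.argsort(np.argsort(y[order])) + 1  # ranks of y in that order
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n ** 2 - 1)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 5000)
noise = rng.normal(0, 1, 5000)
xi_func = chatterjee_xi(x, x ** 2)   # deterministic but non-monotone link
xi_indep = chatterjee_xi(x, noise)   # independent pair
```

The coefficient approaches 1 for any deterministic relationship, even a non-monotone one, and 0 under independence, which is what makes it usable as a building block for nonparametric sensitivity estimators.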
3. Extensions to Complex Model Domains
GSA's reach extends to models with features beyond standard deterministic mappings:
- Stochastic simulators: Here, the model output is itself a random variable for given inputs. GSA is implemented by (a) adapting classical indices to include both input and intrinsic randomness, (b) considering indices based on output quantity-of-interest summaries (mean, variance, quantile, entropy, etc.), or (c) using generalized lambda distribution (GLD) surrogates to efficiently model response distributions and compute sensitivity rankings without replicative designs (Zhu et al., 2020).
- Differential equations and SDEs: For models governed by partial or stochastic differential equations, GSA typically combines Feynman–Kac representations, polynomial chaos expansions, and stochastic Galerkin projections. Analytical formulas for Sobol' indices are derived from orthonormal basis expansions, and computations are performed through projection in the augmented (parameter–state) tensor space (Janon et al., 2014, Etoré et al., 2018).
- Mixed-variable design spaces: To handle qualitative (categorical) as well as quantitative inputs, frameworks incorporating Latent Variable Gaussian Processes (LVGPs) project discrete variables to continuous latent spaces. Modified Sobol' analysis is then performed using efficiently sampled designs that cover both continuous and discrete dimensions (Comlek et al., 2023).
- Bayesian networks and probabilistic graphical models: Global sensitivity indices are computed for outputs represented as functions of network parameters (e.g., CPT entries). Tensorization—via tensor trains or general tensor network contraction—enables efficient GSA even as network dimensionality increases, capturing uncertainties and dependencies not addressed by traditional one-at-a-time (OAT) methods (Ballester-Ripoll et al., 2021, Ballester-Ripoll et al., 9 Jun 2024).
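To make option (b) for stochastic simulators concrete, the hedged sketch below (the toy simulator and replication count are assumptions, and this replication-based route is exactly what GLD surrogates aim to avoid) estimates the conditional mean by averaging replications, then applies standard pick-freeze Sobol' estimation to that summary:

```python
import numpy as np

def stochastic_sim(X, rng):
    """Toy stochastic simulator: output carries intrinsic randomness."""
    latent = rng.normal(0.0, 1.0, size=len(X))
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.3 * latent

def mean_summary(X, rng, reps=200):
    """Monte Carlo estimate of the quantity of interest E_omega[Y | x]."""
    return np.mean([stochastic_sim(X, rng) for _ in range(reps)], axis=0)

def first_order_pick_freeze(summary, d, n, rng):
    """First-order Sobol' indices of the summary via pick-freeze."""
    A = rng.uniform(-np.pi, np.pi, size=(n, d))
    B = rng.uniform(-np.pi, np.pi, size=(n, d))
    yA, yB = summary(A, rng), summary(B, rng)
    var_y = np.var(np.concatenate([yA, yB]))
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]
        S[i] = np.mean(yB * (summary(ABi, rng) - yA)) / var_y
    return S

rng = np.random.default_rng(2)
S = first_order_pick_freeze(mean_summary, d=2, n=4000, rng=rng)
```

Here the conditional mean is sin(x1) + 0.5 x2, so the variance shares are roughly 0.38 for X1 and 0.62 for X2; the replication cost per design point is what distribution surrogates such as GLDs are designed to eliminate.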
4. New Indices and Interpretability: Shapley, Entropic, and Extremum-Based Sensitivity
- Shapley effects: Originating in cooperative game theory, Shapley-based GSA indices assign each variable a share of the output variance based on its average marginal contribution over all input orderings. They resolve the lack of additive decomposition inherent in Sobol' main and total indices, guarantee interpretability, and are robust to input dependence (Goda, 2020, Mazo, 10 Sep 2024).
- Entropy-based and derivative-based proxies: Entropic sensitivity indices, based on conditional output entropy, complement variance-based measures—important for heavy-tailed or non-Gaussian settings. Recent work derives efficient upper bounds on entropy-based indices via partial derivatives of the model, leading to computationally attractive entropy–DGSM proxies, especially when traditional total-effect DGSM fails to screen influential variables (Yang, 2023).
- Extremum indices and Monte Carlo filtering: In applications focused on tail events or local regions (e.g., reliability, risk, optimization), extremum Sobol' indices target the sensitivity of inputs conditioned on extreme outputs. Monte Carlo filtering and polynomial chaos ridge approximations are used to estimate indices within tail regions, capturing context-specific importance distinct from global indices (Wong et al., 2019).
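The Shapley weighting itself can be illustrated exactly in small dimensions. The sketch below (an illustrative construction, not from the cited papers) enumerates subsets with the classical Shapley weights; for an additive linear model with independent unit-variance inputs, the value function Var(E[Y | X_S]) is simply the sum of squared coefficients over S, and the Shapley effects reduce to the variance shares:

```python
from itertools import combinations
from math import factorial

def shapley_effects(value, d):
    """Exact Shapley effects by subset enumeration (feasible for small d).

    value(S) must return the explained variance Var(E[Y | X_S])
    for a subset S given as a tuple of input indices.
    """
    phi = [0.0] * d
    total = value(tuple(range(d)))
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                phi[i] += w * (value(tuple(sorted(S + (i,)))) - value(S))
    return [p / total for p in phi]

# Illustrative value function: Y = sum_i a_i X_i with independent
# unit-variance inputs, so Var(E[Y | X_S]) = sum_{i in S} a_i^2.
a = [1.0, 2.0, 3.0]
v = lambda S: sum(a[i] ** 2 for i in S)
effects = shapley_effects(v, d=3)  # variance shares 1/14, 4/14, 9/14
```

Because this value function is additive, each input's average marginal contribution equals its own variance share; interactions and input dependence are precisely the cases where the Shapley average over orderings starts to matter.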
5. Practical Workflows, Computational Strategies, and Applications
Researchers have developed robust GSA workflows tailored for practical constraints and model features:
- Metamodel-accelerated GSA: Reduced bases, polynomial chaos expansions (standard, multi-fidelity), and tensor factorizations (e.g., tensor trains) allow scaling analysis to high dimensions or expensive simulations by minimizing the number of required evaluations (Palar et al., 2017, Lin et al., 2020, Ballester-Ripoll et al., 9 Jun 2024).
- Model selection and regularization: GSA has found application in variable selection and model parsimony. For regression and time series, integrating total sensitivity indices with stepwise selection yields more robust variable recovery than traditional t-statistic ordering, particularly when regressors interact (Becker et al., 2014).
- Second-level GSA: When there is epistemic uncertainty about the distributions themselves (not just their parameter values), second-level GSA quantifies the impact of distributional choice on standard GSA rankings. Single-loop weighted estimators for dependence measures (e.g., HSIC) efficiently address this issue (Meynaoui et al., 2019).
- Machine learning explainability: Embedding GSA into predictive models like Random Forests yields new variable importance (VI) rankings that reflect a "generative" notion of importance, improving interpretability over standard permutation or accuracy-based VIs by capturing main and interaction effects (Vannucci et al., 19 Jul 2024).
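As an example of reading sensitivity indices directly off surrogate coefficients, the sketch below fits a Legendre polynomial chaos expansion by least squares for independent U[-1, 1] inputs and groups squared coefficients into first-order and total Sobol' indices; the test function, degree, and sample size are illustrative assumptions:

```python
import numpy as np
from itertools import product as iproduct
from numpy.polynomial.legendre import legval

def pce_sobol(f, d, degree, n, rng):
    """Least-squares Legendre PCE; Sobol' indices read off the coefficients.

    Assumes independent inputs uniform on [-1, 1], for which
    E[P_k(X)^2] = 1 / (2k + 1).
    """
    X = rng.uniform(-1.0, 1.0, size=(n, d))
    y = f(X)
    multis = [m for m in iproduct(range(degree + 1), repeat=d)
              if sum(m) <= degree]
    # design matrix: tensor products of univariate Legendre polynomials
    Phi = np.ones((n, len(multis)))
    for j, m in enumerate(multis):
        for k, deg in enumerate(m):
            c = np.zeros(deg + 1); c[deg] = 1.0
            Phi[:, j] *= legval(X[:, k], c)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    # variance contribution of each multi-index term
    contrib = coef ** 2 * np.array(
        [np.prod([1.0 / (2 * g + 1) for g in m]) for m in multis])
    total = sum(c for c, m in zip(contrib, multis) if sum(m) > 0)
    S1 = np.array([sum(c for c, m in zip(contrib, multis)
                       if m[i] > 0 and sum(m) == m[i]) for i in range(d)]) / total
    ST = np.array([sum(c for c, m in zip(contrib, multis) if m[i] > 0)
                   for i in range(d)]) / total
    return S1, ST

rng = np.random.default_rng(3)
f = lambda X: X[:, 0] + 0.5 * X[:, 1] ** 2 + X[:, 0] * X[:, 1]
S1, ST = pce_sobol(f, d=2, degree=3, n=2000, rng=rng)
```

Since this test function is itself a low-degree polynomial, the expansion is exact and the indices match the analytic values (S1 = 5/7 and 1/21, ST = 20/21 and 2/7); for expensive simulators, the same coefficient bookkeeping replaces thousands of Monte Carlo model runs.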
6. Future Directions and Recent Paradigms
Contemporary research is driving fundamental shifts in GSA:
- Arbitrary variability measures: In the "factorial experiment" paradigm (Mazo, 10 Sep 2024), sensitivity maps are defined for any divergence, enabling analysis of arbitrary distributional features, not only variance, without requiring independence or functional decomposability. Factorial effects (with or without Möbius or Shapley weighting) encompass the classical Sobol' and Shapley indices as special cases, unify the field, and clarify the interpretation of interactions and main effects even for dependent inputs or user-defined importance measures.
- Integration with Bayesian optimization and design: Sensitivity-aware search is used in multi-objective optimization, e.g., for high-throughput materials discovery, where rapid identification of key variables accelerates Pareto front sampling in large combinatorial spaces (Comlek et al., 2023).
- Efficient analysis under rare events and complex uncertainty: Methods that combine rare-event simulation (subset simulation) with PCE surrogates enable efficient GSA of extremely small output probabilities, e.g., in reliability/failure analysis or subsurface flow (Merritt et al., 2021).
- Generality and extensibility: Advanced GSA frameworks now accommodate stochastic models, dependent or partially known input distributions, mixed-variable designs, and target tail-focused or information-theoretic uncertainty quantification.
7. Summary Table: Major GSA Approaches
| Approach/Index | Key Features | Typical Application |
|---|---|---|
| Sobol' indices (variance-based) | Decomposes total variance; interpretable main/total/interaction effects | Physical, engineering, regression models |
| HSIC/dependence measures | Captures nonlinear dependence; efficient for non-Gaussian relations | Simulation, process models |
| Derivative-based (DGSM, entropic) | Uses partial derivatives; links to variance and entropy | Screening; high-dimensional/mixed distributions |
| Polynomial chaos/tensor surrogates | Drastically reduces computation in high dimensions; analytic index formulas | Engineering PDEs, stochastic simulators |
| Shapley effects | Additive; robust to interactions and dependence; cooperative-game interpretation | Model interpretability, complex statistics |
| Extremum/local/conditional indices | Sensitivity in tails/exceedances; MC filtering, skewness, ridge approximations | Reliability, risk, optimization |
References
- (Becker et al., 2014) Exploring Hoover and Perez's experimental designs using global sensitivity analysis
- (Janon et al., 2014) Global sensitivity analysis for the boundary control of an open channel
- (Boas et al., 2015) A global sensitivity analysis approach for morphogenesis models
- (Hart et al., 2017) Global sensitivity analysis for statistical model parameters
- (Palar et al., 2017) Global Sensitivity Analysis via Multi-Fidelity Polynomial Chaos Expansion
- (Etoré et al., 2018) Global sensitivity analysis for models described by stochastic differential equations
- (Meynaoui et al., 2019) New statistical methodology for second level global sensitivity analysis
- (Wong et al., 2019) Extremum Sensitivity Analysis with Polynomial Monte Carlo Filtering
- (Lin et al., 2020) Global Sensitivity Analysis in Load Modeling via Low-rank Tensor
- (Gamboa et al., 2020) Global Sensitivity Analysis: a new generation of mighty estimators based on rank statistics
- (Zhu et al., 2020) Global sensitivity analysis for stochastic simulators based on generalized lambda surrogate models
- (Goda, 2020) A simple algorithm for global sensitivity analysis with Shapley effects
- (Pyrkov et al., 2021) Global sensitivity analysis for optimization of the Trotter-Suzuki decomposition
- (Ballester-Ripoll et al., 2021) Global sensitivity analysis in probabilistic graphical models
- (Merritt et al., 2021) Global sensitivity analysis of rare event probabilities
- (Yang, 2023) Derivative based global sensitivity analysis and its entropic link
- (Comlek et al., 2023) Mixed-Variable Global Sensitivity Analysis For Knowledge Discovery And Efficient Combinatorial Materials Design
- (Ballester-Ripoll et al., 9 Jun 2024) Global Sensitivity Analysis of Uncertain Parameters in Bayesian Networks
- (Vannucci et al., 19 Jul 2024) Enhancing Variable Importance in Random Forests: A Novel Application of Global Sensitivity Analysis
- (Mazo, 10 Sep 2024) A new paradigm for global sensitivity analysis