Sensitivity-Efficient Estimators
- Sensitivity-efficient estimators are statistical methods that minimize estimation error in global sensitivity analysis by leveraging controlled surrogate modeling and rigorous risk bounds.
- They employ tensor-product metamodels and orthogonal expansions to compute Sobol’ indices, effectively balancing bias and variance under varying noise levels.
- Empirical validations using benchmark functions confirm rapid error decay and robust performance, guiding practical choices in basis selection and sample allocation.
Sensitivity-efficient estimators are statistical methods constructed to minimize the estimation error or risk associated with global sensitivity analysis indices, particularly under constraints such as limited sample size, model complexity, or the use of surrogate models. These estimators are designed to use available information and computational resources efficiently, providing accurate and reliable estimates of measures such as Sobol’ indices, variance-based or quantile-oriented indices, and Shapley effects. The theoretical foundation relies on bounding the error of index estimation by leveraging properties of function approximation, sample splitting, orthogonal expansions, and efficient risk bounds, thereby ensuring that estimator performance can be explicitly guaranteed and optimized.
1. Definitions and Sobol’ Index Risk Bounds
Let $f(x)$ be the model output, with input $x = (x_1, \dots, x_d)$ distributed according to a product measure $\mu$ (independent components). The key sensitivity indices are:
- First-order Sobol’ index for a variable subset $U \subseteq \{1, \dots, d\}$: $S_U = D_U / D$,
where $D_U = \Var_\mu[f_U(x_U)]$ is the partial variance from the Hoeffding decomposition, and $D = \Var_\mu[f(x)]$ is the total variance.
- Total-effect index: $T_U = \sum_{V \,:\, V \cap U \neq \emptyset} D_V \big/ D$, summing the Hoeffding components over all subsets $V$ that interact with $U$.
A “sensitivity-efficient estimator” of $S_U$ or $T_U$ is one whose estimation error is sharply bounded in terms of the surrogate approximation error $E = \|f-\hat f_N\|_{L^2(\mu)}/\sqrt{\Var_\mu[f]}$: to first order in $E$, $\lvert S_U(\hat f_N) - S_U(f) \rvert \le 2E + O(E^2)$ and $\lvert T_U(\hat f_N) - T_U(f) \rvert \le 2E + O(E^2)$.
Thus, the error in sensitivity index estimation is directly controlled by the approximation error of the surrogate to the true model.
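As a quick numerical illustration of this error propagation, the sketch below compares the exact first-order indices of a linear model and a perturbed surrogate (both hypothetical, with coefficients chosen purely for illustration) against the first-order risk budget of order $E$:

```python
import numpy as np

def sobol_first_order_linear(coeffs):
    """Exact first-order Sobol' indices of f(x) = sum_i c_i * x_i with
    independent inputs of equal variance: S_i = c_i^2 / sum_j c_j^2."""
    c = np.asarray(coeffs, dtype=float)
    return c ** 2 / np.sum(c ** 2)

# Hypothetical true model and perturbed surrogate (illustrative only).
c_true = np.array([3.0, 2.0, 1.0])
c_surr = np.array([3.1, 1.9, 1.0])

S_true = sobol_first_order_linear(c_true)
S_surr = sobol_first_order_linear(c_surr)

# Normalized L2(mu) surrogate error E = ||f - f_hat|| / sqrt(Var f);
# for a linear model the variances add coefficient-wise.
E = np.linalg.norm(c_surr - c_true) / np.linalg.norm(c_true)

# Observed index errors vs. the first-order risk budget 2E + E^2.
index_err = np.abs(S_surr - S_true)
budget = 2 * E + E ** 2
```

Here the surrogate error `E` is computed in closed form because the model is linear; for a general model it would be estimated by holdout RMSE as described in Section 4.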
2. Sensitivity-Efficient Estimation via Tensor-Product Metamodels
Metamodel-based sensitivity analysis leverages orthogonal expansions $\hat f_N(x) = \sum_{\lambda \in \Lambda} \hat c_\lambda\, \phi_\lambda(x)$, where $(\phi_\lambda)$ is a suitable orthonormal basis of $L^2(\mu)$ (e.g., Legendre, Chebyshev, or trigonometric polynomials), $c_\lambda = \langle f, \phi_\lambda \rangle_{L^2(\mu)}$ are the exact coefficients, and $\Lambda$ is a truncation set of size $m = |\Lambda|$. Given samples $(x^{(i)}, f(x^{(i)}))$, $i = 1, \dots, N$:
- Projection estimator: $\hat c_\lambda = \frac{1}{N} \sum_{i=1}^{N} f(x^{(i)})\, \phi_\lambda(x^{(i)})$, the Monte Carlo approximation of $c_\lambda$.
- Ordinary least squares (OLS) estimator: solves $\hat c = \operatorname*{arg\,min}_{c \in \mathbb{R}^m} \sum_{i=1}^{N} \big( f(x^{(i)}) - \sum_{\lambda \in \Lambda} c_\lambda \phi_\lambda(x^{(i)}) \big)^2$, with solution $\hat c = (\Phi^\top \Phi)^{-1} \Phi^\top y$, where $\Phi_{i\lambda} = \phi_\lambda(x^{(i)})$ and $y_i = f(x^{(i)})$.
Sobol’ indices $\hat S_U$ and $\hat T_U$ for the surrogate are computed as variance ratios of the squared coefficients whose multi-indices are supported on the subset $U$: $\hat S_U = \sum_{\operatorname{supp}(\lambda) = U} \hat c_\lambda^2 \big/ \sum_{\lambda \neq 0} \hat c_\lambda^2$, and $\hat T_U$ sums over all $\lambda$ with $\operatorname{supp}(\lambda) \cap U \neq \emptyset$.
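A minimal sketch of this construction, assuming uniform inputs on $[-1,1]^d$ and a total-degree Legendre truncation (the toy model, sample size, and degree are illustrative choices, not prescribed by the text):

```python
import numpy as np
from numpy.polynomial.legendre import legval

rng = np.random.default_rng(0)

def design_matrix(X, degree):
    """Tensor-product orthonormal Legendre features for inputs uniform
    on [-1, 1]^d, restricted to total degree <= `degree`."""
    N, d = X.shape
    # Univariate orthonormal Legendre values sqrt(2k+1) * P_k(x_j).
    uni = [np.stack([np.sqrt(2 * k + 1) * legval(X[:, j], np.eye(degree + 1)[k])
                     for k in range(degree + 1)], axis=1)
           for j in range(d)]
    idx = [a for a in np.ndindex(*([degree + 1] * d)) if sum(a) <= degree]
    Phi = np.stack([np.prod([uni[j][:, a[j]] for j in range(d)], axis=0)
                    for a in idx], axis=1)
    return Phi, idx

def sobol_from_coeffs(coeffs, idx, U):
    """Closed-form Sobol' index of subset U from expansion coefficients:
    ratio of squared coefficients supported exactly on U to total variance."""
    num = sum(c ** 2 for c, a in zip(coeffs, idx)
              if set(np.nonzero(a)[0]) == set(U))
    den = sum(c ** 2 for c, a in zip(coeffs, idx) if any(a))
    return num / den

# Illustrative model (not from the text): f(x) = x1 + 2*x2 + x1*x2,
# whose exact first-order indices are S1 = 3/16 and S2 = 3/4.
f = lambda X: X[:, 0] + 2 * X[:, 1] + X[:, 0] * X[:, 1]

X = rng.uniform(-1, 1, size=(400, 2))
Phi, idx = design_matrix(X, degree=3)
coeffs, *_ = np.linalg.lstsq(Phi, f(X), rcond=None)  # OLS metamodel

S1 = sobol_from_coeffs(coeffs, idx, {0})
S2 = sobol_from_coeffs(coeffs, idx, {1})
```

Because the toy model lies in the span of the truncated basis and the data are noiseless, the OLS surrogate is exact and the indices match their analytic values to machine precision.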
Theorem (general Sobol’-error bound): for any surrogate with normalized error $E$, $\lvert \hat S_U - S_U \rvert \le 2E + O(E^2)$ uniformly over subsets $U$. Further, in the random design/noise setting (with $R^2 = \mathbb{E} \|f - \hat f_N\|^2_{L^2(\mu)} / \Var_\mu[f]$), $\mathbb{E}\, \lvert \hat S_U - S_U \rvert^2 = O(R^2)$.
Thus, risk control for Sobol’ estimators reduces entirely to mean-square surrogate error control.
3. Nonasymptotic and Asymptotic Convergence Rates
Assume $f$ is $s$-smooth (e.g., lies in a Sobolev-type ball) in $d$ variables; then
- Legendre (algebraic) basis: truncation bias $\|f - f_\Lambda\|_{L^2(\mu)} \lesssim m^{-s/d}$ with $m = |\Lambda|$, while OLS stability requires roughly $m^2 \lesssim N$ (up to logarithmic factors).
- Trigonometric/Chebyshev basis: the same bias rate $m^{-s/d}$, but with uniformly bounded basis functions, so stability requires only $m \lesssim N$ (up to logarithmic factors).
In the noiseless OLS regime ($\sigma = 0$), balancing bias against stability yields $m \asymp \sqrt{N}$ for Legendre and $m \asymp N$ (up to logs) for trigonometric, resulting in MSE rates up to $N^{-s/d}$ or $N^{-2s/d}$, respectively. For the noisy case, balancing the approximation bias $m^{-2s/d}$ with the variance $\sigma^2 m / N$ gives the minimax-optimal MSE rate $N^{-2s/(2s+d)}$.
These rates surpass the Stone minimax rate $N^{-2s/(2s+d)}$ in the absence of noise and confirm that, for sensitivity-efficient estimators based on stable, well-chosen bases, index risk decays rapidly with the sample size $N$ and the smoothness $s$.
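The rate bookkeeping above can be packaged as a small helper. The function below is a hypothetical planner (its name and thresholds are illustrative): it applies the Legendre stability limit $m \asymp \sqrt{N}$ in the noiseless case and the bias/variance balance $m \asymp N^{d/(2s+d)}$ otherwise, returning the corresponding MSE scale:

```python
# Hypothetical planning helper, following the balances stated above
# (Legendre basis; constants and rounding are illustrative).
def plan_truncation(N, s, d, sigma):
    if sigma == 0.0:
        # Noiseless: grow m up to the Legendre stability limit m ~ sqrt(N);
        # the MSE is then bias-limited, of order m^{-2s/d} = N^{-s/d}.
        m = int(N ** 0.5)
        mse_rate = N ** (-s / d)
    else:
        # Noisy: balance bias m^{-2s/d} against variance sigma^2 * m / N,
        # giving m ~ N^{d/(2s+d)} and the minimax MSE N^{-2s/(2s+d)}.
        m = max(1, int(N ** (d / (2 * s + d))))
        mse_rate = N ** (-2 * s / (2 * s + d))
    return m, mse_rate

m_noisy, mse_noisy = plan_truncation(10_000, s=2, d=2, sigma=1.0)
m_clean, mse_clean = plan_truncation(10_000, s=2, d=2, sigma=0.0)
```

For $N = 10^4$, $s = d = 2$, the noiseless regime admits a much larger truncation (and a much smaller MSE scale) than the noisy one, which is the qualitative point of this section.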
4. Algorithmic Construction and Practical Guidelines
An effective workflow for constructing sensitivity-efficient estimators:
- Assess model smoothness to select appropriate basis/truncation.
- Determine the truncation set $\Lambda$: for noiseless/low-noise data, maximize $m = |\Lambda|$ within stability constraints (e.g., $m^2 \lesssim N$ for Legendre, $m \lesssim N$ for trigonometric); otherwise, balance the bias $m^{-2s/d}$ against the variance $\sigma^2 m / N$.
- Construct metamodel (projection or OLS; ensure well-conditioned information matrix).
- Compute the holdout RMSE and the normalized error estimate $\hat E$ (via cross-validation or a validation set).
- Apply the risk bounds: the single estimate $\hat E$ guarantees an index error of order $\hat E$ simultaneously for all subsets $U$.
- Optimize the bias-variance trade-off by varying $m$ and $N$ as resources permit.
- For high-accuracy requirements, use trigonometric bases if possible due to superior stability and convergence.
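The workflow above can be sketched end to end. The model, feature set, and split sizes below are illustrative stand-ins (plain monomials rather than an orthonormal basis, for brevity); the final line converts the holdout error estimate into a first-order index-error budget:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical test model (illustration only).
f = lambda X: np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

X = rng.uniform(-1, 1, size=(600, 2))
y = f(X)
train, val = slice(0, 400), slice(400, 600)

# Total-degree-4 monomial features: a deliberately small stand-in
# for an orthonormal basis.
powers = [(i, j) for i in range(5) for j in range(5) if i + j <= 4]
phi = lambda X: np.stack([X[:, 0] ** i * X[:, 1] ** j for i, j in powers],
                         axis=1)

# Fit the metamodel on the training split (OLS).
coef, *_ = np.linalg.lstsq(phi(X[train]), y[train], rcond=None)

# Holdout RMSE on the validation split, normalized by the output spread.
pred = phi(X[val]) @ coef
rmse = np.sqrt(np.mean((y[val] - pred) ** 2))
E_hat = rmse / np.std(y[val])

# First-order risk budget, valid simultaneously for every subset U.
index_error_budget = 2 * E_hat + E_hat ** 2
```

The same holdout quantity `E_hat` certifies every index at once, which is what makes the risk-bound step cheap relative to re-estimating each index separately.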
5. Empirical Validation and Error Control
Empirical studies on analytic benchmarks (the Sobol’ $g$-function, the Ishigami function) confirm that:
- The deterministic and probabilistic risk bounds are tight.
- For small or near-one values of $S_U$, refined (tighter) error bounds are achieved.
- The theoretical RMSE correlates closely with the observed index error in practice, yielding tighter guarantees than bootstrap-based confidence intervals, especially in finite samples or in the presence of metamodel bias.
- Risk decay versus $N$ matches the predicted rates, and the bias-variance separation is evident: increasing $m$ reduces bias but increases variance for small $N$, and vice versa.
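As one reproducible check in this spirit, the sketch below estimates the first-order indices of the Ishigami function with a standard pick-freeze Monte Carlo estimator and compares them with the known analytic values (this estimator is used here as a simple stand-in for the metamodel-based one):

```python
import numpy as np

rng = np.random.default_rng(2)

def ishigami(X, a=7.0, b=0.1):
    """Ishigami benchmark on [-pi, pi]^3."""
    return (np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2
            + b * X[:, 2] ** 4 * np.sin(X[:, 0]))

def first_order_pick_freeze(f, d, N, rng):
    """Standard pick-freeze Monte Carlo estimator of first-order indices."""
    A = rng.uniform(-np.pi, np.pi, size=(N, d))
    B = rng.uniform(-np.pi, np.pi, size=(N, d))
    yA = f(A)
    mean_y, var_y = yA.mean(), yA.var()
    S = np.empty(d)
    for i in range(d):
        C = B.copy()
        C[:, i] = A[:, i]                # freeze coordinate i
        S[i] = (np.mean(yA * f(C)) - mean_y ** 2) / var_y
    return S

S_hat = first_order_pick_freeze(ishigami, d=3, N=200_000, rng=rng)
# Analytic values for a=7, b=0.1: S1 ≈ 0.3139, S2 ≈ 0.4424, S3 = 0.
```

At this sample size the Monte Carlo estimates land within a few hundredths of the analytic values, so the benchmark serves as a ground truth against which metamodel-based risk bounds can be validated.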
6. Practical Significance in Sensitivity Analysis
Sensitivity-efficient estimators enable reliable and computationally tractable global sensitivity analysis for complex models, especially where:
- Model evaluations are costly, and surrogate modeling is necessary,
- The number of input variables $d$ is moderate to large,
- The practitioner demands provable control on estimation error and wishes to balance effort between model runs and statistical risk.
By explicitly relating Sobol’ index risk to metamodel accuracy, practitioners can monitor and guarantee sensitivity estimation quality throughout the analysis workflow, leading to improved reliability, especially for risk-critical applications. These methods are robust to noise and provide clear guidance for sample allocation, basis selection, and adaptive refinement.
7. Summary Table: Key Properties of Sensitivity-Efficient Metamodel Estimators
| Property | Methodology | Achievable Rate/Bound |
|---|---|---|
| Deterministic index error bound | Any basis/projection | $\lvert \hat S_U - S_U \rvert \le 2E + O(E^2)$ |
| Mean-square index risk | General metamodel | $\mathbb{E}\, \lvert \hat S_U - S_U \rvert^2 = O(R^2)$ |
| Convergence rate | Noiseless OLS, trigonometric basis | MSE up to $N^{-2s/d}$ (up to log factors) |
| Convergence rate | Noisy data, balanced bias/variance | MSE $N^{-2s/(2s+d)}$ |
| Index risk control mechanism | Holdout RMSE | Templated: input/output agnostic |
| Robustness to bias | Surrogate error propagation | Always upper-bounds index error |
Sensitivity-efficient estimators thus represent a rigorous, practical, and theoretically sound solution to risk control in global sensitivity analysis via metamodeling, directly connecting estimation risk to controlled surrogate modeling error.