Amplitude-Based Input Attribution

Updated 6 October 2025
  • Amplitude-based input attribution scores decompose a model’s prediction by multiplying the difference between input values and a baseline by the model’s local gradient.
  • They extend to higher-order derivatives, capturing both independent and interactive feature effects for robust explanatory power.
  • Empirical validations show that adhering to principles like minimal approximation error and unbiased baseline selection significantly improves attribution fidelity.

Amplitude-based input attribution scores are a class of feature explanation methods that quantify how much the magnitude (“amplitude”) of the difference between an input and a reference point, when multiplied by a derivative of the model output, contributes to the overall prediction. These scores are central in explaining deep neural networks and other machine learning models by decomposing the model’s prediction into additive contributions from each input variable, with the core principle rooted in the Taylor expansion of the predictive function. Amplitude-based approaches are characterized by their explicit dependence on the numerical difference between an input feature and a baseline, thereby giving a quantitative measure of each feature’s importance grounded in the model’s local or path-wise sensitivity.

1. Theoretical Foundations: Taylor Attribution Framework

Amplitude-based input attribution scores are systematically unified under the Taylor attribution framework (Deng et al., 2020), which models the output difference between an input $x$ and a baseline $\tilde{x}$ by a finite Taylor series expansion:

$$f(\tilde{x}) - f(x) \approx g_K(x, \Delta) = \sum_{|\kappa| \leq K} \frac{1}{\kappa!} \frac{\partial^{|\kappa|} f}{\partial x^{\kappa}}(x)\, \Delta^{\kappa}$$

where $\Delta = \tilde{x} - x$ and each term (indexed by the multi-index $\kappa$) is a product of higher-order derivatives and amplitude powers.

For the first-order (linear) case:

$$f(\tilde{x}) - f(x) \approx \sum_i \frac{\partial f}{\partial x_i}\bigg|_{x} (\tilde{x}_i - x_i)$$

The amplitude-based attribution for feature $i$ is then the product of its amplitude difference and the output’s sensitivity to that feature:

$$a_i = \frac{\partial f}{\partial x_i} (\tilde{x}_i - x_i)$$

This decomposition naturally extends to higher-order expansions, introducing second-order “independent” and “interactive” effects between features. The amplitude-based interpretation holds across all such terms, allocating the output change proportionally to each feature according to its contribution’s magnitude.
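
As a concrete illustration, the following is a minimal PyTorch sketch of the first-order scheme, using a hand-rolled toy model `f` (our own example, not from the paper); any differentiable model would work the same way.

```python
import torch

def f(x: torch.Tensor) -> torch.Tensor:
    # Toy nonlinear scalar model of three input features (an assumption
    # for illustration, not a model from the cited work).
    return x[0] ** 2 + 3.0 * x[1] + torch.sin(x[2])

x = torch.tensor([1.0, 2.0, 0.5])        # expansion point
x_tilde = torch.tensor([1.5, 1.0, 0.0])  # baseline

xg = x.clone().requires_grad_(True)
grad = torch.autograd.grad(f(xg), xg)[0]  # local gradient at x
a = grad * (x_tilde - x)                  # a_i = (df/dx_i) * (x~_i - x_i)

print(a)                           # per-feature attributions
print(a.sum().item())              # first-order estimate of f(x~) - f(x)
print((f(x_tilde) - f(x)).item())  # exact output difference, for comparison
```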

2. Reformulation of Mainstream Attribution Methods

Seven mainstream attribution algorithms can be rewritten in the Taylor framework as specific choices of terms and baselines (Deng et al., 2020). The table below summarizes the reformulations:

| Method | Taylor Terms Captured | Baseline Selection |
|--------|----------------------|--------------------|
| Gradient $\times$ Input | First-order term only | Baseline $= 0$ |
| Occlusion-1 | First-order and feature-specific higher-order diagonal term | Feature zeroed |
| Occlusion-patch | First-order, within-patch higher-order, within-patch interactions | Patch zeroed |
| DeepLIFT / $\varepsilon$-LRP | First-order and higher-order terms, baseline-propagated | Layer-wise, user-specified |
| Integrated Gradients | First-order, higher-order, split interactions | Path-integrated baseline |
| Expected Gradients | Averaged Taylor decompositions (over baseline distribution) | Baseline distribution |

All of these methods multiply an amplitude $\Delta_i$ by one or more derivatives, but differ in order, assignment, and baseline handling. Methods that average over multiple baselines (e.g., Expected Gradients) control for baseline selection bias and often achieve higher-fidelity attributions by capturing more of the function’s variability along different paths (Deng et al., 2020).
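
A hedged sketch of the two path- and distribution-based entries, again on an assumed toy model `f` rather than any reference implementation: Integrated Gradients averages the gradient along the straight path from baseline to input (here via a plain Riemann sum), and Expected Gradients additionally averages over a distribution of baselines.

```python
import torch

def f(x):
    # Same assumed toy model as above.
    return x[0] ** 2 + 3.0 * x[1] + torch.sin(x[2])

def integrated_gradients(x, baseline, steps=64):
    # Riemann-sum approximation of the path integral of the gradient
    # from the baseline to the input, scaled by the amplitude.
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        total += torch.autograd.grad(f(point), point)[0]
    return (x - baseline) * total / steps

def expected_gradients(x, baselines, steps=64):
    # Average the path-integrated attributions over sampled baselines,
    # which controls for the bias of any single reference point.
    return torch.stack([integrated_gradients(x, b, steps) for b in baselines]).mean(0)

x = torch.tensor([1.5, 1.0, 0.0])
print(integrated_gradients(x, torch.zeros(3)))
print(expected_gradients(x, [0.1 * torch.randn(3) for _ in range(8)]))
```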

3. Principles for Reliable Amplitude-Based Attribution

The Taylor attribution framework motivates three principles for high-quality attribution (Deng et al., 2020):

  1. Low Approximation Error: The chosen Taylor expansion should closely approximate the true $f(\tilde{x}) - f(x)$. Failing to do so omits significant effects, especially in highly nonlinear regions.
  2. Correct Contribution Assignment: Each Taylor term (independent or interaction) must be allocated to the corresponding feature(s) without leakage.
  3. Unbiased Baseline Selection: The choice of baseline $\tilde{x}$ should not introduce artificial bias; a poor baseline misrepresents the true amplitude and distorts importance measures.

Empirical comparison reveals a strong positive correlation between the number of these principles satisfied and observed fidelity/localization metrics across benchmarks such as MNIST and ImageNet.
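
A small sketch of what checking Principle 1 can look like in practice, under the same assumed toy model: compare the first- and second-order Taylor estimates of $f(\tilde{x}) - f(x)$ against the exact difference.

```python
import torch
from torch.autograd.functional import hessian

def f(x):
    return x[0] ** 2 + 3.0 * x[1] + torch.sin(x[2])  # assumed toy model

x = torch.tensor([1.0, 2.0, 0.5])        # expansion point
x_tilde = torch.tensor([1.5, 1.0, 0.0])  # baseline
delta = x_tilde - x

xg = x.clone().requires_grad_(True)
grad = torch.autograd.grad(f(xg), xg)[0]
H = hessian(f, x)  # full Hessian at the expansion point

exact = (f(x_tilde) - f(x)).item()
first = torch.dot(grad, delta).item()
second = first + 0.5 * torch.dot(delta, H @ delta).item()

# The gap |exact - first| (resp. |exact - second|) is the approximation
# error that the first (resp. second) Taylor order leaves unexplained.
print(f"exact={exact:.4f}  first={first:.4f}  second={second:.4f}")
```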

4. Amplitude, Higher-Order Terms, and Baseline Effects

In the amplitude-based paradigm, the essential element is the scale of the input difference:

$$a_i = \frac{\partial f}{\partial x_i} \cdot (x_i - \tilde{x}_i)$$

(This convention attributes $f(x) - f(\tilde{x})$, the sign-flipped counterpart of the difference expanded in Section 1; the amplitude is the input’s displacement from the baseline.)

Higher-order terms account for curvature and inter-feature effects:

  • Independent second-order: $\tfrac{1}{2} \frac{\partial^2 f}{\partial x_i^2} (x_i - \tilde{x}_i)^2$
  • Interactive second-order: $\tfrac{1}{2} \frac{\partial^2 f}{\partial x_i \partial x_j} (x_i - \tilde{x}_i)(x_j - \tilde{x}_j)$, assigned (e.g., split equally) between features $i$ and $j$

Correct partitioning of these interactions is critical for the completeness of attributions. Amplitude-based scores ignoring higher-order or interaction terms may fail to “explain” output changes in complex models, particularly those with strong feature interactions (Deng et al., 2023).
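
The equal split can be written down directly from the Hessian. The sketch below (our own two-feature toy example with a deliberate $x_0 x_1$ interaction, expanded at the baseline) recovers the exact output change because the model is quadratic.

```python
import torch
from torch.autograd.functional import hessian

def f(x):
    # Toy quadratic model with an explicit x0*x1 interaction.
    return x[0] * x[1] + x[0] ** 2

x_tilde = torch.tensor([0.0, 0.0])  # baseline (also the expansion point here)
x = torch.tensor([1.0, 2.0])        # input being explained
delta = x - x_tilde                 # amplitude, Section 4 convention

bg = x_tilde.clone().requires_grad_(True)
grad = torch.autograd.grad(f(bg), bg)[0]  # gradient at the baseline
H = hessian(f, x_tilde)                   # Hessian at the baseline

first = grad * delta                          # first-order terms
second = 0.5 * H * torch.outer(delta, delta)  # matrix of (1/2) f_ij d_i d_j
independent = torch.diagonal(second)          # (1/2) f_ii d_i^2
interactive = second.sum(dim=1) - independent # half of each cross term per feature

a = first + independent + interactive
print(a)                                           # tensor([2., 1.])
print(a.sum().item(), (f(x) - f(x_tilde)).item())  # both 3.0: decomposition is complete
```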

Moreover, baseline selection heavily influences amplitude values. For example, setting $\tilde{x} = 0$ can result in inflated attributions for features with inherently large magnitudes, regardless of their actual causal importance, underlining the necessity of establishing a meaningful baseline (Deng et al., 2020).
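
A toy numerical illustration of this baseline effect (our own example, not from the paper): in the linear model below, the weakly weighted but large-magnitude feature $x_0$ dominates under a zero baseline, while a data-mean baseline assigns it no credit.

```python
import torch

def f(x):
    # Linear model: x0 has a large typical magnitude but a tiny weight.
    return 0.1 * x[0] + 10.0 * x[1]

x = torch.tensor([100.0, 0.5], requires_grad=True)
grad = torch.autograd.grad(f(x), x)[0]
x = x.detach()

zero_baseline = torch.zeros(2)
mean_baseline = torch.tensor([100.0, 0.0])  # e.g., an (assumed) dataset mean

print(grad * (x - zero_baseline))  # tensor([10.,  5.]) -> x0 looks dominant
print(grad * (x - mean_baseline))  # tensor([0., 5.])   -> x1 correctly dominates
```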

5. Empirical Validation and Application

Empirical validation on benchmark datasets shows that Taylor-reformulated amplitude-based scores closely track the attributions produced by their heuristic originals, with near-zero average percentage change when the Taylor order matches model complexity (e.g., MLPs on MNIST) (Deng et al., 2020). Large discrepancies emerge only in regions where the Taylor approximation is insufficiently accurate, highlighting the need for method-model alignment.

Performance metrics for attribution fidelity, such as “infidelity” (how much perturbing high-attribution features changes the output) and object localization accuracy, demonstrate pronounced improvement when amplitude-based attributions adhere to all three principles. For instance, both Integrated Gradients and Expected Gradients (averaged over multiple baselines) consistently outperform single-baseline, purely first-order methods, due to greater coverage of the function’s nonlinearities and more robust feature assignment (Deng et al., 2020).
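
As a rough sketch of how such a fidelity check can be scored (a simplified variant of the infidelity idea, not the exact benchmark metric), one can compare the attribution’s predicted effect of random perturbations against the model’s actual output change:

```python
import torch

def f(x):
    return x[0] ** 2 + 3.0 * x[1] + torch.sin(x[2])  # assumed toy model

def infidelity(f, x, a, n_samples=256, scale=0.1):
    # Mean squared gap between the linear effect predicted by the
    # attribution vector a and the model's true response to perturbations.
    errors = []
    for _ in range(n_samples):
        pert = torch.randn_like(x) * scale
        predicted = torch.dot(pert, a)  # effect the attribution implies
        actual = f(x) - f(x - pert)     # effect the model actually shows
        errors.append((predicted - actual) ** 2)
    return torch.stack(errors).mean()

x = torch.tensor([1.5, 1.0, 0.0], requires_grad=True)
grad = torch.autograd.grad(f(x), x)[0]
x = x.detach()

print(infidelity(f, x, grad * x).item())  # gradient x input, zero baseline
```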

6. Limitations, Scope, and Theoretical Significance

Amplitude-based scores, while theoretically principled within the Taylor framework, are not universally optimal. Because they rely on the first-order (or locally linear) component, they under-explain regions of high nonlinearity or strong cross-feature interaction unless higher-order terms are included. The assignment of interaction terms and the careful selection of unbiased baselines remain nuanced, application-dependent tasks.

Nevertheless, the Taylor-based amplitude paradigm unifies the rationale behind a broad spectrum of widely deployed attribution methods, supplies a rigorous ground for evaluating their rationality and shortcomings, and provides a systematized view of how local, interpretable, and faithful input attributions can be constructed in deep learning and beyond. The strength of amplitude-based input attribution lies in the clarity of its mathematical structure—combining measured magnitude (amplitude) with directional sensitivity (gradient or higher-order derivative)—offering a direct and quantitative account of each feature’s role in the predictive mechanism as grounded in the model itself (Deng et al., 2020).
