
EAP-IG: Adaptive Integrated Gradients

Updated 21 April 2026
  • EAP-IG is a generalization of integrated gradients that uses adaptive, non-uniform weighting along the integration path to capture local data geometry.
  • It preserves the core axioms of standard IG while enabling flexible, data-driven path sampling methods like weighted RiemannOpt and tangential alignment.
  • EAP-IG improves model interpretability in tasks such as manifold alignment and circuit discovery, though it introduces extra hyperparameters and computational overhead.

EAP-IG generalizes the classical integrated gradients (IG) attribution framework by replacing the uniform integration (Riemann) measure along the straight-line path with a non-uniform or adaptive weighting. This construction expands the expressive capacity of path-based attributions, allowing integration schemes that adapt to the data or its geometry. EAP-IG (alternatively, "Expected Adaptive Path–IG," Editor's term) maintains the axiomatic core of IG under mild conditions, but diverges from standard uniqueness characterizations. Several recent directions explore its theoretical and practical impact, including manifold alignment (Simpson et al., 11 Mar 2025), faithfulness in circuit discovery (Hanna et al., 2024, Méloux et al., 1 Oct 2025), and weighted RiemannOpt approaches (Swain et al., 2024, Lundstrom et al., 2023).

1. Mathematical Definition of EAP-IG

Consider a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, a sample $x \in \mathbb{R}^n$ to be explained, and a baseline $x' \in \mathbb{R}^n$. Standard IG is defined by

$$\mathrm{IG}_i(x; x') = (x_i - x'_i) \int_0^1 \frac{\partial f(x' + t(x - x'))}{\partial x_i}\, dt$$

EAP-IG generalizes this by introducing a non-uniform weight function $w : [0,1] \to \mathbb{R}_+$ with $\int_0^1 w(t)\,dt = 1$, yielding

$$\mathrm{EAP}\_\mathrm{IG}_i(x; x') = (x_i - x'_i) \int_0^1 w(t)\, \frac{\partial f(x' + t(x - x'))}{\partial x_i}\, dt$$

where $w$ encodes the preference for sampling at particular locations along the path.
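
The definition above can be approximated numerically. The following is a minimal sketch, assuming a midpoint-rule discretization and a toy quadratic model; the names `eap_ig`, `grad_f`, and `weight` are hypothetical and chosen for illustration, not taken from any cited implementation.

```python
import numpy as np

def eap_ig(grad_f, x, baseline, weight, steps=256):
    """Midpoint-rule estimate of (x_i - x'_i) * int_0^1 w(t) df/dx_i(x' + t(x - x')) dt."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    t = (np.arange(steps) + 0.5) / steps             # midpoints of [0, 1]
    w = weight(t)
    w = w / w.mean()                                 # discrete analogue of int_0^1 w(t) dt = 1
    path = baseline + t[:, None] * (x - baseline)    # shape (steps, n)
    grads = np.stack([grad_f(p) for p in path])      # shape (steps, n)
    return (x - baseline) * (w[:, None] * grads).mean(axis=0)

# Toy model assumed for illustration: f(x) = sum(x^2), with gradient 2x.
f = lambda x: float(np.sum(x ** 2))
grad_f = lambda x: 2.0 * x

x = np.array([1.0, -2.0, 0.5])
baseline = np.zeros(3)

uniform = eap_ig(grad_f, x, baseline, weight=lambda t: np.ones_like(t))  # standard IG
late = eap_ig(grad_f, x, baseline, weight=lambda t: 2.0 * t)             # emphasizes the end of the path
```

For this quadratic, the uniform weight returns $x_i^2$ per coordinate, which sums to $f(x) - f(x')$, while the linear weight $w(t) = 2t$ returns roughly $\tfrac{4}{3} x_i^2$. The comparison illustrates how the choice of $w$ enters the attribution values.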

2. Axiomatic Properties and Uniqueness

EAP-IG is reported to preserve the completeness, linearity, implementation invariance, non-decreasing positivity, symmetry, and affine scale invariance axioms inherited from classical IG, so long as $w(t) \geq 0$ and $\int_0^1 w(t)\,dt = 1$ (Lundstrom et al., 2023). Explicitly:

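  • Completeness: by the chain rule, the attributions sum to a weighted integral of the path derivative,

$$\sum_{i=1}^n \mathrm{EAP}\_\mathrm{IG}_i(x; x') = \int_0^1 w(t)\, \frac{d}{dt} f(\gamma(t))\, dt$$

for the straight-line path $\gamma(t) = x' + t(x - x')$. For $w \equiv 1$ the right-hand side equals $f(x) - f(x')$ exactly; for a non-constant $w$ it is a $w$-weighted average of the path derivative, so equality with $f(x) - f(x')$ should be checked for the chosen $w$ rather than assumed.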

  • Positivity: If $f$ is non-decreasing along the path, positivity holds coordinatewise.
  • Symmetry-Preserving: If $f$ is invariant under swapping coordinates $i$ and $j$, and $x$ and $x'$ respect this symmetry, then $\mathrm{EAP}\_\mathrm{IG}_i(x; x') = \mathrm{EAP}\_\mathrm{IG}_j(x; x')$ for that pair $i \neq j$.

However, uniqueness results that single out the uniform weight ($w \equiv 1$) depend on an additional reparameterization-invariance axiom. Any non-constant $w$ (yielding EAP-IG) always preserves the classical componentwise axioms, but violates reparameterization invariance (Lundstrom et al., 2023).

3. Algorithmic Variants and Optimization of Integration Paths

The EAP-IG variant enables explicit optimization over the path integral weighting or support. This is operationalized via:

  • Weighted Riemann sampling: The integration breakpoints are chosen not uniformly, but to minimize the discretization error of the integral via a data-driven criterion, as in RiemannOpt (Swain et al., 2024). The optimal breakpoints minimize an objective [equation missing in source] that depends on estimates of the absolute derivative of the integrand at each breakpoint.

  • Tangentially Aligned Integrated Gradients: The baseline $x'$ can be optimized to maximize tangential alignment relative to the data manifold, yielding attributions lying in the manifold tangent space (Simpson et al., 11 Mar 2025).
  • Adaptive path selection in circuit mechanisms: In EAP-IG for circuit interpretability (Hanna et al., 2024, Méloux et al., 1 Oct 2025), weights or sampling locations are tailored to reflect intervention relevance or causal saliency.

| EAP-IG instantiation | Weighting scheme / path | Primary function |
|---|---|---|
| Uniform (standard IG) | $w \equiv 1$ | Canonical path |
| Data-driven RiemannOpt | $w$ via optimization of integrand variance | Noise/error reduction |
| Tangential alignment | $w$ induced by manifold geometry | Human-aligned support |
| Mechanism/circuit focus | $w$ reflects intervention range | Causal faithfulness |
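
The idea of non-uniform breakpoint placement can be sketched as follows. This is a simple heuristic stand-in, not the RiemannOpt procedure itself: it assumes a scalar integrand `g(t)` that can be evaluated on a coarse pilot grid, estimates the absolute slope by finite differences, and places breakpoints by inverting the cumulative density. The function names and the density floor are assumptions made for illustration.

```python
import numpy as np

def adaptive_breakpoints(g, n_points, n_pilot=64):
    """Place n_points breakpoints in [0, 1], denser where the estimated |g'(t)| is large."""
    pilot = np.linspace(0.0, 1.0, n_pilot)
    slope = np.abs(np.gradient(g(pilot), pilot))          # finite-difference estimate of |g'(t)|
    density = slope + slope.mean() + 1e-12                # floor keeps flat regions sampled
    increments = 0.5 * (density[1:] + density[:-1]) * np.diff(pilot)
    cdf = np.concatenate([[0.0], np.cumsum(increments)])
    cdf /= cdf[-1]                                        # normalize to a CDF on [0, 1]
    # Invert the CDF: equal probability mass between consecutive breakpoints.
    return np.interp(np.linspace(0.0, 1.0, n_points), cdf, pilot)

def trapezoid(g, t):
    """Trapezoid rule on an arbitrary increasing grid t."""
    y = g(t)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

# Toy integrand assumed for illustration: sharply varying near t = 0.
g = lambda t: np.exp(-50.0 * t)
exact = (1.0 - np.exp(-50.0)) / 50.0

breakpoints = adaptive_breakpoints(g, 16)
err_uniform = abs(trapezoid(g, np.linspace(0.0, 1.0, 16)) - exact)
err_adaptive = abs(trapezoid(g, breakpoints) - exact)
```

On this toy integrand, the adaptive grid gives a smaller quadrature error than a uniform grid with the same number of points. The pilot evaluations add cost of their own, which is one reason to compute breakpoints once and reuse them across inputs.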

4. Faithfulness, Variance, and Interpretability in Mechanistic Discovery

EAP-IG methods are central to recent progress in finding faithful circuit representations in large transformer models:

  • Faithfulness is defined as the preservation of task-specific performance after ablating all edges outside the discovered subgraph (Hanna et al., 2024). EAP-IG yields circuits that are more faithful (i.e., closer to the behavior of the full model) than vanilla EAP (edge attribution patching), especially at small circuit sizes. This is attributed to EAP-IG's avoidance of zero-gradient pathologies.
  • Recent work has framed EAP-IG circuits as statistical estimators, assessing structural and performance variance under multiple perturbations (Méloux et al., 1 Oct 2025). High variance and hyperparameter sensitivity, such as to the number of interpolation steps or intervention schemes, have been empirically demonstrated, necessitating routine reporting of stability metrics.

Key stability metrics include:

  • Circuit error (mean classification divergence)
  • Jaccard index (structural edge overlap variance)
  • Response under prompt paraphrasing, data resampling, and controlled random ablation

Best practices now recommend:

  • Reporting mean/variance of faithfulness and structure under resampling
  • Explicit justification and sensitivity sweeps of EAP-IG settings (aggregation, intervention choice)
  • Noise injection stress-testing to reveal instability modes
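
The structural-overlap reporting recommended above can be sketched as follows. This is a minimal example that assumes each discovered circuit is represented as a set of directed edges; the edge names and the helper `pairwise_jaccard` are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two edge sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def pairwise_jaccard(circuits):
    """Mean and population variance of the Jaccard index over all pairs of runs."""
    scores = [jaccard(a, b) for a, b in combinations(circuits, 2)]
    return mean(scores), pvariance(scores)

# Hypothetical circuits from three resampled runs; edges are (source, target) pairs.
runs = [
    {("a0", "m1"), ("m1", "a2"), ("a2", "out")},
    {("a0", "m1"), ("m1", "a2"), ("a1", "out")},
    {("a0", "m1"), ("a2", "out"), ("a1", "out")},
]
mean_j, var_j = pairwise_jaccard(runs)
```

Reporting both numbers separates circuits that are consistently similar from circuits whose average overlap hides large run-to-run swings.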

5. EAP-IG in Manifold-Constrained and Tangent-Space-Optimized Attribution

EAP-IG supports geometric regularization of the baseline and the integration path:

  • In Tangentially Aligned Integrated Gradients (TA-IG), the baseline $x'$ is selected so that the resultant attribution vector is maximally aligned to the tangent space $T_x\mathcal{M}$ of an empirical data manifold $\mathcal{M}$ (Simpson et al., 11 Mar 2025).
  • A tangential-alignment score formalizes this principle: the optimal baseline solver seeks the baseline $x'$ that maximizes this score.
  • Empirically, TA-IG yields attributions much more concentrated in perceptually meaningful, manifold-supported directions than standard baselines across several image datasets (e.g., [score missing in source] vs. [score missing in source] for common baselines).
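
One plausible way to quantify tangential alignment is the fraction of the attribution's norm that lies in the tangent space. The sketch below assumes that definition purely for illustration; the score used by Simpson et al. may differ. It also assumes the tangent space is given as a set of linearly independent spanning vectors.

```python
import numpy as np

def tangential_alignment(attribution, tangent_vectors):
    """Assumed score: ||P_T a|| / ||a||, where P_T projects onto span(tangent_vectors)."""
    a = np.asarray(attribution, dtype=float)
    basis = np.asarray(tangent_vectors, dtype=float).T   # shape (n, k), columns span the tangent space
    q, _ = np.linalg.qr(basis)                           # orthonormal basis of the same span
    projected = q @ (q.T @ a)                            # orthogonal projection of a
    norm = np.linalg.norm(a)
    return float(np.linalg.norm(projected) / norm) if norm > 0 else 0.0

# Toy tangent space assumed for illustration: the x-y plane in R^3.
tangent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
in_plane = tangential_alignment([3.0, 4.0, 0.0], tangent)     # fully tangential
normal = tangential_alignment([0.0, 0.0, 2.0], tangent)       # fully normal
mixed = tangential_alignment([1.0, 0.0, 1.0], tangent)        # partly tangential
```

Under this assumed definition, the score ranges from 0 (attribution orthogonal to the manifold) to 1 (attribution entirely within the tangent space).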

6. Limitations and Open Issues

EAP-IG introduces new classes of hyperparameters and potential sources of instability:

  • The optimality of a weight $w$ is context- and task-dependent. Faithfulness or interpretability gains may be offset by sensitivity to the choice of $w$, the method for baseline selection, or the geometry of the manifold encoder.
  • Convergence of implicit optimization (e.g., for tangent alignment) is not guaranteed in nonconvex regimes; local minima can yield only approximately tangential attributions.
  • Computational cost increases linearly with the number of integration steps (for most implementations), and for manifold optimization, further overhead arises from tangent estimation.
  • The explanatory utility of EAP-IG variants is bounded by the quality of the generative/discriminative manifold model and the faithfulness of surrogates in physical-design tasks.

Rigid axiomatic uniqueness is only preserved for uniform $w$; deviations require careful justification in each context.
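
The trade-off between step count and accuracy can be probed with a small sweep. The sketch below assumes uniform weighting, a midpoint rule, and a toy saturating model; it records the gap between the attribution sum and $f(x) - f(x')$ for several step counts. Each step costs one gradient evaluation, so cost grows linearly with `steps`. All names are hypothetical.

```python
import numpy as np

def ig_uniform(grad_f, x, baseline, steps):
    """Midpoint-rule IG with uniform weighting; uses `steps` gradient evaluations."""
    t = (np.arange(steps) + 0.5) / steps
    path = baseline + t[:, None] * (x - baseline)
    grads = np.stack([grad_f(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

# Toy saturating model assumed for illustration: f(x) = sum(tanh(3x)).
f = lambda x: float(np.sum(np.tanh(3.0 * x)))
grad_f = lambda x: 3.0 / np.cosh(3.0 * x) ** 2

x = np.array([1.0, -0.5, 2.0])
baseline = np.zeros(3)

gaps = {}
for steps in (4, 16, 64, 256):
    attr = ig_uniform(grad_f, x, baseline, steps)
    gaps[steps] = abs(attr.sum() - (f(x) - f(baseline)))   # discretization gap
```

Plotting or tabulating `gaps` against `steps` gives a simple sensitivity report for the interpolation-step hyperparameter.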

7. Practical Guidelines

For effective use of EAP-IG:

  • For circuit discovery in transformers, use a fixed number of interpolation steps [value missing in source] and greedily expand the subgraph until a target edge coverage [percentage missing in source] or the target normalized faithfulness is achieved (Hanna et al., 2024).
  • When optimizing Riemann weights for noise minimization, precompute breakpoints on a representative validation subset and reuse for bulk attribution (Swain et al., 2024).
  • For tangentially aligned IG, set latent dimensionality of the autoencoder according to observed data manifold rank, and apply regular projection to keep optimized baselines on manifold (Simpson et al., 11 Mar 2025).
  • Always report circuit faithfulness, Jaccard overlap, and performance variance under data and hyperparameter perturbations, and perform robustness checks with noise injection (Méloux et al., 1 Oct 2025).
  • In image-based tasks, use high percentile clipping of saliency maps and threshold overlays to reveal semantically meaningful attributions; in manifold-constrained settings, validate alignment by measuring the tangentiality score.
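
The percentile clipping and threshold overlay mentioned in the last guideline can be sketched as follows. The percentile and threshold values are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def clip_saliency(attribution, percentile=99.0):
    """Take absolute attributions, clip at the given percentile, and rescale to [0, 1]."""
    s = np.abs(np.asarray(attribution, dtype=float))
    cap = np.percentile(s, percentile)
    if cap <= 0:
        return np.zeros_like(s)          # all-zero map: nothing to display
    return np.clip(s, 0.0, cap) / cap

def overlay_mask(saliency, threshold=0.5):
    """Boolean mask of pixels whose clipped saliency reaches the threshold."""
    return saliency >= threshold

# Toy 2x2 attribution map with one outlier, assumed for illustration.
attr = np.array([[0.1, -0.2], [0.3, 100.0]])
saliency = clip_saliency(attr)
mask = overlay_mask(saliency)
```

Clipping limits the influence of a few extreme pixels on the color scale, so the remaining structure in the map stays visible.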

EAP-IG’s generalization capacity enables tailored attribution design, whether for geometric priors, causal analysis, or reduced noise and improved faithfulness, and it can be further specialized through data- or task-adaptive weighting of the IG integral. This flexibility makes EAP-IG foundational to current and emerging explainability methodologies in high-dimensional, structured domains.
