Integrated Gradients is a path-based feature attribution method that computes individual feature contributions by integrating gradients along a straight-line path from a baseline to the input.
It satisfies key axioms like completeness, sensitivity, and implementation invariance, ensuring reliable and interpretable explanations for differentiable models.
Variants such as Integrated Decision Gradients and Guided IG extend its practical application by addressing challenges like saturation and baseline sensitivity across diverse data modalities.
Integrated Gradients (IG) is a path-based feature attribution method for neural networks and other differentiable models, introduced to rigorously decompose the output difference between an input and a user-defined "baseline" into per-feature contributions. IG has become a canonical technique in explainable artificial intelligence (XAI) due to its rigorous axiomatic foundations and practical flexibility. The method integrates gradients of the model output with respect to the input along a straight path from the baseline to the input, yielding attributions that satisfy completeness, sensitivity, and implementation invariance. IG has inspired a broad family of path-integral attribution techniques, several formal uniqueness results, and a range of variants adapted to different data modalities, model pathologies, and robustness requirements.
1. Theoretical Foundation and Core Axioms
Integrated Gradients is formulated for a differentiable function $F: \mathbb{R}^n \to \mathbb{R}$, a target input $x$, and a reference baseline $x'$. The canonical IG attribution for feature $i$ is

$\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F(x' + \alpha(x - x'))}{\partial x_i} \, d\alpha.$

These attributions satisfy:

Completeness: Attributions sum to $F(x) - F(x')$.
Implementation Invariance: If two models compute the same function, their IG attributions match.
Linearity: For $aF + bG$, attributions are $a\,\mathrm{IG}[F] + b\,\mathrm{IG}[G]$.

Sensitivity/Dummy: Features that do not affect the output across the path receive zero attribution.

For analytic or piecewise-analytic functions, Integrated Gradients is characterized as the unique single-path method that satisfies completeness, symmetry, linearity, affine scale invariance, and proportionality among monotone straight-line paths (Lundstrom et al., 2023). This formal grounding distinguishes IG from heuristic or partial approaches such as input × gradient, DeepLIFT, or Layer-wise Relevance Propagation, which may violate key axioms.

2. Methodology, Computation, and Implementation

In practice, $F$ is a deep network for which the path integral above lacks a closed form. IG is computed by discretizing the interval $\alpha \in [0, 1]$ (typically into 20–300 steps), evaluating the gradient at each interpolation point, and averaging:

$\mathrm{IG}_i(x) \approx (x_i - x'_i) \cdot \frac{1}{m} \sum_{k=1}^{m} \frac{\partial F(x' + \frac{k}{m}(x - x'))}{\partial x_i}.$

Computational cost scales linearly with the number of steps, as each step requires a forward and a backward pass. The completeness property can be checked numerically to monitor integration and discretization error.

Empirical usage requires careful baseline selection, as $x'$ must correspond to genuine absence of signal (e.g., a black image or a zero embedding). Inappropriate baselines can introduce bias or spurious attributions, though the method retains its formal guarantees relative to any fixed baseline (Sundararajan et al., 2017).
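The Riemann-sum computation and the completeness check can be sketched as follows. The toy model `F` and its analytic gradient are illustrative assumptions; in practice a deep-learning framework supplies the gradients.

```python
import numpy as np

W = np.array([1.0, -2.0, 0.5])

def F(x):
    # toy differentiable "model" (assumption for illustration): F(x) = tanh(W.x)
    return np.tanh(W @ x)

def grad_F(x):
    # analytic gradient of F; an autodiff framework would supply this in practice
    return (1.0 - np.tanh(W @ x) ** 2) * W

def integrated_gradients(x, baseline, m=300):
    # right-endpoint Riemann sum over the straight-line path from baseline to x
    total = np.zeros_like(x)
    for k in range(1, m + 1):
        total += grad_F(baseline + (k / m) * (x - baseline))
    return (x - baseline) * total / m

x = np.array([0.8, 0.3, -0.5])
baseline = np.zeros(3)
attr = integrated_gradients(x, baseline)
# completeness check: attributions should sum to F(x) - F(baseline)
residual = abs(attr.sum() - (F(x) - F(baseline)))
```

The residual shrinks as the number of steps $m$ grows, which is why completeness serves as a practical convergence monitor.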
3. Limitations and Pathologies: Saturation, Baseline Sensitivity, and Discrete Domains
Despite its axiomatic guarantees, IG faces practical challenges:

Saturation effect: For many deep networks, outputs may saturate (i.e., plateau) along the input path before reaching the target; gradients in these regions are near zero, yet IG weights all path segments equally. This can lead to "noisy" or incomplete attributions dominated by uninformative regions (Miglani et al., 2020, Walker et al., 2023).

Baseline dependence: The choice of $x'$ can drastically affect which features are attributed; standard choices (zero, mean, blurred) are not universally "neutral" (Liu et al., 2023, Simpson et al., 11 Mar 2025).
Discrete or off-manifold interpolation: For data such as word embeddings or graphs, the straight path from $x'$ to $x$ may traverse non-data-manifold, out-of-distribution regions. Gradients computed there do not reflect semantically meaningful changes (Sanyal et al., 2021, Roy et al., 2024, Simpson et al., 9 Sep 2025).

Several approaches address these issues, including alternative path constructions, multiple baselines, and discretization strategies. Notably, adaptive sampling and non-uniform Riemann sums can reduce integration error by concentrating steps in information-rich path segments (Swain et al., 2024, Walker et al., 2023).

4. Variants and Methodological Extensions

A diverse array of IG extensions adapt the method to domain-specific requirements, address noise/saturation, or generalize the attribution framework:

Integrated Decision Gradients (IDG): Weights each pathwise gradient by the derivative of the output logit, focusing on decision regions and eliminating saturated-region contributions. IDG combines an importance factor $\frac{\partial F}{\partial \alpha}$ with adaptive sampling along the path, empirically yielding sharper, more informative attributions (Walker et al., 2023).

Path-Weighted IG (PWIG): Generalizes the IG integral by assigning a user-defined weight $w(\alpha)$ to emphasize or de-emphasize specific path regions. PWIG enables focus on early, late, or intermediate path segments, at the expense of completeness except when $w \equiv 1$ (Kamalov et al., 22 Sep 2025).
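To make the path-weighting idea concrete, here is a schematic PWIG-style sum. The toy sigmoid model and the specific weight functions are illustrative assumptions, not the published algorithm:

```python
import numpy as np

W = np.array([2.0, -1.0])

def F(x):
    # toy sigmoid model (assumption); saturates for large |W.x|
    return 1.0 / (1.0 + np.exp(-(W @ x)))

def grad_F(x):
    s = F(x)
    return s * (1.0 - s) * W

def path_weighted_ig(x, baseline, weight_fn, m=500):
    # PWIG-style attribution: each path gradient is scaled by weight_fn(alpha).
    # With weight_fn == 1 this reduces to the standard IG Riemann sum.
    alphas = np.arange(1, m + 1) / m
    total = np.zeros_like(x)
    for a in alphas:
        total += weight_fn(a) * grad_F(baseline + a * (x - baseline))
    return (x - baseline) * total / m

x = np.array([2.0, -1.0])
baseline = np.zeros(2)
plain = path_weighted_ig(x, baseline, lambda a: 1.0)     # uniform weight: plain IG
late = path_weighted_ig(x, baseline, lambda a: 2.0 * a)  # emphasizes late path segments
```

With the uniform weight, the attributions sum (up to discretization error) to $F(x) - F(x')$; with the non-uniform weight they generally do not, illustrating the completeness trade-off noted above.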
Guided IG (GIG): Employs adaptive paths that avoid high-noise, off-object regions by greedily moving along low-sensitivity features and optionally anchoring steps to the straight-line path, significantly reducing noise artifacts in vision models (Kapishnikov et al., 2021).
Counterfactual and Shapley-inspired Baselines: Multiple or data-driven baselines (e.g., Shapley IG) reduce baseline-dependence and align IG with Shapley value theory, achieving improved faithfulness to true feature contributions (Liu et al., 2023).
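The baseline-ensemble idea can be sketched with a linear toy model, where IG has the closed form $w_i(x_i - x'_i)$; the simple averaging scheme here is a simplification of the Shapley-aligned constructions, not their full algorithm:

```python
import numpy as np

W = np.array([1.0, -2.0, 3.0])
model = lambda v: float(W @ v)  # linear toy model (assumption)

def ig_linear(x, baseline):
    # for a linear model, IG has the closed form W_i * (x_i - x'_i)
    return W * (x - baseline)

def multi_baseline_ig(x, baselines):
    # average single-baseline attributions over a baseline ensemble (sketch,
    # in the spirit of expected-gradients / Shapley-style baseline averaging)
    return np.mean([ig_linear(x, b) for b in baselines], axis=0)

x = np.array([1.0, 0.5, -0.5])
baselines = [np.zeros(3), np.full(3, 0.1), np.array([0.2, -0.1, 0.0])]
attr = multi_baseline_ig(x, baselines)
```

For a linear model, the averaged attributions still satisfy a completeness-like identity: they sum to $F(x)$ minus the mean baseline output, which reduces sensitivity to any single baseline choice.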
Manifold and Geodesic IG: On Riemannian manifolds (images, embeddings), IG along geodesics (e.g. as in GIG (Salek et al., 17 Feb 2025) or Manifold IG (Zaher et al., 2024)) produces attributions that align with intrinsic data geometry, reducing spurious attributions and increasing robustness to adversarial perturbations.
Graph and Discretized IG: On non-Euclidean domains, IG is adapted to sum over meaningful discrete paths (e.g., all shortest paths in a graph (Simpson et al., 9 Sep 2025)) or snap interpolants to actual vocabulary embeddings (e.g., DIG, UDIG (Sanyal et al., 2021, Roy et al., 2024)) to ensure path points are data-valid.
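The discretization idea can be sketched as snapping each interpolation point to its nearest real embedding so that gradients are evaluated only at data-valid points. This nearest-neighbor snapping is an illustrative assumption; the published DIG/UDIG algorithms use more refined, monotonicity-aware anchor selection:

```python
import numpy as np

def snap_path_to_vocab(x, baseline, vocab, m=5):
    # Replace each off-manifold interpolant on the straight-line path with the
    # closest row of the embedding table, so every path point is a real datum.
    path = []
    for k in range(1, m):
        point = baseline + (k / m) * (x - baseline)
        nearest = vocab[np.argmin(np.linalg.norm(vocab - point, axis=1))]
        path.append(nearest)  # gradients would be evaluated at these points
    return np.array(path)

# tiny hypothetical 2-D "vocabulary" of embeddings
vocab = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [1.0, 0.0]])
path = snap_path_to_vocab(np.array([1.0, 1.0]), np.zeros(2), vocab)
```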
A summary of selected variants and domain adaptations is given below:

| Variant | Key modification | Primary issue addressed |
|---|---|---|
| IDG | Output-derivative weighting with adaptive path sampling | Saturation |
| PWIG | User-defined path weight $w(\alpha)$ | Path-region emphasis (at cost of completeness) |
| Guided IG | Adaptive, low-sensitivity paths | Gradient noise |
| Shapley IG | Multiple / data-driven baselines | Baseline dependence |
| Geodesic / Manifold IG | Integration along data-manifold geodesics | Off-manifold interpolation |
| DIG, UDIG, graph IG | Discrete, data-valid path points | Discrete domains |
5. Empirical Evaluation and Benchmarks

Standard IG and its extensions have been evaluated across vision, NLP, medical imaging, and GNNs, using datasets such as ImageNet, SST-2, OASIS-1, and ShapeGGen. Evaluation metrics include:
Insertion/Deletion score (RISE AUC): Measures how attributions correspond to true impact on model output.
Softmax and Accuracy Information Curve AUCs: Evaluate attributions via partial image/text reveals.
Sharpness, stability, human alignment: Both qualitative and quantitative (variance under input noise, agreement with human annotators).
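An insertion-style evaluation can be sketched as follows, using a toy linear model (an assumption for illustration) for which IG has a closed form. Features are revealed in decreasing-attribution order; faithful attributions drive the output up early, yielding a larger area under the insertion curve:

```python
import numpy as np

def insertion_auc(model, x, baseline, attributions):
    # Insertion curve in the spirit of RISE-style evaluation (sketch): reveal
    # features from the baseline in decreasing-attribution order and record
    # the model output after each reveal.
    order = np.argsort(-attributions)
    current = baseline.astype(float).copy()
    outputs = [model(current)]
    for i in order:
        current[i] = x[i]
        outputs.append(model(current))
    dx = 1.0 / len(order)
    # trapezoidal area under the insertion curve
    return sum((outputs[i] + outputs[i + 1]) / 2.0
               for i in range(len(outputs) - 1)) * dx

W = np.array([0.5, 1.5, -1.0, 2.0])
model = lambda v: float(W @ v)      # toy linear model (assumption)
x, baseline = np.ones(4), np.zeros(4)
ig_attrs = W * (x - baseline)       # closed-form IG for a linear model
auc_good = insertion_auc(model, x, baseline, ig_attrs)
auc_bad = insertion_auc(model, x, baseline, -ig_attrs)  # deliberately reversed order
```

Here the faithful ordering produces a strictly larger AUC than the reversed one, which is the property the metric is designed to detect.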
IDG demonstrated consistently higher RISE and SIC/AIC insertion AUCs, sharper heatmaps, and lower spurious activations compared to IG, Left-IG, GIG, and Adversarial GI, with up to 5–15% improvement (Walker et al., 2023). Variants such as GIG, DIG, and Manifold IG have outperformed vanilla IG by similar margins across relevant domain-specific metrics (Kapishnikov et al., 2021, Zaher et al., 2024, Roy et al., 2024).
6. Interpretability, Robustness, and Limitations
Integrated Gradients and its variants provide interpretable decompositions rooted in clear methodology, but their faithfulness depends on path, baseline, and domain alignment. While the original IG method is vulnerable to saturation, path-length cancellation, baseline ambiguity, and non-manifold interpolation, state-of-the-art approaches (IDG, GIG, PWIG, Manifold IG) offer targeted mitigation strategies. However, these often increase computational cost or weaken completeness, and introduce hyperparameter or model structure dependencies. The approximation of the path integral via Riemann sums remains a source of bias unless sample placement is adaptively optimized (Swain et al., 2024).
A plausible implication is that "hybrid" schemes—combining model-based path adaptations, manifold-aware integration, and Shapley-style baseline ensembles—will be essential for trustworthy attributions in high-dimensional, multimodal, or safety-critical settings.
Open research directions include:

Algorithmic complexity reduction through optimized Riemann sum scheduling, batchwise integration, and instance-conditional path choice (Swain et al., 2024).
Extending completeness and axiomatic properties to non-uniform and non-monotonic path methods.
Cross-domain and structured-data adaptations for GNNs, time series, and multimodal data (Simpson et al., 9 Sep 2025).
Benchmarking via causal and human-centered metrics, to calibrate attribution faithfulness in the absence of ground truth.
Integrated Gradients remains the canonical reference point for axiomatic, path-based attributions in XAI, with its methodological descendants advancing both theoretical and practical rigor across application areas (Sundararajan et al., 2017, Lundstrom et al., 2023, Walker et al., 2023).