Locally Additive Explanation
- Locally additive explanation is defined as the property whereby locally optimal behavior in a system decomposes additively across its constituent parts.
- It arises in quantum information theory, statistical varying-coefficient models, and interpretable machine learning, where local optima or local model behavior combine additively across subsystems or features.
- Rigorous proofs rely on perturbation analysis and operator inequalities, yielding practical insight into entropy minimization and modular model interpretation.
A locally additive explanation refers to a property or mechanism, in various scientific domains (including quantum information theory, statistics, and machine learning), whereby the behavior or output of a model, channel, or function in a local neighborhood of an optimal point decomposes additively across constituent parts or features. This concept is explicitly characterized for the minimum entropy output of quantum channels in both the von Neumann and Rényi entropy settings, for locally stationary varying-coefficient models in statistics, and for global and local explanations in modern interpretable machine learning frameworks.
1. Quantum Channel Entropy: Local Additivity in Von Neumann and Rényi Outputs
In quantum information theory, the minimum entropy output of a quantum channel $\Phi$ is defined as

$$S_{\min}(\Phi) = \min_{\rho} S\big(\Phi(\rho)\big),$$

where $S(\rho) = -\operatorname{Tr}[\rho \log \rho]$ is the von Neumann entropy and $\rho$ is a density matrix. For Rényi entropy (parameter $p$), the analog is

$$S_{\min}^{(p)}(\Phi) = \min_{\rho} S_{p}\big(\Phi(\rho)\big),$$

with

$$S_{p}(\rho) = \frac{1}{1-p} \log \operatorname{Tr}\big[\rho^{p}\big].$$
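As a concrete illustration (a minimal sketch, not taken from the cited papers), the following snippet computes von Neumann and Rényi output entropies for the qubit depolarizing channel and estimates the minimum output entropy by a crude random search over pure inputs; the channel choice and all function names are illustrative assumptions.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log rho], from the eigenvalues of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log(lam)).sum())

def renyi_entropy(rho, p):
    """S_p(rho) = log(Tr[rho^p]) / (1 - p), for p != 1."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(np.log((lam ** p).sum()) / (1.0 - p))

def depolarizing(rho, q):
    """Qubit depolarizing channel (1-q)*rho + q*I/2, used here as a toy example."""
    return (1 - q) * rho + q * np.eye(2) / 2

def random_pure_state(dim, rng):
    v = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

rng = np.random.default_rng(0)
q = 0.3
# Crude estimate of S_min by random search over pure inputs.
outs = [depolarizing(random_pure_state(2, rng), q) for _ in range(2000)]
print("estimated S_min:   ", min(von_neumann_entropy(r) for r in outs))
print("estimated S_min^2: ", min(renyi_entropy(r, 2.0) for r in outs))
# For this channel every pure input yields output spectrum {1 - q/2, q/2},
# so the random-search estimates match the exact minima.
```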
A quantum channel (CPTP map) can be equivalently described via a subspace $\mathcal{V} \subseteq \mathcal{H}_{A} \otimes \mathcal{H}_{B}$ (a unitary embedding followed by a partial trace), with the entropy minimized over pure state vectors $|\psi\rangle \in \mathcal{V}$. The local additivity property states:
- If $|\psi_1\rangle \in \mathcal{V}_1$ and $|\psi_2\rangle \in \mathcal{V}_2$ are local minima of the entropy function (and at least one is non-degenerate), then the product state $|\psi_1\rangle \otimes |\psi_2\rangle$ is also a local minimum in the product subspace $\mathcal{V}_1 \otimes \mathcal{V}_2$, and its entropy equals the sum of the individual entropies, up to second-order perturbation analysis (Gour et al., 2011, Gour et al., 2016).
For the Rényi case, the result holds rigorously for $p > 1$ by exploiting the multiplicativity of the $\ell_p$-norm function $\|x\|_p = \big(\sum_i |x_i|^p\big)^{1/p}$ under tensor products, and showing that if local optima exist for each subspace, their tensor product remains a local optimum for the joint system.
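The additive baseline that this theorem protects can be checked directly: entropies are exactly additive on tensor-product states, since the spectrum of $\rho_1 \otimes \rho_2$ consists of pairwise products of eigenvalues. A minimal NumPy sketch (illustrative, not the proof technique of the cited papers):

```python
import numpy as np

def von_neumann(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log(lam)).sum())

def renyi(rho, p):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(np.log((lam ** p).sum()) / (1.0 - p))

def random_density(dim, rng):
    a = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
rho1, rho2 = random_density(2, rng), random_density(3, rng)
prod = np.kron(rho1, rho2)  # spectrum of the product = pairwise eigenvalue products

# Both entropies are exactly additive on tensor products:
print(von_neumann(prod), von_neumann(rho1) + von_neumann(rho2))
print(renyi(prod, 2.0), renyi(rho1, 2.0) + renyi(rho2, 2.0))
```

The content of local additivity is stronger than this identity: it asserts that no perturbation of $|\psi_1\rangle \otimes |\psi_2\rangle$ inside $\mathcal{V}_1 \otimes \mathcal{V}_2$, including entangled ones, decreases the entropy to second order.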
2. Distinction from Global Additivity and Non-Additivity
Global additivity conjectures were historically presumed for quantum channel capacities and entropy outputs. However, Hastings' counterexample and work by Hayden and Winter demonstrated global non-additivity: there exist quantum channels $\Phi_1, \Phi_2$ such that

$$S_{\min}(\Phi_1 \otimes \Phi_2) < S_{\min}(\Phi_1) + S_{\min}(\Phi_2).$$

Analogous subadditivity occurs in the Rényi case for all $p > 1$.
Locally, the second-order behavior of the entropy function around any local minimum remains strictly additive; the tensor product of local minima does not introduce directions in state space where the entropy can "drop below" the sum of individual minima. Thus, non-additivity is a global phenomenon, emerging from complex global correlations in the tensor product space rather than any breakdown of additivity near optimal points.
3. Mathematical Formalism and Proof Techniques
Local additivity is formally established via analysis of first and second directional derivatives of the entropy functional. A pure state in $\mathcal{H}_A \otimes \mathcal{H}_B$ corresponds to a matrix $x$ (with reduced density matrix $\rho = xx^{*}$), and the entropy functional is

$$E(x) = S(xx^{*}) = -\operatorname{Tr}\!\big[xx^{*}\log(xx^{*})\big].$$

Consider a perturbation $x(t) = x + ty$, with $\operatorname{Re}\operatorname{Tr}[x^{*}y] = 0$ so that normalization is preserved to first order. The second directional derivative, crucial for establishing local minimum properties, is

$$\frac{d^{2}}{dt^{2}}\,E(x+ty)\Big|_{t=0} = \operatorname{Tr}\!\big[f'(\rho)\,\ddot{\rho}\big] + \sum_{i,j} f'[\lambda_i,\lambda_j]\,\big|\langle u_i|\dot{\rho}|u_j\rangle\big|^{2},$$

with $\lambda_i$ the eigenvalues of $\rho$ (eigenvectors $|u_i\rangle$), $f(\lambda) = -\lambda\log\lambda$, $\dot{\rho} = xy^{*} + yx^{*}$, and $\ddot{\rho} = 2yy^{*}$. More generally, divided difference notation allows such derivatives to be written compactly via

$$g[\lambda_i,\lambda_j] = \frac{g(\lambda_i) - g(\lambda_j)}{\lambda_i - \lambda_j}, \qquad g[\lambda,\lambda] = g'(\lambda),$$

where $g[\lambda_i,\lambda_j]$ denotes the first divided difference of a function $g$, here applied to $g = f'$.
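The spectral formula above can be sanity-checked numerically against finite differences. The sketch below is an illustration under stated assumptions, not code from the cited papers; it implements the Daleckii–Krein-type divided-difference expression for the second directional derivative of $E(x) = -\operatorname{Tr}[xx^{*}\log xx^{*}]$:

```python
import numpy as np

def entropy_of_x(x):
    """E(x) = -Tr[x x* log(x x*)] for a (not necessarily normalized) matrix x."""
    lam = np.linalg.eigvalsh(x @ x.conj().T)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log(lam)).sum())

def second_directional_derivative(x, y):
    """Tr[f'(rho) * 2yy*] + sum_ij f'[l_i, l_j] |sigma_ij|^2, with
    rho = x x*, sigma = x y* + y x*, f(l) = -l log l, f'(l) = -log l - 1."""
    rho = x @ x.conj().T
    lam, U = np.linalg.eigh(rho)
    sigma = x @ y.conj().T + y @ x.conj().T      # d(rho)/dt at t = 0
    S = U.conj().T @ sigma @ U                   # sigma in rho's eigenbasis
    fp = -np.log(lam) - 1.0                      # f'(lambda_i)
    n = len(lam)
    dd = np.empty((n, n))                        # divided differences of f'
    for i in range(n):
        for j in range(n):
            if abs(lam[i] - lam[j]) < 1e-10:
                dd[i, j] = -1.0 / lam[i]         # f''(lambda) = -1/lambda
            else:
                dd[i, j] = (fp[i] - fp[j]) / (lam[i] - lam[j])
    term1 = 2.0 * float(np.real((fp * np.diag(U.conj().T @ (y @ y.conj().T) @ U)).sum()))
    term2 = float((dd * np.abs(S) ** 2).sum())
    return term1 + term2

rng = np.random.default_rng(2)
x = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
y = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
h = 1e-4
fd = (entropy_of_x(x + h * y) - 2 * entropy_of_x(x) + entropy_of_x(x - h * y)) / h**2
print(second_directional_derivative(x, y), fd)   # the two values should agree closely
```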
For Rényi entropy, convexity and operator inequalities for $p$-norms (which hold for $p \geq 1$) are exploited for $p > 1$, and the argument proceeds via analysis of Taylor expansions and perturbative stability, with precise attention to cross-term cancellation and spectral properties.
4. Statistical Models: Locally Additive Structure in Varying-Coefficient Additive Models
Locally additive explanations also manifest in statistical modeling, notably in locally stationary varying-coefficient additive models. In this formulation, the regression function is decomposed as

$$Y_t = \beta_0(t) + \sum_{j=1}^{d} \beta_j(t)\, f_j(X_{t,j}) + \varepsilon_t.$$

At each time segment $t$, the function is locally additive: the sum of an intercept $\beta_0(t)$ and the contributions $f_j(X_{t,j})$ from each predictor, scaled by time-varying coefficients $\beta_j(t)$. A sequential estimation procedure using splines (a three-step method) enables efficient identification of purely additive versus varying-coefficient components (Hu et al., 2016).
This framework provides interpretable decomposition of complex, nonstationary regression surfaces into locally additive parts, enabling consistent estimation and meaningful model selection in high-dimensional, dynamic environments.
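To make the locally additive structure concrete, here is a minimal simulation sketch. It is not the three-step spline procedure of (Hu et al., 2016): purely for illustration, it assumes the component functions $f_j$ are known and recovers the time-varying coefficients by ordinary least squares in a rolling time window; all names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
t = np.sort(rng.uniform(0, 1, n))                 # rescaled time in [0, 1]
x1, x2 = rng.uniform(-1, 1, (2, n))

beta0 = lambda s: 0.5 * s                         # time-varying intercept
beta1 = lambda s: 1.0 + np.sin(2 * np.pi * s)     # varying coefficient
f1, f2 = lambda x: x**2, lambda x: np.sin(np.pi * x)

# beta2 is held constant at 2.0: a "purely additive" component.
y = beta0(t) + beta1(t) * f1(x1) + 2.0 * f2(x2) + 0.1 * rng.standard_normal(n)

# Local least squares in a rolling time window: regress y on [1, f1(x1), f2(x2)]
# within each window to recover the coefficient functions at the window center.
for t0 in [0.25, 0.5, 0.75]:
    w = np.abs(t - t0) < 0.05                     # local neighborhood in time
    design = np.column_stack([np.ones(w.sum()), f1(x1[w]), f2(x2[w])])
    coef, *_ = np.linalg.lstsq(design, y[w], rcond=None)
    print(t0, np.round(coef, 3), "true:", [beta0(t0), float(beta1(t0)), 2.0])
```

The constant recovered coefficient on $f_2$ versus the time-varying coefficient on $f_1$ mirrors the model-selection question the three-step procedure addresses: which components are purely additive and which are varying-coefficient.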
5. Locally Additive Explanations in Machine Learning Model Interpretation
In interpretability for black-box models, locally additive explanations are formalized by additive decomposition of prediction functions:

$$f(x) \approx g_0 + \sum_{j=1}^{d} g_j(x_j).$$

Such decompositions are constructed via partial dependence plots, Shapley additive explanations (both local and global variants), distillation into additive surrogate models, and gradient-based first-order approximations (Tan et al., 2018, Bordt et al., 2022, Wei et al., 2023).
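As a concrete example of one such construction, the sketch below computes a partial dependence curve by brute force for a toy black box; the model and data here are illustrative assumptions. With independent, centered features, the interaction term averages out and the curve recovers the main effect:

```python
import numpy as np

# A toy black box with a main effect on feature 0 and a 0-1 interaction.
def black_box(X):
    return np.sin(X[:, 0]) + X[:, 0] * X[:, 1]

rng = np.random.default_rng(4)
X = rng.standard_normal((5000, 2))

def partial_dependence(f, X, feature, grid):
    """PD_j(v): average of f with feature j clamped to v, the other
    features marginalized over the data distribution."""
    pd = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd.append(f(Xv).mean())
    return np.array(pd)

grid = np.linspace(-2, 2, 9)
pd0 = partial_dependence(black_box, X, 0, grid)
# Since E[X_1] ~ 0, the interaction x0 * x1 averages out and the PD curve
# for feature 0 recovers (up to sampling noise) the main effect sin(v):
print(np.round(pd0 - np.sin(grid), 3))
```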
Distinct explanation methods handle non-additive interactions differently: partial dependence (PD) and marginal plots recover main or total effects, Shapley methods apportion interaction terms among participating features, and distillation allocates interaction effects by best-fit error minimization.
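To see how Shapley methods apportion an interaction among its participating features, the following exact enumeration (exponential in the number of features, so purely illustrative; practical SHAP implementations approximate it) uses the interventional value function $v(S) = \mathbb{E}\big[f(x_S, X_{\bar S})\big]$, with the expectation taken over a background sample:

```python
import itertools
import math
import numpy as np

def shapley_values(f, x, background, n_features):
    """Exact Shapley values with v(S) = E[f(x_S, X_notS)], the expectation
    estimated over a background sample."""
    phi = np.zeros(n_features)

    def v(S):
        Z = background.copy()
        Z[:, list(S)] = x[list(S)]        # clamp features in S to the explained point
        return f(Z).mean()

    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for r in range(len(others) + 1):
            for S in itertools.combinations(others, r):
                w = (math.factorial(len(S)) * math.factorial(n_features - len(S) - 1)
                     / math.factorial(n_features))
                phi[i] += w * (v(S + (i,)) - v(S))
    return phi

# Toy model with a pure interaction between features 0 and 1.
f = lambda Z: Z[:, 0] * Z[:, 1] + Z[:, 2]
rng = np.random.default_rng(5)
background = rng.standard_normal((10000, 3))
x = np.array([1.0, 2.0, 3.0])

phi = shapley_values(f, x, background, 3)
# The x0*x1 interaction (value 2 at this point) is split evenly between
# features 0 and 1, and phi sums to f(x) minus the average prediction.
print(np.round(phi, 3), phi.sum(), f(x[None, :])[0] - f(background).mean())
```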
Locally additive model interpretations provide accessible insight into the contributions of individual features, though there is a trade-off between faithfulness to the true, possibly non-additive black box and interpretability for users.
6. Implications and Applications
Local additivity clarifies that subadditivity in global optimization reflects an entirely emergent, non-perturbative phenomenon: no local improvement over optimal states is possible via entanglement or interaction. This distinction underpins differential methods used for capacity analysis and provides assurance that local perturbation-based analysis remains valid in practice, despite global non-additivity.
In machine learning, locally additive explanations (as per Shapley values and related techniques) support trusted, transparent decision-making, especially in critical domains (e.g., cybersecurity, medicine) where accountability is paramount. Algorithms and software packages now exist to facilitate such explanations, with ongoing research in improving regularization, stability, and scalability.
7. Connections and Limitations
Local additivity holds for complex-valued quantum channels but not for real channels, where counterexamples exist, demonstrating the crucial role of quantum mechanical structure. In statistical modeling, local additivity enables parsimonious, interpretable representations even in the presence of nonstationarity. For Rényi entropy, local additivity is restricted to $p \geq 1$ due to breakdowns in key convexity properties and operator inequalities for $p < 1$; this marks the boundary of applicability and suggests future research directions in extending and understanding the limitations of local additivity in non-convex settings.
In summary, locally additive explanation is a robust technical property documented across quantum information theory, statistical modeling, and interpretable machine learning, enabling modular decomposition of complex systems or models into manageable, interpretable parts in neighborhoods of optimality—while retaining awareness of sharp transitions and deficiencies at the global scale.