
System Information Decomposition

Updated 1 March 2026
  • System Information Decomposition (SID) is a framework that decomposes the joint entropy of multivariate systems into non-overlapping atoms, namely redundant, unique, and synergistic information.
  • SID is built on axioms (redundancy symmetry, self-redundancy, monotonicity, and symmetric bookkeeping) that make the atoms consistent and invariant under variable permutation, enabling the detection of irreducible higher-order interactions.
  • Practical applications of SID include neuroscience and complex networks, leveraging methods like Neural Information Squeezer to estimate and analyze complex informational relationships.

System Information Decomposition (SID) is a theoretical framework for the analysis and decomposition of information in multivariate systems, aimed at capturing not only directional and target-based information interactions—as in classic Partial Information Decomposition (PID)—but the full multi-way, symmetric interrelations among all variables in a system. SID systematically separates information into redundant, unique, and synergistic "atoms," each corresponding to a distinct mode of informational interdependence. Unlike PID, which operates by decomposing mutual information from multiple sources to a single target, SID decomposes the joint entropy of the system itself, thereby respecting the symmetry between all variables and enabling detection of irreducible higher-order structure.

1. Formal Framework and Mathematical Foundations

SID takes as input a system of $n$ discrete random variables $X_1,\ldots,X_n$ and seeks a decomposition of the total joint entropy $H(X_1,\ldots,X_n)$ into a sum of well-defined, non-overlapping atoms reflecting external, unique, redundant, and synergistic information. The framework is constructed via several axioms:

  • Redundancy Symmetry: The redundant information shared by any variable $X_i$ with a subset $\mathcal S\subset\{X_1,\dots,X_n\}\setminus\{X_i\}$ is invariant under permutation of $\mathcal S$.
  • Self-redundancy: For a singleton set, redundancy reduces to mutual information:

$\mathrm{Red}(X_i:\{X_j\}) = I(X_i:X_j)$

  • Monotonicity: Redundancy can only decrease as more variables are considered:

$\mathrm{Red}(X_i:\mathcal S\cup\{X_k\}) \le \mathrm{Red}(X_i:\mathcal S)$

  • Symmetric Bookkeeping: Unique and synergistic information are defined in terms of redundancies and mutual/conditional entropies, e.g., for three variables,

$\mathrm{Un}(X_i:X_j) = I(X_i:X_j) - \mathrm{Red}(X_i:\{X_j,X_k\})$

$\mathrm{Syn}(X_1,X_2,X_3) = H(X_1|X_2) - H(X_1|X_2,X_3) - \mathrm{Un}(X_2:X_1)$

These axioms guarantee that the atoms are universal properties of the joint distribution, not artifacts of variable ordering or target designation (Lyu et al., 2023).
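
The self-redundancy axiom pins the base case to an ordinary mutual-information computation. A minimal Python sketch (the function name and the example distribution, a noisy copy with 10% flip probability, are illustrative, not from the source):

```python
import math

def mutual_information(joint):
    """I(X:Y) in bits, computed from a joint pmf given as {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Noisy copy: Y equals X except for a 10% flip.
# By self-redundancy, Red(X:{Y}) = I(X:Y).
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
red = mutual_information(joint)  # about 0.531 bits
```

For independent variables the same computation returns zero, consistent with monotonicity bounding every larger redundancy by this base case.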

2. Decomposition Structure and Atom Types

For three variables, SID arranges the decomposition as

$H(X_1,X_2,X_3) = \sum_{i=1}^3 \mathrm{Ext}(X_i) + \sum_{i<j} \mathrm{Un}(X_i:X_j) + 2\,\mathrm{Syn}(X_1,X_2,X_3) + \mathrm{Red}(X_1,X_2,X_3)$

with:

  • External information: $\mathrm{Ext}(X_i) = H(X_i \mid X_{-i})$, the portion of $X_i$ not explained by the rest of the system.
  • Unique information: $\mathrm{Un}(X_i:X_j)$ quantifies bits pertaining solely to the pair $(X_i,X_j)$.
  • Synergy: $\mathrm{Syn}(X_1,X_2,X_3)$ quantifies the irreducible higher-order bits available only when $X_1,X_2,X_3$ are observed jointly.
  • Redundancy: $\mathrm{Red}(X_1,X_2,X_3) = \sup_Q\{\, I(Q:X_1,X_2,X_3) : H(Q|X_i)=0 \ \forall i \,\}$.
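
For small discrete systems the sup over $Q$ can be evaluated exactly: the constraint $H(Q|X_i)=0$ for every $i$ forces $Q$ to be a deterministic function of each variable separately, so the optimizer labels the connected components of the support graph in which two joint outcomes are linked whenever they agree in some coordinate (a Gács–Körner-style construction). The sketch below is illustrative, not from the source:

```python
import itertools
import math

def redundancy(joint):
    """Red for a discrete joint pmf {outcome tuple: prob}: any admissible Q
    must take equal values on support points agreeing in some coordinate,
    so the optimal Q labels components of that graph and Red = H(Q)."""
    points = list(joint)
    parent = {p: p for p in points}

    def find(p):  # union-find with path compression
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in itertools.combinations(points, 2):
        if any(a == b for a, b in zip(p, q)):
            parent[find(p)] = find(q)

    mass = {}
    for p, m in joint.items():
        mass[find(p)] = mass.get(find(p), 0.0) + m
    return -sum(m * math.log2(m) for m in mass.values() if m > 0)

# XOR system: the support is fully connected, so only constant Q qualifies.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
# Perfectly correlated copies: two components, so Red = 1 bit.
copy = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
```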

For general $n$, there are $O(2^n)$ atoms, classified by order (singletons for external, pairs for unique, $k$-tuples for redundancy and synergy) (Lyu et al., 2023).
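
The three-variable bookkeeping can be checked numerically. The sketch below (helper names illustrative) evaluates the identity for two independent fair bits and their XOR, where all pairwise mutual informations vanish and the entire 2 bits of joint entropy are carried by synergy:

```python
import itertools
import math

def H(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: p}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, idxs):
    """Marginal pmf over the coordinates listed in idxs."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

# X1, X2 independent fair bits; X3 = X1 XOR X2.
joint = {(a, b, a ^ b): 0.25 for a, b in itertools.product((0, 1), repeat=2)}

H123 = H(joint)  # joint entropy: 2 bits
# Ext(X_i) = H(X_i | X_{-i}) = H(all) - H(others); 0 here, since any
# two variables determine the third.
ext = [H123 - H(marginal(joint, [j for j in range(3) if j != i]))
       for i in range(3)]
# All pairwise mutual informations vanish, so by monotonicity every
# Red and Un atom is 0 as well.
mi = {(i, j): H(marginal(joint, [i])) + H(marginal(joint, [j]))
            - H(marginal(joint, [i, j]))
      for i, j in itertools.combinations(range(3), 2)}
# The bookkeeping H = sum Ext + sum Un + 2*Syn + Red then forces Syn = 1 bit.
syn = (H123 - sum(ext) - sum(mi.values())) / 2
```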

3. Operational and Measure-Theoretic Definitions

SID can be realized in both discrete and continuous settings. In the latter, the local redundant information is given by

$i^{\mathrm{Red}}_{t:\alpha} = \log \dfrac{d\nu^T_{\alpha,s}}{dP^T}(t)$

where $\nu^T_{\alpha,s}$ is the regular conditional probability of $T$ given the occurrence of at least one realization among the source subsets specified by antichain $\alpha$, and $P^T$ is the marginal probability of $T$. Global redundancy follows by taking the expectation over all outcomes.
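
In the discrete case with a single source and a singleton antichain, the Radon–Nikodym ratio reduces to the familiar local mutual information $\log p(t|s)/p(t)$, and the expectation recovers $I(S:T)$. A sketch of this special case (the distribution and names are illustrative):

```python
import math

# Illustrative joint pmf of (source s, target t): a noisy copy channel.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
ps, pt = {}, {}
for (s, t), p in joint.items():
    ps[s] = ps.get(s, 0.0) + p
    pt[t] = pt.get(t, 0.0) + p

def local_red(s, t):
    """Discrete analogue of log(d nu / dP^T) for a singleton antichain:
    the local mutual information log2 p(t|s)/p(t)."""
    return math.log2(joint[(s, t)] / ps[s] / pt[t])

# Global redundancy = expectation of the local values = I(S:T).
global_red = sum(p * local_red(s, t) for (s, t), p in joint.items())
```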

PID atoms are then obtained via Möbius inversion over the redundancy lattice:

$\Pi(\alpha) = \sum_{\beta \preceq \alpha} \mu(\beta,\alpha)\, I^{\mathrm{Red}}(T:\beta)$

with $\mu$ the Möbius function of the lattice. This formulation applies seamlessly to arbitrary combinations of discrete and continuous variables (Schick-Poland et al., 2021).
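
On the two-source redundancy lattice of Williams and Beer, the Möbius inversion can be carried out by the equivalent bottom-up recursion $\Pi(\alpha) = I^{\mathrm{Red}}(\alpha) - \sum_{\beta \prec \alpha} \Pi(\beta)$. A sketch with illustrative cumulative redundancy values (chosen here to mimic an XOR target, not taken from the source):

```python
# Redundancy lattice for two sources (Williams-Beer):
#   {1}{2}  <  {1}, {2}  <  {12}
below = {  # strict down-set of each antichain node
    "{1}{2}": [],
    "{1}": ["{1}{2}"],
    "{2}": ["{1}{2}"],
    "{12}": ["{1}{2}", "{1}", "{2}"],
}

def atoms(i_red):
    """Bottom-up recursion equivalent to Moebius inversion on this lattice:
    Pi(alpha) = I_red(alpha) minus the atoms strictly below alpha."""
    pi = {}
    for node in ("{1}{2}", "{1}", "{2}", "{12}"):
        pi[node] = i_red[node] - sum(pi[b] for b in below[node])
    return pi

# Illustrative cumulative redundancies mimicking an XOR target:
# each source alone carries 0 bits, the pair jointly carries 1 bit.
pi = atoms({"{1}{2}": 0.0, "{1}": 0.0, "{2}": 0.0, "{12}": 1.0})
# All information lands in the top (synergy) atom.
```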

4. Symmetry, Consistency, and Comparisons with PID

The central structural advance of SID over PID is symmetry: atoms do not depend on a privileged target, and all variables are treated on an equal footing. In contrast, PID decomposes $I(T;S_1,\ldots,S_n)$ with respect to a designated target $T$, yielding non-symmetric, target-specific atoms.

SID axioms guarantee that, e.g., the $n$-way synergy is fully invariant under permutation. This enables the detection of higher-order structure invisible to both mutual information and traditional PID, e.g., in systems composed of independent Boolean “micro-bits” and their XOR (“macro”) combinations, where only SID reveals nontrivial synergy (Lyu et al., 2023).

5. Subsystem Inconsistency, Limitations, and Generalization Challenges

A major issue with PID (and with lattice-based decompositions generally) concerns the set-theoretic "whole equals sum of parts" (WESP) principle. In systems with genuine synergy, PID violates WESP: the sum of atoms can exceed the total mutual information due to “double-counting” synergy across subsystems (Lyu et al., 16 Oct 2025). SID resolves this for three-variable systems by adjusting the summation rules (explicitly subtracting a single synergy atom) so that the decomposition aligns with the actual joint entropy:

$H(S_1,S_2,S_3) = \Sigma - \Psi(\{\{ij\},\{k\}\})$,

where $\Sigma$ is the sum of the atoms and $\Psi(\{\{ij\},\{k\}\})$ is the triple synergy-redundancy atom. For $n\geq 4$, no universal adjustment suffices; one must instead move beyond antichain-lattice combinatorics, suggesting the need for alternative structures (e.g., hypergraphs or simplicial complexes) to fully describe the hierarchy of higher-order information relationships (Lyu et al., 16 Oct 2025).

6. Algorithms, Estimation, and Practical Applications

In practice, computation of atoms is intractable for large $n$ due to combinatorial explosion. SID suggests several approaches:

  • Direct inference in cases with vanishing or trivial pairwise relationships, enabling resolution of atom values from observable quantities.
  • Neural Information Squeezer (NIS): invertible neural-network bottlenecks estimate the required entropic quantities, from which redundancy and synergy are reconstructed by subtraction (Lyu et al., 2023).
  • Block-structure analysis: For structured distributions (e.g., Boolean XOR ensembles), block-wise properties yield explicit atom values.
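
NIS itself is a trained neural architecture; as a much simpler baseline for the same estimate-then-subtract strategy, a plug-in entropy estimator over samples already recovers external information for structured data. A sketch (the sample generator and names are illustrative, not from the source):

```python
import math
import random
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate in bits."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Illustrative data: samples from the Boolean XOR ensemble.
rng = random.Random(0)
data = [(a, b, a ^ b)
        for a, b in ((rng.randint(0, 1), rng.randint(0, 1))
                     for _ in range(20000))]

H123 = plugin_entropy(data)                        # close to 2 bits
H12 = plugin_entropy([(a, b) for a, b, _ in data])
ext3 = H123 - H12  # Ext(X3) = H(X3 | X1, X2); ~0, X3 is determined
```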

Applied domains include:

  • Neuroscience: Estimation of irreducible synergy among neural populations or brain regions that elude traditional pairwise network analysis.
  • Causality: Construction of permutation-invariant causal markers that go beyond pairwise (second-order) tests.
  • Complex networks: Identification and quantification of higher-order interactions within social or biological networks (Lyu et al., 2023).

SID-type approaches have further been generalized for dynamic and system-environment decompositions using effective information and transfer entropy frameworks (Yang et al., 28 Jan 2025, Mediano et al., 2019, Varley, 2022).

7. Perspectives and Implications

SID elevates the analysis of information from source-target asymmetry to a fully symmetric, system-wide decomposition, revealing structures invisible to classical mutual information and resolving paradoxes inherent in traditional lattice-based approaches. Theoretical limitations, which become manifest for $n\geq 4$, motivate future research towards richer combinatorial foundations (beyond antichains), scalable estimation techniques, and the development of new information-theoretic invariants (Lyu et al., 16 Oct 2025, Gutknecht et al., 22 Apr 2025).

A plausible implication is that progress on SID and its generalizations will require new algebraic or topological tools capable of encoding the recursive nesting of synergy within synergy, providing a more complete understanding of high-order information processing in large-scale complex systems.
