Geometric PID: Bivariate Info Decomposition
- Geometric PID is an information-theoretic framework that decomposes shared, unique, and synergistic contributions of two sources to a target using KL divergence and convex geometry.
- It employs projections onto convex hulls in probability simplices, offering a clear geometric interpretation with a rigorous axiomatic foundation.
- While ensuring nonnegativity and interpretability, its restriction to bivariate systems and computational overhead highlight challenges for higher-dimensional generalizations.
Geometric Partial Information Decomposition (PID) is an information-theoretic framework designed to disentangle the contributions of multiple information sources to a target variable in terms of redundancy, unique information, and synergy. The Geometric PID formalism offers an operational and mathematically principled construction of redundancy for bivariate systems, rooted in the geometry of probability distributions and Kullback–Leibler (KL) projections. It is notable for its rigorous axiomatic foundation, clear geometric interpretation, and explicit computability, though it is inherently restricted to systems with exactly two sources (Liardi et al., 3 Mar 2026).
1. Formal Definition of Geometric PID
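Consider two discrete source variables $X_1$, $X_2$ and a target $T$ with joint distribution $p(x_1, x_2, t)$. For each $x_1$ in the support of $X_1$, the conditional distribution $p(T \mid x_1)$ is viewed as a point in the probability simplex $\Delta_T$ over the target alphabet. Similarly, for $x_2$ in the support of $X_2$, $p(T \mid x_2)$ is defined.
Define for $X_2$ the convex hull $C(X_2) = \mathrm{conv}\{p(T \mid x_2) : x_2 \in \mathrm{supp}(X_2)\}$ and analogously $C(X_1)$ for $X_1$. The information projection (I-projection) of $p(T \mid x_1)$ onto $C(X_2)$ is
$$p_{(x_1 \searrow X_2)} = \arg\min_{r \in C(X_2)} D_{\mathrm{KL}}\big(p(T \mid x_1) \,\|\, r\big),$$
yielding the projected conditional $p_{(x_1 \searrow X_2)}(T)$. The directed projected information from $X_1$ into $X_2$ is then
$$I_T^{\pi}(X_1 \searrow X_2) = \sum_{x_1, t} p(x_1, t) \log \frac{p_{(x_1 \searrow X_2)}(t)}{p(t)}.$$
Redundant information is given by
$$I_{\mathrm{red}}(T : X_1; X_2) = \min\big\{ I_T^{\pi}(X_1 \searrow X_2),\; I_T^{\pi}(X_2 \searrow X_1) \big\}.$$
The atoms of the PID lattice are then given by Möbius inversion:
- Redundancy: $R = I_{\mathrm{red}}(T : X_1; X_2)$
- Unique information of $X_1$: $U_1 = I(T; X_1) - R$
- Unique information of $X_2$: $U_2 = I(T; X_2) - R$
- Synergy: $S = I(T; X_1, X_2) - I(T; X_1) - I(T; X_2) + R$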
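As a quick consistency check on the Möbius inversion, the four atoms recombine into the classical mutual information terms. The symbols $R$, $U_1$, $U_2$, $S$ are shorthand assumed here for redundancy, the two unique informations, and synergy, with $R = I_{\mathrm{red}}(T : X_1; X_2)$:

$$
\begin{aligned}
R + U_1 &= I(T; X_1),\\
R + U_2 &= I(T; X_2),\\
R + U_1 + U_2 + S &= I(T; X_1, X_2),\\
\text{hence}\quad S &= I(T; X_1, X_2) - I(T; X_1) - I(T; X_2) + R .
\end{aligned}
$$

Only $R$ depends on the geometric construction; the other three atoms follow from it and from standard mutual information.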
2. Geometric Interpretation and Information Projection
Each conditional $p(T \mid x_1)$ ($p(T \mid x_2)$) can be interpreted as a point on the $(|\mathcal{T}|-1)$-simplex, where $|\mathcal{T}|$ is the size of the target alphabet. The set of conditionals $\{p(T \mid x_2)\}$ spans a convex polytope $C(X_2)$. The projection $p_{(x_1 \searrow X_2)}$ finds the point in $C(X_2)$ that is closest (in the KL sense) to $p(T \mid x_1)$. Intuitively, this projects the information that $X_1$ has about $T$ onto the “statistical structure” available from $X_2$. This geometry underpins the “shared” content: only information already expressible by $X_2$ conditionals is counted as redundant.
Symmetry is enforced by taking the minimum of the directed projected informations over both possible directions.
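For a binary target the geometry becomes one-dimensional, which makes the projection easy to see. The sketch below assumes the projection minimizes $D_{\mathrm{KL}}(p \,\|\, r)$ over $r$ in the hull: the simplex is a segment parametrized by $q = p(T{=}1)$, the hull of the other source's conditionals is an interval, and because the KL objective is convex in $q$ with its unconstrained minimum at $q = p(T{=}1)$, the projection reduces to clipping. The function names and the numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def kl_bits(p, r):
    """D_KL(p || r) in bits for two discrete distributions."""
    p, r = np.asarray(p, float), np.asarray(r, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / r[mask])))

def project_binary(p1, hull_points):
    """KL-project Bernoulli(p1) onto the hull of Bernoulli(q), q in hull_points.

    For a binary target the hull is the interval [min q, max q], and the
    minimizer of D_KL(p || r) over that interval is p1 clipped to it.
    """
    lo, hi = min(hull_points), max(hull_points)
    q = min(max(p1, lo), hi)
    return np.array([1.0 - q, q])

# Hypothetical numbers: p(T=1 | x1) = 0.9, while X2 offers p(T=1 | x2) in {0.3, 0.6}.
target = np.array([0.1, 0.9])
proj = project_binary(0.9, [0.3, 0.6])
print("projected conditional:", proj)
print("KL cost of projecting (bits):", kl_bits(target, proj))
```

Here the conditional of $X_1$ lies outside the interval reachable by $X_2$, so the projection lands on the nearest endpoint; a conditional already inside the interval would be left unchanged.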
3. Computational Workflow
Computation of the Geometric PID proceeds as follows (Liardi et al., 3 Mar 2026):
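- Marginal and Conditional Computation: Compute $p(x_1, t)$, $p(x_2, t)$, $p(t)$, then the conditionals $p(T \mid x_1)$ and $p(T \mid x_2)$.
- Convex Hull Construction: Form $C(X_2)$ as the convex hull of $\{p(T \mid x_2)\}$. For each $x_1$, solve the convex projection (e.g., using Blahut–Arimoto or gradient methods) to find $p_{(x_1 \searrow X_2)}$.
- Projected Information Calculation: Compute $I_T^{\pi}(X_1 \searrow X_2)$ using the projected conditionals.
- Symmetry Step: Repeat for $I_T^{\pi}(X_2 \searrow X_1)$.
- Redundancy and Atom Derivation: Assign $I_{\mathrm{red}} = \min\{I_T^{\pi}(X_1 \searrow X_2), I_T^{\pi}(X_2 \searrow X_1)\}$ and derive $U_1$, $U_2$, $S$ as above.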
The following table summarizes the definitions of the bivariate PID atoms:
| Atom | Formula | Description |
|---|---|---|
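| Redundancy | $R = I_{\mathrm{red}}(T : X_1; X_2)$ | Information shared by $X_1$, $X_2$ about $T$ |
| Unique $X_1$ | $U_1 = I(T; X_1) - R$ | Unique information of $X_1$ |
| Unique $X_2$ | $U_2 = I(T; X_2) - R$ | Unique information of $X_2$ |
| Synergy | $S = I(T; X_1, X_2) - I(T; X_1) - I(T; X_2) + R$ | Information only available jointly |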
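The workflow can be sketched end to end in NumPy. This is a minimal illustration, not the reference implementation: it assumes the joint distribution is given as an array `p[x1, x2, t]`, and it swaps in a simple EM-style fixed-point update on the mixture weights as the projection solver, in place of the Blahut–Arimoto or gradient methods mentioned above. The names `kl_project`, `mutual_information`, and `geometric_pid` are hypothetical.

```python
import numpy as np

def mutual_information(pab):
    """I(A;B) in bits for a 2-D joint probability table pab[a, b]."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))

def kl_project(p, Q, iters=500):
    """Approximate argmin_r D_KL(p || r) over r in the convex hull of Q's rows.

    Assumption: uses the EM update for mixture weights with fixed components,
    which never increases the KL objective.
    """
    w = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        r = w @ Q
        ratio = np.divide(p, r, out=np.zeros_like(p), where=r > 0)
        w = w * (Q @ ratio)
        w = w / w.sum()
    return w @ Q

def geometric_pid(p):
    """Bivariate PID atoms (in bits) from a joint table p[x1, x2, t]."""
    p = np.asarray(p, dtype=float)
    p1t, p2t = p.sum(axis=1), p.sum(axis=0)   # p(x1, t) and p(x2, t)
    pt = p.sum(axis=(0, 1))                   # p(t)

    def directed(pat, pbt):
        # Projected information from source A into source B.
        pat = pat[pat.sum(axis=1) > 0]
        pbt = pbt[pbt.sum(axis=1) > 0]
        cond_a = pat / pat.sum(axis=1, keepdims=True)
        cond_b = pbt / pbt.sum(axis=1, keepdims=True)
        total = 0.0
        for joint_row, cond_row in zip(pat, cond_a):
            proj = kl_project(cond_row, cond_b)
            mask = joint_row > 0
            total += np.sum(joint_row[mask] * np.log2(proj[mask] / pt[mask]))
        return float(total)

    red = min(directed(p1t, p2t), directed(p2t, p1t))
    i1, i2 = mutual_information(p1t), mutual_information(p2t)
    i12 = mutual_information(p.reshape(-1, p.shape[2]))
    return {"redundancy": red, "unique_1": i1 - red,
            "unique_2": i2 - red, "synergy": i12 - i1 - i2 + red}

# Usage: T is a copy of two perfectly correlated fair bits (fully redundant case).
copy = np.zeros((2, 2, 2))
copy[0, 0, 0] = copy[1, 1, 1] = 0.5
print(geometric_pid(copy))
```

For the fully redundant copy above, the sketch should report about 1 bit of redundancy and no unique or synergistic information. The fixed iteration count is a simplification; a production solver would use a convergence criterion instead.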
4. Axiomatic Properties and Limiting Results
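The Geometric redundancy $I_{\mathrm{red}}$ satisfies the following axioms and properties:
- Self-redundancy (SR): $I_{\mathrm{red}}(T : X_1) = I(T; X_1)$
- (Weak) Symmetry (S₀): Invariant under swapping $X_1 \leftrightarrow X_2$
- (Weak) Monotonicity (M₀): Redundancy does not increase when adding a source, $I_{\mathrm{red}}(T : X_1; X_2) \le I_{\mathrm{red}}(T : X_1)$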
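- Subset-Equality (SE): If $X_1 \subseteq X_2$ (the first source is contained in the second) then $I_{\mathrm{red}}(T : X_1; X_2) = I(T; X_1)$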
- Nonnegativity (GP): $I_{\mathrm{red}}(T : X_1; X_2) \ge 0$
- Local Positivity (LP): All PID atoms are nonnegative
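- Identity (ID): For $T = (X_1, X_2)$, $I_{\mathrm{red}}(T : X_1; X_2) = I(X_1; X_2)$
- Independent-Identity (IID): If $T = (X_1, X_2)$ with $I(X_1; X_2) = 0$, $I_{\mathrm{red}}(T : X_1; X_2) = 0$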
- Lower Bound (LB): Redundancy lower-bounded by less-informative surrogates
- Equivalence-Invariance (EI): Invariant to relabeling of variable values
Crucially, several no-go results establish that Geometric PID cannot be consistently extended to more than two sources while retaining all the aforementioned properties plus the target chain rule (TC) or target monotonicity (TM). Indeed, Geometric PID fails TM/TC: enlarging the target $T$ can decrease redundancy.
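The ID and IID properties can be checked by hand in the smallest case. The following worked example assumes two independent fair bits with $T = (X_1, X_2)$ and the projection $\arg\min_{r} D_{\mathrm{KL}}(p(T \mid x_1) \,\|\, r)$ over the hull of the $X_2$ conditionals; target values are ordered as $t \in \{00, 01, 10, 11\}$ and $w \in [0, 1]$ is the mixture weight parametrizing the hull:

$$
\begin{aligned}
p(T \mid x_1 = 0) &= \left(\tfrac12, \tfrac12, 0, 0\right),\\
r_w = w\, p(T \mid x_2 = 0) + (1 - w)\, p(T \mid x_2 = 1) &= \left(\tfrac{w}{2}, \tfrac{1-w}{2}, \tfrac{w}{2}, \tfrac{1-w}{2}\right),\\
D_{\mathrm{KL}}\big(p(T \mid x_1 = 0) \,\|\, r_w\big) &= -\tfrac12 \log_2 w - \tfrac12 \log_2 (1 - w).
\end{aligned}
$$

The divergence is minimized at $w = \tfrac12$, so the projection is $r_{1/2} = \left(\tfrac14, \tfrac14, \tfrac14, \tfrac14\right) = p(T)$. The same holds for $x_1 = 1$ and, by symmetry, for the reverse direction. Every log-ratio $\log \frac{r_{1/2}(t)}{p(t)}$ therefore vanishes, both directed projected informations are zero, and the redundancy is $0 = I(X_1; X_2)$, as ID and IID require.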
5. Illustrative Example: XOR Gate
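For $X_1$ and $X_2$ independent fair bits, $T = X_1 \oplus X_2$:
- $p(T \mid x_i) = \left(\tfrac12, \tfrac12\right)$ for $i = 1, 2$; thus $p(T \mid x_i)$ for all $x_i$ is the simplex center.
- The projections $p_{(x_1 \searrow X_2)} = p_{(x_2 \searrow X_1)} = p(T)$, so $I_{\mathrm{red}}(T : X_1; X_2) = 0$.
- $I(T; X_1) = 0$, $I(T; X_2) = 0$, $I(T; X_1, X_2) = 1$ bit, so $S = 1$ bit: all information is synergistic, no redundancy. This aligns with the expected behavior for the XOR structure (Liardi et al., 3 Mar 2026).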
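The numbers in this example can be reproduced in a few lines of NumPy. The sketch below builds the XOR joint table (array layout `p[x1, x2, t]` is an assumption), confirms that all conditionals coincide with $p(T)$ so that the hull collapses to a single point, and then evaluates the atoms. The helper `mi_bits` is a hypothetical name.

```python
import numpy as np

def mi_bits(pab):
    """I(A;B) in bits for a 2-D joint probability table pab[a, b]."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))

# Joint table p[x1, x2, t] for T = X1 XOR X2 with independent fair inputs.
p = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        p[x1, x2, x1 ^ x2] = 0.25

p1t, p2t = p.sum(axis=1), p.sum(axis=0)          # p(x1, t), p(x2, t)
pt = p.sum(axis=(0, 1))                          # p(t)
cond_x1 = p1t / p1t.sum(axis=1, keepdims=True)   # rows are p(T | x1)
cond_x2 = p2t / p2t.sum(axis=1, keepdims=True)   # rows are p(T | x2)

# Every conditional equals p(T), so each hull is the single point p(T)
# and the projection of any conditional is that point.
hull_is_point = np.allclose(cond_x1, pt) and np.allclose(cond_x2, pt)
projection = cond_x2[0]
redundancy = float(np.sum(p1t * np.log2(projection / pt)))

i1, i2 = mi_bits(p1t), mi_bits(p2t)
i12 = mi_bits(p.reshape(4, 2))
synergy = i12 - i1 - i2 + redundancy
print(hull_is_point, redundancy, i1 - redundancy, i2 - redundancy, synergy)
```

Because the hull is degenerate here, no optimization is needed; for general distributions the projection has to be solved numerically.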
6. Advantages, Limitations, and Applications
Advantages
- Identity Validity: Satisfies the ID axiom; independent copies do not yield spurious redundancy.
- Nonnegativity and Interpretability: All PID atoms are nonnegative and possess a geometric interpretation as KL projections.
- Label Invariance: Equivalence-invariant under invertible relabeling of variable values.
Limitations
- Bivariate Only: Formalism is restricted to two sources; no generalization exists to higher dimensions that preserves all core properties and ID.
- Violation of Target-Monotonicity: Enlarging the target $T$ can decrease redundancy (TM fails).
- Computational Overhead: For large support on $T$, repeated convex optimizations may become computationally expensive.
Use Cases
- Bivariate PID: Settings where two source variables are analyzed for contributions to a target.
- Contexts Demanding Identity and Nonnegativity: Experimental systems needing strict adherence to these axioms.
- Low-dimensional Targets: These facilitate practical geometric projection computation.
7. Relation to Alternative Geometric PID Approaches
A related but distinct geometric PID approach leverages information geometry over partially ordered sets (posets) of variable subsets (Sugiyama et al., 2016). This framework generalizes Amari's hierarchy to enable decomposition on structured spaces, constructing a dually-flat manifold (with 2- and 3-coordinates) for arbitrary posets and deriving PID atoms through Möbius inversion on KL divergence projections. While more general and multivariate, the practical and conceptual constraints differ from the bivariate-focused Geometric PID defined by Harder et al. Thus, users should be cautious to distinguish these two flavors of "geometric" PID, as only the latter corresponds precisely to the KL-projection and simplex geometry described in (Liardi et al., 3 Mar 2026).