
Partial Info Decomposition of TDMI

Updated 10 October 2025
  • The paper introduces a framework that decomposes TDMI into unique, redundant, and synergistic components to clarify temporal dependencies.
  • It implements spectral and rate-based methods to capture oscillatory and scale-dependent interactions in high-dimensional, time-correlated data.
  • The work addresses challenges such as lattice inconsistencies and estimator design, offering practical insights for analyzing dynamic systems.

Partial Information Decomposition (PID) of time-delayed mutual information (TDMI) extends classical information-theoretic analysis to quantify how temporally separated sources contribute unique, redundant, and synergistic information about a future state in multivariate dynamical systems. Recent research has clarified foundational aspects, introduced a variety of rigorous decomposition schemes, and established spectral and rate-based frameworks suitable for high-dimensional, temporally correlated processes.

1. Conceptual Foundations and Classical PID Framework

The PID formalism, initiated by Williams and Beer, decomposes the mutual information that a set of sources provides about a target into non-overlapping informational “atoms” representing redundancy (shared content), uniqueness (exclusive content), and synergy (information revealed only jointly) (Ince, 2017). In the two-source case, the decomposition for a target $S$ and sources $X_1, X_2$ is:

$$I(S; X_1, X_2) = \underbrace{I_\partial(S; \{1\}\{2\})}_{\text{redundancy}} + \underbrace{I_\partial(S; \{1\})}_{\text{unique to } X_1} + \underbrace{I_\partial(S; \{2\})}_{\text{unique to } X_2} + \underbrace{I_\partial(S; \{12\})}_{\text{synergy}}$$

This architecture relies on a redundancy lattice (antichains of subsets) and Möbius inversion, requiring as input a redundancy measure that determines the “overlap” among predictors.
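
As a concrete illustration, the sketch below computes the Williams–Beer redundancy $I_{\min}$ for a discrete two-source system and recovers the remaining atoms by Möbius inversion. The function names and the XOR test distribution are illustrative, not taken from the cited papers; it is a minimal sketch assuming the joint distribution is given as a NumPy table $p(x_1, x_2, s)$.

```python
import numpy as np

def mi(pxy):
    """Mutual information (bits) from a joint probability table p(x, y)."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

def i_min(p):
    """Williams-Beer redundancy I_min(S; {1}{2}) for a table p(x1, x2, s)."""
    ps = p.sum(axis=(0, 1))
    red = 0.0
    for s, p_s in enumerate(ps):
        if p_s == 0:
            continue
        spec = []
        for ax in (1, 0):                    # marginalize out the other source
            pxs = p.sum(axis=ax)             # p(x_i, s)
            px = pxs.sum(axis=1)             # p(x_i)
            p_x_given_s = pxs[:, s] / p_s
            nz = p_x_given_s > 0
            # specific information I(S = s; X_i)
            spec.append(np.sum(p_x_given_s[nz] *
                               np.log2(pxs[nz, s] / (px[nz] * p_s))))
        red += p_s * min(spec)               # minimum over sources, per outcome
    return red

def pid_two_sources(p):
    """Two-source PID atoms (bits) via I_min and Moebius inversion."""
    p1s, p2s = p.sum(axis=1), p.sum(axis=0)  # p(x1, s), p(x2, s)
    p12s = p.reshape(-1, p.shape[2])         # p((x1, x2), s)
    red = i_min(p)
    return {"redundancy": red,
            "unique_1": mi(p1s) - red,
            "unique_2": mi(p2s) - red,
            "synergy": mi(p12s) - mi(p1s) - mi(p2s) + red}

# XOR target: all information is synergistic (1 bit), redundancy is zero.
p = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        p[x1, x2, x1 ^ x2] = 0.25
print(pid_two_sources(p))
```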

For time series, sources correspond to time-lagged variables, and the target is a future system state. The total TDMI, $I(Y_{t+\tau}; X_{t-\tau_1}, X_{t-\tau_2}, \ldots)$, encodes the information content in the joint history about a future outcome, which PID aims to partition by temporal contribution type.
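
For intuition, TDMI itself is just an ordinary mutual information between a lagged copy of the source and the target, so for discrete data a simple plug-in (histogram) estimator suffices. The following is a generic sketch, not a method from the cited papers; the Markov-chain example is hypothetical.

```python
import numpy as np

def plugin_tdmi(x, y, lag):
    """Plug-in estimate (bits) of the TDMI I(X_{t-lag}; Y_t)
    for integer-valued series x, y."""
    xs, ys = x[:-lag], y[lag:]
    joint = np.zeros((xs.max() + 1, ys.max() + 1))
    np.add.at(joint, (xs, ys), 1.0)          # empirical joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))

# Binary Markov chain that keeps its state with probability 0.9:
# TDMI decays with lag as the "copy" channel is iterated.
rng = np.random.default_rng(0)
x = np.zeros(100_000, dtype=int)
for t in range(1, len(x)):
    x[t] = x[t - 1] if rng.random() < 0.9 else 1 - x[t - 1]
for lag in (1, 2, 5, 10):
    print(lag, round(plugin_tdmi(x, x, lag), 4))
```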

2. Extension to Time Series: Rates and Spectral Decomposition

Direct application of PID to time series risks neglecting temporal dependencies and dynamic effects if only zero-lag mutual information is considered. Partial Information Rate Decomposition (PIRD) (Faes et al., 6 Feb 2025, Sparacino et al., 6 Feb 2025) generalizes PID by replacing mutual information (MI) with the mutual information rate (MIR), which accounts for the temporal statistical structure:

$$I_{X;Y} = \lim_{n \to \infty} \frac{1}{n} I(X_1, \ldots, X_n; Y_1, \ldots, Y_n)$$

For a target process $Y$ and source processes $\{X_i\}$, the MIR is decomposed via a lattice into rate atoms:

$$I_{X;Y} = \sum_{\alpha \in \mathcal{A}} I^{\delta}_{X_{(\alpha)};Y}$$

These “information rate atoms” represent unique, redundant, and synergistic contributions per unit time. For Gaussian processes, this decomposition admits a spectral representation:

$$I^{\cap}_{X_{(\alpha)};Y} = \frac{1}{2\pi} \int_{-\pi}^{\pi} i^{\cap}_{X_{(\alpha)};Y}(\omega)\, d\omega$$

with redundancy defined pointwise at each frequency, commonly using a minimum information principle.
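
For jointly Gaussian stationary processes, the MIR has a classical spectral form (the Gelfand–Yaglom formula), $I_{X;Y} = -\frac{1}{4\pi}\int_{-\pi}^{\pi} \ln\!\left(1 - C_{XY}(\omega)\right) d\omega$, where $C_{XY}$ is the magnitude-squared coherence. The sketch below estimates this spectral building block with Welch coherence; it is not the full PIRD lattice computation of Faes et al., and the example signals are hypothetical.

```python
import numpy as np
from scipy.signal import coherence
from scipy.integrate import trapezoid

def gaussian_mir(x, y, nperseg=1024):
    """Mutual information rate (nats/sample) of jointly Gaussian series,
    via the Gelfand-Yaglom formula estimated with Welch coherence."""
    f, cxy = coherence(x, y, fs=1.0, nperseg=nperseg)
    cxy = np.clip(cxy, 0.0, 1.0 - 1e-12)   # guard the log at perfect coherence
    # one-sided integral over f in [0, 1/2] equals the two-sided -1/(4*pi) form
    return -trapezoid(np.log(1.0 - cxy), f)

# y is a noisy, delayed copy of x: flat coherence of 0.5, so the MIR
# should approach 0.5 * ln(2) ~ 0.347 nats/sample.
rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)
y = np.roll(x, 3) + rng.standard_normal(200_000)
print(gaussian_mir(x, y))
```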

This spectral formulation is essential in TDMI analysis, where oscillatory and scale-dependent interactions are present, such as in neural, physiological, or climate data (Faes et al., 6 Feb 2025, Sparacino et al., 6 Feb 2025).

3. Entropy-Based and Pointwise Decompositions

The Partial Entropy Decomposition (PED) (Ince, 2017) extends PID from the mutual information perspective to a direct decomposition of multivariate entropy, using a measure of redundancy based on pointwise common surprisal. For variables $X_1, X_2$:

$$H(X_1, X_2) = H_\partial(\{1\}\{2\}) + H_\partial(\{1\}) + H_\partial(\{2\}) + H_\partial(\{12\})$$

and the relationship to MI is explicitly given by:

$$I(X_1; X_2) = H_\partial(\{1\}\{2\}) - H_\partial(\{12\})$$

PED clarifies that what is classically viewed as “shared information” (mutual information) conflates genuinely redundant content with negative synergistic entropy (misinformation). The pointwise decomposition is particularly relevant for time-delayed systems: it distinguishes mechanistic from source redundancy (e.g., AND vs. XOR logic gates) and can reveal differences between systems that are indistinguishable under the standard Shannon approach.
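
To make the pointwise perspective concrete, the sketch below splits $I(X_1;X_2)$ into its positive (informative) and negative (misinformative) local contributions. This is only an illustration of how misinformative events arise at the pointwise level; it does not reproduce Ince's exact $H_\partial$ atoms.

```python
import numpy as np

def local_mi_split(pxy):
    """Split I(X1;X2) into positive and negative pointwise contributions.
    Illustrates the pointwise view underlying PED, not the full PED atoms."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    local = np.log2(pxy[nz] / (px @ py)[nz])   # pointwise MI i(x1; x2)
    w = pxy[nz]
    pos = np.sum(w[local > 0] * local[local > 0])
    neg = np.sum(w[local < 0] * local[local < 0])
    return pos, neg, pos + neg                 # the sum recovers I(X1; X2)

# Noisy-copy example: matching outcomes carry positive local MI (~0.76 bits),
# mismatches carry negative local MI (~-0.23 bits); the net is ~0.53 bits.
p = np.array([[0.45, 0.05],
              [0.05, 0.45]])
print(local_mi_split(p))
```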

Negative partial entropy contributions, which are possible within PED, signal mechanistic redundancy or unique misinformation and offer diagnostic value regarding system structure.

4. Channel-Based, Logic, and Constraint Methods

Several recent advances reinterpret redundancy and synergy not only in terms of average information but also through properties of statistical dependence and channel ordering.

  • Dependency constraint methods (James et al., 2017, Kay et al., 2018) define unique information as the minimum increase in joint mutual information upon enforcing specific source-target dependencies in a constrained maximum entropy model lattice. This approach:
    • Provides operational definitions of unique contribution in dynamic or delayed systems,
    • Satisfies core PID axioms (symmetry, self-redundancy, monotonicity, and identity) for the bivariate case,
    • Admits closed-form implementation for (multivariate) Gaussian processes, making it scalable to high-dimensional time series.
  • Partial information decompositions based on channel preorders (Gomes et al., 2023) exploit orders such as degradation or Blackwell’s order, defining redundancy by optimizing over auxiliary variables that are “less informative” (or more degraded) than each input channel. In TDMI, these methods naturally compare past states as different “channels” to the future, providing nuanced redundancy estimates that reflect both structural and informational “quality” (a feasibility test for the degradation order is sketched after this list).
  • Logic and part-whole relationships (Gutknecht et al., 2020) organize the PID atoms using parthood functions or logical implication lattices. This algebraic framework formalizes when a piece of information is “contained” in a combination of sources and supports unique determination of PID given appropriate redundancies, generalizing to time-ordered or causally structured systems.
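
As referenced above, whether one channel is a degraded (garbled) version of another is a linear feasibility question: $K_1$ is degraded with respect to $K_2$ iff a row-stochastic $M$ exists with $K_2 M = K_1$. The sketch below checks this with a linear program; the function name and binary-symmetric-channel example are illustrative assumptions, not code from Gomes et al. (2023).

```python
import numpy as np
from scipy.optimize import linprog

def is_degraded(k1, k2):
    """Test whether channel k1 (rows = input distributions over outputs)
    is a degraded version of k2, i.e. k2 @ M == k1 for row-stochastic M."""
    m2, m1 = k2.shape[1], k1.shape[1]        # output alphabet sizes
    n_vars = m2 * m1                         # entries of M, row-major
    a_eq, b_eq = [], []
    # equality constraints: k2 @ M = k1, one per (input, k1-output) pair
    for i in range(k1.shape[0]):
        for j in range(m1):
            row = np.zeros(n_vars)
            for a in range(m2):
                row[a * m1 + j] = k2[i, a]
            a_eq.append(row); b_eq.append(k1[i, j])
    # each row of M sums to 1
    for a in range(m2):
        row = np.zeros(n_vars)
        row[a * m1:(a + 1) * m1] = 1.0
        a_eq.append(row); b_eq.append(1.0)
    res = linprog(np.zeros(n_vars), A_eq=np.array(a_eq), b_eq=np.array(b_eq),
                  bounds=[(0, 1)] * n_vars, method="highs")
    return res.success

# A noisier binary symmetric channel is degraded w.r.t. a cleaner one.
bsc = lambda e: np.array([[1 - e, e], [e, 1 - e]])
print(is_degraded(bsc(0.2), bsc(0.1)))   # True
print(is_degraded(bsc(0.1), bsc(0.2)))   # False
```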

5. Practical Algorithms and Mixed-Type Data

New computationally efficient estimators for PID have been developed for both continuous and mixed discrete-continuous variables encountered in empirical time series (Schick-Poland et al., 2021, Barà et al., 20 Sep 2024). Central elements include:

  • Representing specific mutual information for a target outcome as a Kullback-Leibler divergence between conditional and marginal (or pooled) densities.
  • Implementing nonparametric estimators, principally based on nearest-neighbor statistics (generalizing the Kraskov–Stögbauer–Grassberger framework), capable of robustly decomposing MI in high-dimensional, sparsely sampled settings.
  • Extensions to TDMI scenarios by augmenting source variables with time-lagged observations, enabling the decomposition of information transfer from multiple temporal dependencies (a minimal lagged nearest-neighbor estimator is sketched below).
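
The following is a minimal sketch of a KSG-style (Kraskov–Stögbauer–Grassberger, algorithm 1) TDMI estimator for continuous scalar series, applied to lagged pairs. It is a generic illustration under stated assumptions, not the estimators of Schick-Poland et al. or Barà et al.; the linear-Gaussian check at the end is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_tdmi(x, y, lag=1, k=4):
    """KSG (algorithm 1) estimate, in nats, of the TDMI I(X_{t-lag}; Y_t)
    for continuous scalar series."""
    xs = x[:-lag].reshape(-1, 1)
    ys = y[lag:].reshape(-1, 1)
    n = len(xs)
    joint = np.hstack([xs, ys])
    # distance to the k-th neighbour in the joint space (max-norm)
    eps = cKDTree(joint).query(joint, k=k + 1, p=np.inf)[0][:, -1]
    tx, ty = cKDTree(xs), cKDTree(ys)
    # count strictly-closer neighbours in each marginal space
    nx = np.array([len(tx.query_ball_point(p, e - 1e-12)) - 1
                   for p, e in zip(xs, eps)])
    ny = np.array([len(ty.query_ball_point(p, e - 1e-12)) - 1
                   for p, e in zip(ys, eps)])
    return (digamma(k) + digamma(n)
            - np.mean(digamma(nx + 1) + digamma(ny + 1)))

# Linear-Gaussian check: I = -0.5 * ln(1 - rho^2) for the lagged correlation.
rng = np.random.default_rng(2)
x = rng.standard_normal(5_000)
y = np.roll(x, 2) + rng.standard_normal(5_000)   # rho^2 = 0.5 at lag 2
print(ksg_tdmi(x, y, lag=2))                     # ~ 0.35 nats
```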

This flexible methodology supports applications in neuroscience (e.g., sensory coding with neural and external delays), physiology (e.g., cardiovascular-respiratory coupling), and forecasting (feature selection among temporally structured predictors).

6. Challenges, Limitations, and Theoretical Controversies

Several theoretical challenges are associated with PID and its application to TDMI:

  • Lattice-based inconsistency: For three or more sources, PID may be inconsistent (sum of atoms can exceed total information) (Lyu et al., 7 Aug 2025). Impossibility results show that no single lattice-based decomposition can satisfy all desirable properties (additivity, non-negativity, subset consistency) in general higher-order scenarios.
  • Negative partial values: Entropy- and path-based approaches may yield negative unique or synergistic terms when the underlying assumptions (data completeness, appropriate source/target definitions) are violated (Ince, 2017, Sigtermans, 2020).
  • Ambiguity in defining optimal pooling or ordering: Approaches that aggregate (pool) marginal distributions to estimate redundancy or synergy introduce necessary but subjective choices (Enk, 2023), further highlighting the non-uniqueness of multivariate information decompositions in practice.

Research continues on operationally meaningful, computationally tractable, and axiomatically robust generalizations beyond the two-source case and on improving estimators for complex, high-dimensional systems.

7. Empirical and Scientific Applications

Application of PID/TDMI frameworks has yielded novel insights across a range of scientific fields:

  • Climate networks: PIRD (via VAR modeling and spectral decomposition) reveals high-order couplings and redundancy/synergy patterns among oscillatory indices of ENSO, capturing delayed influence structures missed by static PID (Faes et al., 6 Feb 2025).
  • Physiological systems: PIRD uncovers frequency-dependent redundant and synergistic interactions in cerebrovascular and cardiovascular regulation, emphasizing the need for time- and frequency-aware decomposition strategies (Sparacino et al., 6 Feb 2025).
  • Neuroscience and Machine Learning: PID-based analysis isolates unique, redundant, and synergistic contributions of temporally or spatially distributed features to decoding or prediction, offering improved interpretability for black-box models and complex network analyses (Schick-Poland et al., 2021, Barà et al., 20 Sep 2024).

These empirical applications highlight the importance of time-resolved decompositions in uncovering functional connectivity, causal structure, and emergent behavior in high-dimensional dynamic systems.


Summary Table: Core PID and TDMI Decomposition Notions

| Framework/Method | Redundancy Definition | Applicability to TDMI |
|---|---|---|
| Classic PID | Redundancy lattice (I_min, etc.) | Limited/static, no rate |
| PIRD | Minimum spectral MIR, pointwise in frequency | Dynamic/spectral, ≥ 2 sources |
| PED | Pointwise common surprisal | Dyadic/triadic, static & dynamic |
| Dependency constraint | MaxEnt models in lattice | Multivariate, time-lagged, Gaussian |
| Causal tensor | Path-minimum via chains | MI, transfer entropy |

The PID of Time-Delayed Mutual Information is now supported by theory and computational methods that permit principled decomposition of temporally structured interactions in complex networks, particularly when implemented within the Partial Information Rate Decomposition (PIRD) and related generalizations. These frameworks resolve many deficiencies of static PID in dynamic contexts, providing interpretations and practical analysis tools for emergent, delayed, and scale-dependent dependencies in scientific data.
