Online Generalised Predictive Coding

Published 4 May 2026 in stat.ML, cs.LG, and q-bio.NC | (2605.02675v1)

Abstract: This paper introduces an extension of generalised filtering for online applications. Generalised filtering refers to data assimilation schemes that jointly infer latent states, learn unknown model parameters, and estimate uncertainty (e.g., estimate state and observation noise) in an integrated framework, all at the same time (i.e., triple estimation). This framework appears across disciplines under different names, including variational Kalman-Bucy filtering in engineering, generalised predictive coding in neuroscience, and Dynamic Expectation Maximisation (DEM) in time-series analysis. Here, we specialise DEM for "online" data assimilation, through a separation of temporal scales. We describe the variational principles and procedures that allow one to assimilate data in a way that allows for a slow updating of parameters and precisions, which contextualise fast Bayesian belief updating about the dynamic hidden states. Using numerical studies, we demonstrate the validity of online DEM (ODEM) using a non-linear (and potentially chaotic) generative model, to show that the ODEM scheme can track the latent states of the generative process, even when its functional form differs fundamentally from the dynamics of the generative model. Framed from a neuro-mimetic predictive coding perspective, ODEM offers a biologically inspired solution to online inference, learning, and uncertainty estimation in dynamic environments.


Authors (5)

Summary

The paper introduces a novel triple estimation method that infers hidden states, parameters, and uncertainty concurrently via variational free energy minimization.
It extends predictive coding to generalised coordinates, encoding higher-order motion derivatives for robust state tracking in non-linear environments.
The ODEM algorithm features constant computational cost per timestep and adapts to non-stationary dynamics, supporting scalability and real-time inference.

Online Generalised Predictive Coding: An Expert Essay

Variational Principles for Online Triple Estimation

The paper "Online Generalised Predictive Coding" (2605.02675) presents a variational framework for online triple estimation, that is, the simultaneous inference of hidden states, model parameters, and uncertainty, in complex, high-dimensional, non-linear dynamical systems. Rooted in the Free Energy Principle (FEP), the approach treats minimisation of Variational Free Energy (VFE) as a tractable objective that balances accuracy against complexity in Bayesian inference. Under this generative-modelling view, online variational inference updates posterior beliefs about hidden state trajectories, parameters, and precisions as data arrive.

Inference is separated into fast state estimation (D-step), slower parameter adaptation (E-step), and learning of precisions, i.e. hyperparameters (M-step). This separation is consistent with neurobiological accounts of hierarchical message passing and with the slaving principle of synergetics, in which slowly changing variables constrain fast ones. The temporal scaling lets the scheme adapt to novel or non-stationary environmental dynamics without the computational burden of offline batch optimisation.
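The fast/slow split can be illustrated with a toy problem. The sketch below is not the paper's algorithm: it assumes a scalar hidden state `x` with a Gaussian prior centred on a parameter `theta`, fixed precisions `pi_y` and `pi_x`, and a hypothetical function name. The state is relaxed by gradient descent after every observation, while the parameter moves only once per window, with a decaying step size. A precision update (M-step) would follow the same slow pattern and is omitted.

```python
import numpy as np

def two_timescale_demo(n_obs=500, window=10, pi_y=1.0, pi_x=1.0, seed=0):
    """Toy fast/slow scheme on F = 0.5*pi_y*(y-x)^2 + 0.5*pi_x*(x-theta)^2."""
    rng = np.random.default_rng(seed)
    true_mean = 2.0              # illustrative data-generating value
    theta, x = 0.0, 0.0          # slow parameter and fast state
    grad_sum, n_updates = 0.0, 0
    for t in range(1, n_obs + 1):
        y = true_mean + 0.5 * rng.standard_normal()
        # Fast step: relax the state on the free-energy gradient per observation.
        for _ in range(20):
            dF_dx = -pi_y * (y - x) + pi_x * (x - theta)
            x -= 0.2 * dF_dx
        # Accumulate the parameter gradient dF/dtheta without acting on it yet.
        grad_sum += -pi_x * (x - theta)
        # Slow step: update once per window with a decaying step size.
        if t % window == 0:
            n_updates += 1
            step = n_updates ** -0.7
            theta -= step * grad_sum / window
            grad_sum = 0.0
    return theta

if __name__ == "__main__":
    print(two_timescale_demo())
```

The step size `n_updates ** -0.7` is one choice whose sum diverges while the sum of its squares converges; other exponents in (0.5, 1] would behave similarly.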

Generalised Predictive Coding and High-Order Coordinates

The central innovation lies in the extension of predictive coding to generalised coordinates of motion (GCM). By augmenting latent state representations with successively higher-order temporal derivatives (velocity, acceleration, jerk, etc.), the formulation avoids the need for closed-form transition densities (as in traditional Kalman-Bucy filtering) or stochastic simulation (as in particle filters).

This allows the inference algorithm to encode rich temporal structure and smoothness assumptions directly into the generative model, which is particularly critical when the underlying state-space noise process is non-Wiener or possesses correlated increments.
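As a concrete picture of generalised coordinates, the sketch below stacks a state with its temporal derivatives and builds the block shift operator that maps each order of motion to the next. The function names and the sine-wave example are illustrative assumptions, not taken from the paper. The last function shows why the representation carries temporal structure: a truncated Taylor series over the stacked derivatives predicts the near future of the trajectory.

```python
from math import factorial

import numpy as np

def shift_operator(order: int, n_states: int) -> np.ndarray:
    """Block operator D mapping (x, x', x'', ...) to (x', x'', ..., 0)."""
    return np.kron(np.eye(order, k=1), np.eye(n_states))

def generalised_sine(t: float, order: int) -> np.ndarray:
    """Generalised coordinates of x(t) = sin(t), up to the given order."""
    # The k-th derivative of sin(t) is sin(t + k*pi/2).
    return np.array([np.sin(t + k * np.pi / 2) for k in range(order)])

def taylor_predict(x_tilde: np.ndarray, dt: float) -> float:
    """Predict x(t + dt) from the generalised state via a truncated Taylor series."""
    return float(sum(x_tilde[k] * dt**k / factorial(k) for k in range(len(x_tilde))))

if __name__ == "__main__":
    x_tilde = generalised_sine(0.3, order=4)
    print(shift_operator(4, 1) @ x_tilde)          # derivatives shifted up one order
    print(taylor_predict(x_tilde, 0.1), np.sin(0.4))
```

Adding orders of motion tightens the local prediction, at the price of a larger state vector.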

Figure 1: A realisation of a GLV-GP, that is, the true states with smooth state noise, $x$.

GCM empowers the model to track complex, non-linear hidden state trajectories, even in scenarios where the functional form of the generative model (GM) diverges fundamentally from the actual generative process (GP).

Online Dynamic Expectation Maximisation (ODEM)

The ODEM algorithm implements online generalised predictive coding within a variational expectation-maximisation framework. The D-step performs gradient-based minimisation of VFE with respect to the state estimates after every observation, using the Ozaki update, which regularises the gradient step with curvature information. The E-step and M-step, which control parameter and precision learning, accumulate gradients over an observation window and apply Robbins-Monro step-size schedules for stable adaptation, which helps avoid catastrophic forgetting.
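The Ozaki (local linearisation) update mentioned above is usually written as $\Delta x = (\exp(J \Delta t) - I) J^{-1} f(x)$, where $f$ is the flow and $J$ its Jacobian. The sketch below is a minimal version under a simplifying assumption that is not from the paper: $J$ is symmetric (for example, minus the Hessian of a free-energy-like objective), so the matrix exponential can be taken through an eigendecomposition. The function name and the quadratic test objective are hypothetical.

```python
import numpy as np

def ozaki_step(flow, jacobian, x, dt):
    """One local-linearisation step: x + (exp(J dt) - I) J^{-1} f(x).

    Assumes the Jacobian is symmetric so that np.linalg.eigh applies.
    """
    f = flow(x)
    lam, V = np.linalg.eigh(jacobian(x))
    # Per-eigenvalue gain (exp(lam*dt) - 1) / lam, with limit dt as lam -> 0.
    nonzero = np.abs(lam) > 1e-12
    safe = np.where(nonzero, lam, 1.0)
    gain = np.where(nonzero, np.expm1(lam * dt) / safe, dt)
    return x + V @ (gain * (V.T @ f))

if __name__ == "__main__":
    H = np.array([[4.0, 1.0], [1.0, 3.0]])   # curvature of a quadratic objective
    m = np.array([1.0, -2.0])                # its minimum
    flow = lambda x: -H @ (x - m)            # gradient descent flow
    jac = lambda x: -H
    print(ozaki_step(flow, jac, np.zeros(2), dt=50.0))   # close to m
    print(ozaki_step(flow, jac, np.zeros(2), dt=1e-4))   # close to a small Euler step
```

For small `dt` the step reduces to plain gradient descent, and for large `dt` it approaches a Newton step, which is the sense in which curvature regularises the update.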

ODEM achieves constant per-timestep computational cost, which supports real-time inference in streaming settings and scaling to high-dimensional, temporally extended domains. Under the generalised Laplace approximation, the objective decomposes into precision-weighted prediction error and complexity terms, so that minimising it enforces parsimony in the sense of a Bayesian Occam’s razor.
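For orientation, the decomposition referred to above can be written in its generic form. This is the standard variational identity together with a Gaussian likelihood, not the paper's exact generalised-coordinate expression. The symbols are assumptions introduced here: $\vartheta$ collects all unknowns, $q$ is the approximate posterior, $g$ is the observation mapping, $\Pi$ is a precision matrix, and $n$ is the dimension of $y$.

```latex
F \;=\; \underbrace{D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\|\,p(\vartheta)\,\right]}_{\text{complexity}}
  \;-\; \underbrace{\mathbb{E}_{q}\!\left[\ln p(y \mid \vartheta)\right]}_{\text{accuracy}},
\qquad
-\ln p(y \mid \vartheta) \;=\; \tfrac{1}{2}\,\varepsilon^{\top}\Pi\,\varepsilon
  \;-\; \tfrac{1}{2}\ln\lvert\Pi\rvert \;+\; \tfrac{n}{2}\ln 2\pi,
\qquad
\varepsilon \;=\; y - g(\vartheta).
```

The quadratic term is the precision-weighted prediction error, and the log-determinant term penalises overconfident precisions.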

Empirical Evaluation: Robustness and Adaptation

The evaluation framework employs GLV and Lorenz dynamical systems, allowing systematic exploration of inference both when GM and GP coincide (scenario-same) and when they differ (scenario-different). The use of smooth (coloured) noise in the state and observation processes tests the algorithm’s ability to estimate variances and handle non-Markovian structure.
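Smooth noise of the kind described here can be produced in several ways. The sketch below uses one common construction, convolving white noise with a Gaussian kernel, and is an assumption rather than the paper's stated procedure. The function names, the kernel width `sigma`, and the unit-variance rescaling are illustrative. The lag-one autocorrelation gives a quick check that successive samples are correlated, unlike white noise.

```python
import numpy as np

def smooth_noise(n, sigma, dt=1.0, seed=0):
    """Coloured noise: white noise convolved with a Gaussian kernel of width sigma."""
    rng = np.random.default_rng(seed)
    half = int(np.ceil(4 * sigma / dt))            # truncate kernel at 4 sigma
    t = np.arange(-half, half + 1) * dt
    kernel = np.exp(-t**2 / (2 * sigma**2))
    white = rng.standard_normal(n + 2 * half)      # pad so the output has n samples
    w = np.convolve(white, kernel, mode="valid")
    return w / w.std()                             # rescale to unit sample variance

def lag1_autocorr(w):
    """Sample autocorrelation at lag one."""
    w = w - w.mean()
    return float(w[:-1] @ w[1:] / (w @ w))

if __name__ == "__main__":
    coloured = smooth_noise(2000, sigma=4.0)
    white = np.random.default_rng(1).standard_normal(2000)
    print(lag1_autocorr(coloured), lag1_autocorr(white))
```

Because neighbouring samples share most of their kernel support, the process has differentiable sample paths in the continuous limit, which is what makes higher orders of motion meaningful.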

Notably, ODEM tracks latent states well, even with a structural mismatch between model and data generator, provided the GCM is augmented to a sufficient order. Increasing $k_x$ from 2 to 3 results in sharply reduced posterior uncertainty, faster convergence (see the corresponding figure in the paper), and improved state reconstruction even for chaotic dynamics.

Figure 2: The evolution of the posterior expectation over $\rho$ in scenario-different, with $k_x = 3$ orders of motion across seven precision prior ratios. The solid lines represent the posterior means at a given time, and the shaded bands represent credible regions within two standard deviations of the mean.

The choice of the prior precision ratio $C = \frac{E_{\Pi_y}}{E_{\Pi_x}}$ substantially influences inference behaviour, modulating the trade-off between adherence to observed sensory signals and internal dynamical consistency (analogous to the Kalman gain mechanism).
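The Kalman-gain analogy can be made concrete in the scalar Gaussian case. The sketch below treats the two precisions as fixed numbers, which simplifies the paper's setting where $C$ is a ratio of prior expectations over precisions. The function names are hypothetical. With a Gaussian prediction and a Gaussian observation, the posterior mean moves from the prediction towards the observation by a fraction $C / (1 + C)$.

```python
def precision_gain(pi_y: float, pi_x: float) -> float:
    """Weight on the observation, pi_y / (pi_x + pi_y), written via C = pi_y / pi_x."""
    c = pi_y / pi_x
    return c / (1.0 + c)

def fuse(prediction: float, observation: float, pi_y: float, pi_x: float) -> float:
    """Posterior mean for a scalar Gaussian prediction and observation."""
    k = precision_gain(pi_y, pi_x)
    return prediction + k * (observation - prediction)

if __name__ == "__main__":
    for c in (0.1, 1.0, 10.0):
        # A larger C pulls the estimate towards the observation.
        print(c, fuse(prediction=0.0, observation=10.0, pi_y=c, pi_x=1.0))
```

A small ratio makes the estimate follow the model's dynamics, and a large ratio makes it follow the data.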

Model Selection, Complexity, and Accuracy

ODEM uses Free Action (FA), the path integral of VFE, as a model selection criterion, evaluating candidate GMs across a grid of parameters, hyperparameters, and GCM orders. Only GMs that share the same GCM order can be compared via FA, because augmenting the coordinates expands the data that each model has to explain.
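A minimal sketch of this selection rule follows, assuming each candidate comes with a recorded VFE trace sampled at a fixed interval `dt`. The function names, the tuple layout, and the numbers in the usage example are made up for illustration. Free Action is approximated by a Riemann sum, lower values are preferred because VFE is the quantity being minimised, and candidates are only ranked against others of the same GCM order.

```python
import numpy as np

def free_action(vfe_trace, dt):
    """Approximate the path integral of VFE with a Riemann sum."""
    return float(np.sum(np.asarray(vfe_trace, dtype=float)) * dt)

def best_per_order(candidates, dt):
    """Return the lowest-FA candidate (name, FA) within each GCM order.

    candidates: iterable of (name, gcm_order, vfe_trace) tuples.
    """
    best = {}
    for name, order, trace in candidates:
        fa = free_action(trace, dt)
        if order not in best or fa < best[order][1]:
            best[order] = (name, fa)
    return best

if __name__ == "__main__":
    # Illustrative traces only; not results from the paper.
    candidates = [
        ("gm_a", 2, [3.0, 2.0, 1.0]),
        ("gm_b", 2, [2.0, 2.0, 1.0]),
        ("gm_c", 3, [5.0, 5.0, 5.0]),
    ]
    print(best_per_order(candidates, dt=0.5))
```

Grouping by order in the return value keeps incommensurable scores from being compared by accident.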

Higher-order GCM increases accuracy for mismatched dynamics at the cost of increased model complexity, with overall FA remaining stable (see the free-action figure in the paper). This supports the claim that the balance between accuracy and complexity is crucial for generalisation and for avoiding overfitting, particularly in dynamic adaptive settings.

Practical and Theoretical Implications

ODEM establishes a scalable, biologically plausible online inference and learning paradigm for continuous-time dynamical systems. The explicit temporal scaling mechanism enables modular, hierarchical inference architectures, with fast state tracking and slower adaptation of parameters and uncertainty. This provides a foundation for robust perception and adaptive behaviour in agents operating under persistent non-equilibrium and information constraints.

The GCM formalism broadens the scope of inference beyond traditional filtering, allowing principled handling of coloured noise, non-stationary environmental statistics, and potential adaptation to arbitrary kernelised covariances (e.g., Matérn). The separation of scales is anticipated to support deep hierarchical predictive coding schemes, offering a mechanism for organising latent dynamics across behavioural and cognitive hierarchies.

Directions for Future Research

Future work will focus on generalising ODEM to hierarchical deep architectures, wherein each level encodes dynamics at a distinct temporal scale and latent state dimensionality. Parameterisation and online inference of the smoothness matrix $S_k(\sigma^2)$ are expected to enhance robustness in the presence of non-stationary or heterogeneous smoothness. Dimensionality reduction (via RGMs or renormalisation group theory) and non-stationary kernel adaptation will further extend the applicability of ODEM to high-dimensional, real-world tasks.

Rigorous exploration of ODEM in the context of biological plausibility, mechanistic models of learning, attention, and adaptive behaviour, as well as real-time perception in robotics and sensor fusion, remains a promising avenue for both theoretical and practical advances in predictive coding and variational inference.

Conclusion

The theoretical and empirical contributions of "Online Generalised Predictive Coding" provide a rigorous operationalisation of online triple estimation via variational message passing, offering practical and scalable solutions for inference, parameter adaptation, and uncertainty estimation in continuous-time dynamical systems. The separation of temporal scales and GCM augmentation underpin the method’s robustness to structural mismatch and support principled adaptation in non-linear, uncertain, and dynamic environments. The implications span cognitive modelling, adaptive control, and real-time perception, positioning ODEM as a paradigm for biologically inspired variational inference and scalable predictive coding in both artificial and natural intelligence systems.
