LatentTrack (LT): Online Predictive Filtering
- LatentTrack is a sequential neural architecture designed for online probabilistic prediction under nonstationary dynamics using low-dimensional latent representations.
- It implements a three-phase predict–generate–update filtering pipeline with lightweight hypernetworks and amortized inference for constant-time adaptation.
- Empirical evaluations on the Jena Climate benchmark demonstrate state-of-the-art accuracy and calibrated uncertainty compared to traditional Bayesian and latent-variable models.
LatentTrack (LT) is a sequential neural architecture designed for online probabilistic prediction under nonstationary dynamics, implementing causal Bayesian filtering in a low-dimensional latent space. At each time step, a lightweight hypernetwork generates predictor weights conditioned on the current latent, enabling constant-time adaptation of model parameters without per-step gradient-based training. LT’s formulation generalizes to both structured (Markovian) and unstructured latent transition models, and employs amortized inference to update beliefs with each new observation, facilitating a predict–generate–update filtering pipeline in function space. Evaluated on challenging long-horizon regression (e.g., Jena Climate), LT demonstrates state-of-the-art predictive accuracy and uncertainty calibration against both static Bayesian and sequential latent-variable baselines, particularly under evolving and distribution-shifting data regimes (Haq, 31 Jan 2026).
1. Predict–Generate–Update Filtering in Function Space
LT casts the evolution of the effective predictor as Bayesian filtering over a latent state $z_t$ with an associated summary statistic $h_t$ that aggregates past inputs. This leads to a three-phase filtering pipeline at each timestep $t$:
- Predict (Prior Propagation): Given the historical data $y_{1:t-1}$, the prior over the next latent $z_t$ is produced as a Gaussian, parameterized either marginally (unstructured) by the summary $h_{t-1}$, or in a structured (Markovian) manner conditioned additionally on $z_{t-1}$.
- Generate: Monte Carlo samples of $z_t$ are mapped by a learned hypernetwork $g_\phi$ to full sets of predictor weights $\theta_t = g_\phi(z_t)$, yielding a mixture predictive distribution for $y_t$.
- Update: Upon receipt of $y_t$, an amortized inference network provides the variational posterior distribution $q(z_t \mid h_t)$, updating the latent belief in constant amortized time (Haq, 31 Jan 2026).
The process maintains fixed per-step computational cost, avoids per-timestep inner-loop learning, and provides calibrated predictive distributions through sampled mixtures of predictors.
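The predict–generate–update loop can be sketched in a few lines; everything below (the dimensions, the random linear maps standing in for the prior/posterior heads, the tanh summary update, and the toy weight-averaging predictor) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

D_Z, D_H, D_THETA = 4, 8, 16  # illustrative sizes, not from the paper

# Stand-in "networks": random linear maps playing the roles of the
# prior head, posterior head, summary (RNN) update, and hypernetwork.
W_prior = rng.normal(scale=0.1, size=(2 * D_Z, D_H))
W_post  = rng.normal(scale=0.1, size=(2 * D_Z, D_H))
W_rnn   = rng.normal(scale=0.1, size=(D_H, D_H + 1))
W_hyper = rng.normal(scale=0.1, size=(D_THETA, D_Z))

def gaussian_head(W, inp):
    """Map an input vector to (mean, positive std) of a diagonal Gaussian."""
    out = W @ inp
    mu, log_sigma = out[:len(out) // 2], out[len(out) // 2:]
    return mu, np.exp(log_sigma)

h = np.zeros(D_H)
for y in [0.3, -0.1, 0.5]:             # a toy observation stream
    # Predict: prior over the next latent, from the summary h
    mu_p, sig_p = gaussian_head(W_prior, h)
    # Generate: sample latents, map each to a full weight vector
    z_samples = mu_p + sig_p * rng.standard_normal((5, D_Z))
    thetas = z_samples @ W_hyper.T      # one weight vector per sample
    preds = thetas.mean(axis=1)         # toy predictor: mean of its weights
    # Update: fold y into the summary, then amortized posterior
    h = np.tanh(W_rnn @ np.concatenate([h, [y]]))
    mu_q, sig_q = gaussian_head(W_post, h)

print(preds.shape, mu_q.shape)
```

Note that the only per-step work is a handful of matrix products: no gradient step ever touches the generated weights.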
2. Latent-Dynamics Model: Structure and Parameterization
LT’s latent-dynamics model can be specialized as follows:
- Unstructured Dynamics: The prior is specified marginally,
$$p(z_t \mid h_{t-1}) = \mathcal{N}\!\left(z_t;\ \mu_p(h_{t-1}),\ \operatorname{diag}\sigma_p^2(h_{t-1})\right),$$
where $\mu_p$ and $\sigma_p$ are neural projections from the recurrent summary $h_{t-1}$ of past data.
- Structured (Markov) Dynamics: A Markovian assumption is encoded by
$$p(z_t \mid z_{t-1}, h_{t-1}) = \mathcal{N}\!\left(z_t;\ \mu_p(z_{t-1}, h_{t-1}),\ \operatorname{diag}\sigma_p^2(z_{t-1}, h_{t-1})\right),$$
which supports richer temporal correlation and memory effects by explicitly modeling transitions between consecutive latents.
The variational posterior at each step is amortized as
$$q(z_t \mid h_t) = \mathcal{N}\!\left(z_t;\ \mu_q(h_t),\ \operatorname{diag}\sigma_q^2(h_t)\right),$$
where $\mu_q$, $\sigma_q$ are outputs of a neural head conditioned on the updated summary $h_t$. For regression, the observation model is
$$p(y_t \mid x_t, z_t) = \mathcal{N}\!\left(y_t;\ f_{\theta_t}(x_t),\ \sigma_y^2\right),$$
where $\theta_t = g_\phi(z_t)$.
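The two prior parameterizations can be contrasted in a short sketch; the linear "heads" and sizes below are stand-ins chosen for illustration, not the paper's networks:

```python
import numpy as np

rng = np.random.default_rng(1)
D_Z, D_H = 4, 8  # illustrative sizes, not taken from the paper

# Stand-in linear heads (in practice these would be small MLPs).
W_unstr = rng.normal(scale=0.1, size=(2 * D_Z, D_H))
W_str   = rng.normal(scale=0.1, size=(2 * D_Z, D_H + D_Z))

def unstructured_prior(h):
    """p(z_t | h_{t-1}): conditioned on the recurrent summary alone."""
    out = W_unstr @ h
    return out[:D_Z], np.exp(out[D_Z:])          # mean, diagonal std

def structured_prior(z_prev, h):
    """p(z_t | z_{t-1}, h_{t-1}): Markovian, also sees the previous latent."""
    out = W_str @ np.concatenate([h, z_prev])
    return out[:D_Z], np.exp(out[D_Z:])

h, z_prev = rng.standard_normal(D_H), rng.standard_normal(D_Z)
mu_u, sig_u = unstructured_prior(h)
mu_s, sig_s = structured_prior(z_prev, h)
print(mu_u.shape, mu_s.shape)
```

The only difference is whether $z_{t-1}$ enters the conditioning set, which is exactly what lets the structured variant carry explicit latent-to-latent memory.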
3. Hypernetwork-Driven Sequential Weight Generation
The hypernetwork $g_\phi$ parameterizes the mapping from latent state to predictor weights:
- Input: latent $z_t \in \mathbb{R}^{d_z}$, with $d_z$ low-dimensional.
- Output: the full parameter vector $\theta_t$ of the base regressor $f_{\theta_t}$, whose dimensionality far exceeds $d_z$.
- Architecture: An example form is a two-layer MLP,
$$g_\phi(z) = W_2\,\rho(W_1 z + b_1) + b_2,$$
with $W_1$, $W_2$ linear maps, $b_1$, $b_2$ biases, and $\rho$ a nonlinearity (e.g., ReLU).
In contrast with gradient-based adaptation, all weights in the base predictor evolve via changes in the latent $z_t$, and hence in $\theta_t = g_\phi(z_t)$, not via direct SGD steps on $\theta_t$.
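A minimal sketch of such a hypernetwork follows; the layer widths, the 200-parameter weight budget, and the way the toy base regressor unpacks $\theta$ are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
D_Z, D_HID, D_THETA = 4, 32, 200  # illustrative sizes

# Two-layer MLP hypernetwork g_phi: latent z -> base-predictor weights theta.
W1, b1 = rng.normal(scale=0.1, size=(D_HID, D_Z)), np.zeros(D_HID)
W2, b2 = rng.normal(scale=0.05, size=(D_THETA, D_HID)), np.zeros(D_THETA)

def hypernet(z):
    """g_phi(z) = W2 * relu(W1 z + b1) + b2."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2

def base_predictor(theta, x):
    # Toy base regressor: unpack theta into a 1-hidden-layer net on scalar x.
    w_in, w_out = theta[:100].reshape(100, 1), theta[100:200]
    return w_out @ np.tanh(w_in @ np.atleast_1d(x))

theta = hypernet(rng.standard_normal(D_Z))
y_hat = base_predictor(theta, 0.7)
print(theta.shape, float(y_hat))
```

Moving $z$ moves every weight of the base predictor at once, which is why adaptation reduces to updating a small latent rather than optimizing $\theta$ directly.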
4. Amortized Inference and Variational Training Objective
Learning proceeds by maximizing a filtering ELBO at each time point:
- Unstructured variant:
$$\mathcal{L}_t = \mathbb{E}_{q(z_t \mid h_t)}\!\left[\log p(y_t \mid x_t, g_\phi(z_t))\right] - \mathrm{KL}\!\left(q(z_t \mid h_t)\,\|\,p(z_t \mid h_{t-1})\right)$$
- Structured variant:
$$\mathcal{L}_t = \mathbb{E}_{q(z_t \mid z_{t-1}, h_t)}\!\left[\log p(y_t \mid x_t, g_\phi(z_t))\right] - \mathrm{KL}\!\left(q(z_t \mid z_{t-1}, h_t)\,\|\,p(z_t \mid z_{t-1}, h_{t-1})\right)$$
Summing across the sequence yields the objective $\mathcal{L} = \sum_{t=1}^{T} \mathcal{L}_t$, which lower-bounds the log marginal likelihood of the data. A KL annealing weight $\beta$ may be applied to the KL term during training.
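Because both the prior and the posterior are diagonal Gaussians, the KL term has a closed form and the expectation can be estimated with reparameterized samples. The sketch below (function names, sizes, and the dummy `predict_fn` are assumptions for illustration) computes one step of such an ELBO:

```python
import numpy as np

rng = np.random.default_rng(3)

def diag_gauss_kl(mu_q, sig_q, mu_p, sig_p):
    """Closed-form KL( N(mu_q, diag sig_q^2) || N(mu_p, diag sig_p^2) )."""
    return np.sum(np.log(sig_p / sig_q)
                  + (sig_q**2 + (mu_q - mu_p)**2) / (2 * sig_p**2) - 0.5)

def gauss_logpdf(y, mean, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y - mean)**2 / (2 * sigma**2)

def one_step_elbo(y, mu_q, sig_q, mu_p, sig_p, predict_fn, sigma_y,
                  beta=1.0, S=8):
    """Monte Carlo estimate of E_q[log p(y|z)] - beta * KL(q || p)."""
    # Reparameterized samples z = mu + sigma * eps keep the estimator
    # differentiable w.r.t. the posterior parameters.
    z = mu_q + sig_q * rng.standard_normal((S, len(mu_q)))
    recon = np.mean([gauss_logpdf(y, predict_fn(zs), sigma_y) for zs in z])
    return recon - beta * diag_gauss_kl(mu_q, sig_q, mu_p, sig_p)

mu = np.zeros(4)
elbo = one_step_elbo(0.2, mu, np.ones(4), mu, np.ones(4),
                     predict_fn=lambda z: z.sum(), sigma_y=1.0)
print(float(elbo))
```

With identical prior and posterior the KL term vanishes, leaving only the reconstruction term, which is a convenient sanity check when implementing the annealed objective.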
5. Monte Carlo Filtering and Uncertainty Calibration
At inference time, LT forms predictive mixtures via Monte Carlo filtering:
- Draw $S$ samples $z_t^{(s)} \sim q(z_t \mid h_t)$, $s = 1, \dots, S$.
- For each, compute $\theta_t^{(s)} = g_\phi(z_t^{(s)})$, yielding $S$ mixture components.
- Predictive distribution:
$$p(y_t \mid x_t) \approx \frac{1}{S} \sum_{s=1}^{S} \mathcal{N}\!\left(y_t;\ f_{\theta_t^{(s)}}(x_t),\ \sigma_y^2\right)$$
- Mixture mean and variance separate aleatoric and epistemic components:
$$\hat{y}_t = \frac{1}{S}\sum_{s=1}^{S} f_{\theta_t^{(s)}}(x_t), \qquad \operatorname{Var}[y_t \mid x_t] \approx \underbrace{\sigma_y^2}_{\text{aleatoric}} + \underbrace{\frac{1}{S}\sum_{s=1}^{S}\left(f_{\theta_t^{(s)}}(x_t) - \hat{y}_t\right)^2}_{\text{epistemic}}$$
This explicit mixture provides calibrated uncertainty without overconfidence, as evidenced by near-uniform PIT histograms and tight calibration curves in empirical study (Haq, 31 Jan 2026).
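The decomposition of the mixture's variance (law of total variance for an equal-weight Gaussian mixture with shared observation noise) is a few lines of code; the five member forecasts and $\sigma_y = 0.5$ below are made-up example numbers:

```python
import numpy as np

def mixture_moments(member_means, sigma_y):
    """Mean and variance split of an equal-weight Gaussian mixture
    with a shared per-component std sigma_y.

    Returns (mean, aleatoric variance, epistemic variance); the full
    predictive variance is their sum (law of total variance).
    """
    member_means = np.asarray(member_means, dtype=float)
    mean = member_means.mean()
    aleatoric = sigma_y**2          # shared observation noise
    epistemic = member_means.var()  # spread across sampled predictors
    return mean, aleatoric, epistemic

# Five sampled predictors' point forecasts for one test input.
mean, alea, epi = mixture_moments([1.0, 1.2, 0.9, 1.1, 0.8], sigma_y=0.5)
print(mean, alea, epi)
```

Here the epistemic term shrinks as the sampled predictors agree, while the aleatoric floor $\sigma_y^2$ persists, which is exactly the behavior a calibrated forecaster should exhibit.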
6. Computational Complexity and Comparison to Baselines
LT’s adaptation is strictly constant time per step, with per-timestep cost decomposing as: one RNN update, two small network heads (prior and posterior), $S$ hypernetwork forward passes, and $S$ base-model evaluations. No inner-loop optimization or gradient steps on the prediction model are needed at test time. In contrast, meta-learning and gradient-based adaptation approaches require multiple backpropagation passes per step. This makes LT highly efficient for streaming-data scenarios (Haq, 31 Jan 2026).
7. Empirical Evaluation: Jena Climate Benchmark
LT was evaluated on the Jena Climate dataset for long-horizon temperature prediction (36 h ahead, strict causal setting). Training uses the first 70% of each series (window length 256, truncated backpropagation through time); evaluation covers the final 30%, over 25 random seeds.
- Baselines: capacity/computation-matched VRNN and DSSM (stateful latent-variable RNNs); MC-Dropout, Bayes-by-Backprop, and Deep Ensembles (static Bayesian approaches).
- Metrics: Negative log-likelihood (NLL), mean squared error (MSE), per-step ranking, and catastrophic failure rate.
- Key Results:
- LT-Structured achieves median NLL ≈ 2.32 (VRNN: 3.38, DSSM: 2.93); median MSE ≈ 1.93 (VRNN: 45.6, DSSM: 14.8).
- Rank-1 in NLL for 58.8% of steps and in MSE for 51.4% (each baseline ≤ 20%).
- Catastrophic failure rate (max NLL exceeding a fixed threshold): 12% for LT-Structured, 4% for LT-Unstructured, >38% for VRNN and DSSM.
- Calibration: PIT histogram flatter, calibration curve tighter for LT, supporting high-quality probabilistic prediction.
These results demonstrate that latent-conditioned function evolution offers a robust alternative to conventional state-space sequence models in online, distribution-shifting settings (Haq, 31 Jan 2026).