Nested Particle Filters for Online Inference
- Nested Particle Filters (NPFs) are a two-level sequential Monte Carlo method that uses coupled outer parameter particles and inner state particles for online Bayesian inference.
- NPFs recursively update weighted particle sets to approximate both the parameter posterior and state trajectories, enabling efficient handling of high-dimensional and non-Markovian models.
- The method offers rigorous convergence guarantees and lower computational complexity compared to alternatives like SMC², making it attractive for real-time parameter learning and experimental design.
A nested particle filter (NPF) is a two-level, fully sequential Monte Carlo method designed for efficient online Bayesian inference in state-space models with unknown static parameters. By explicitly maintaining two coupled populations of weighted particles—one for the parameters and one for the latent state trajectories—NPFs provide consistent approximations to the sequence of posterior probability measures over both parameters and system states, with rigorous guarantees for convergence and computational complexity. NPFs achieve online, recursive computation and are especially effective in high-dimensional and non-Markovian models where standard particle filters and single-level SMC methods degenerate or become computationally intractable.
1. Model Structure and Mathematical Notation
Consider a discrete-time state-space Markov model indexed by a static parameter $\theta \in \Theta$:
- $x_0 \sim p(x_0 \mid \theta)$ (initial state),
- $x_t \sim p(x_t \mid x_{t-1}, \theta)$, $t \geq 1$ (state transition),
- $y_t \sim p(y_t \mid x_t, \theta)$ (observation).
For fixed $\theta$, define:
- the filter: $\phi_t^{\theta}(dx_t) = p(x_t \mid y_{1:t}, \theta)$,
- the predictive: $\xi_t^{\theta}(dx_t) = p(x_t \mid y_{1:t-1}, \theta)$.
The primary objective is to recursively approximate the parameter posterior $p(\theta \mid y_{1:t})$ and, if required, the joint posterior $p(\theta, x_t \mid y_{1:t})$.
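To make these ingredients concrete, the following minimal Python sketch encodes a toy model of this form; the AR(1) dynamics, the noise scales, and the function names (`transition_sample`, `obs_density`) are illustrative assumptions, not constructs from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: x_t = theta * x_{t-1} + process noise, y_t = x_t + observation noise.
# An NPF needs exactly two model ingredients: a sampler for the transition
# density and pointwise evaluation of the observation density.

def transition_sample(x_prev, theta):
    """Draw x_t ~ p(x_t | x_{t-1}, theta): AR(1) with unknown coefficient theta."""
    return theta * x_prev + 0.5 * rng.normal(size=np.shape(x_prev))

def obs_density(y, x, theta):
    """Evaluate p(y_t | x_t, theta): unit-variance Gaussian observation noise."""
    return np.exp(-0.5 * (y - x) ** 2) / np.sqrt(2.0 * np.pi)
```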
2. Nested Particle Filter Algorithm
An NPF maintains:
- "outer" particles with weights approximating ;
- For each -particle, an "inner" particle filter of size approximating the state filter .
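Initializing the two populations is straightforward; a brief sketch under the hypothetical prior choices of the toy model above:

```python
N, M = 500, 500                      # outer / inner population sizes
theta = rng.uniform(-1.0, 1.0, N)    # theta_0^(i) drawn from the parameter prior
x = rng.normal(size=(N, M))          # x_0^(i,j) drawn from the initial state law
```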
At each time step $t$:
- Parameter jitter (rejuvenation): For $i = 1, \dots, N$, sample $\bar{\theta}_t^{(i)} \sim \kappa_N(\cdot \mid \theta_{t-1}^{(i)})$, a jittering kernel whose variance shrinks with $N$ (e.g., $O(1/N)$).
- Inner filter update: Treat $\{x_{t-1}^{(i,j)}\}_{j=1}^M$ as samples from $p(x_{t-1} \mid y_{1:t-1}, \bar{\theta}_t^{(i)})$ (justified by continuity of the filter in $\theta$). For $j = 1, \dots, M$:
- Propagate: $\bar{x}_t^{(i,j)} \sim p(x_t \mid x_{t-1}^{(i,j)}, \bar{\theta}_t^{(i)})$,
- Weight: $u_t^{(i,j)} \propto p(y_t \mid \bar{x}_t^{(i,j)}, \bar{\theta}_t^{(i)})$,
- Normalize the weights and resample to obtain $\{x_t^{(i,j)}\}_{j=1}^M$.
- Compute the prediction empirical measure: $\xi_t^{(i),M}(dx_t) = \frac{1}{M} \sum_{j=1}^M \delta_{\bar{x}_t^{(i,j)}}(dx_t)$.
- Compute the marginal likelihood estimate: $\hat{p}(y_t \mid y_{1:t-1}, \bar{\theta}_t^{(i)}) = \frac{1}{M} \sum_{j=1}^M p(y_t \mid \bar{x}_t^{(i,j)}, \bar{\theta}_t^{(i)})$.
- Parameter weight update: $\tilde{w}_t^{(i)} \propto \hat{p}(y_t \mid y_{1:t-1}, \bar{\theta}_t^{(i)})$; normalize: $w_t^{(i)} = \tilde{w}_t^{(i)} / \sum_{k=1}^N \tilde{w}_t^{(k)}$.
- Outer resampling: Resample $\{\bar{\theta}_t^{(i)}\}_{i=1}^N$ and the associated inner particle sets according to $\{w_t^{(i)}\}$; reset all weights to $1/N$.
Key formulas:
- Parameter posterior: $p(\theta \mid y_{1:t}) \approx \mu_t^N(d\theta) = \sum_{i=1}^N w_t^{(i)} \, \delta_{\bar{\theta}_t^{(i)}}(d\theta)$.
- State filter approximation: $p(x_t \mid y_{1:t}) \approx \sum_{i=1}^N w_t^{(i)} \, \frac{1}{M} \sum_{j=1}^M \delta_{x_t^{(i,j)}}(dx_t)$.
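The full recursion fits in a few lines of vectorized Python. The sketch below reuses `transition_sample` and `obs_density` from the toy model in Section 1 and makes simple choices (multinomial resampling, Gaussian jitter) for illustration; it is not a tuned implementation:

```python
def npf_step(theta, x, y, rng, jitter_scale):
    """One NPF time step: theta has shape (N,), x has shape (N, M)."""
    N, M = x.shape
    # 1. Jitter (rejuvenation): perturb each parameter particle slightly.
    theta_bar = theta + jitter_scale * rng.normal(size=N)
    # 2. Inner filters: propagate and weight M state particles per parameter.
    x_bar = transition_sample(x, theta_bar[:, None])   # shape (N, M)
    u = obs_density(y, x_bar, theta_bar[:, None])      # shape (N, M)
    # 3. Marginal likelihood estimate per parameter gives the outer weights.
    like = u.mean(axis=1)
    w = like / like.sum()
    # 4. Inner resampling, independently for each parameter particle.
    x_new = np.empty_like(x_bar)
    for i in range(N):
        idx = rng.choice(M, size=M, p=u[i] / u[i].sum())
        x_new[i] = x_bar[i, idx]
    # 5. Outer resampling of (theta, inner filter) pairs; weights reset to 1/N.
    out = rng.choice(N, size=N, p=w)
    return theta_bar[out], x_new[out]
```

A driver calls, e.g., `theta, x = npf_step(theta, x, y_t, rng, jitter_scale=0.1 / np.sqrt(N))` once per incoming observation $y_t$; because the outer weights are reset to $1/N$ after resampling, posterior summaries of $\theta$ are plain averages over the returned particles.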
3. Theoretical Properties and Computational Analysis
Computational cost: Each time step involves $O(M)$ work per outer particle (state propagation, weighting, resampling) and $O(N)$ for parameter normalization/resampling; the total is $O(NM)$ per step, or $O(NMT)$ across $T$ steps. In contrast, the SMC² method requires $O(NMt)$ work at time $t$ (hence $O(NMT^2)$ cumulatively), because its rejuvenation step re-runs the inner filters over the full observation history.
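Summing the per-step costs over a horizon of $T$ observations makes the gap explicit (assuming, as a worst case, one full rejuvenation sweep per SMC² step):

$$\underbrace{\sum_{t=1}^{T} O(NM)}_{\text{NPF}} = O(NMT), \qquad \underbrace{\sum_{t=1}^{T} O(NMt)}_{\text{SMC}^2} = O(NMT^2).$$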
Convergence: Under a compact parameter space $\Theta$, uniform boundedness and positivity of the likelihood function, Lipschitz continuity of the model densities in $\theta$, and suitable jittering-kernel properties, the $L^p$ error for any bounded test function $\varphi$ satisfies

$$\left\| \int \varphi \, d\mu_t^{N} - \int \varphi \, d\mu_t \right\|_p \leq c_t \, \|\varphi\|_\infty \left( \frac{1}{\sqrt{N}} + \frac{1}{\sqrt{M}} \right)$$

for a constant $c_t$ independent of $N$ and $M$. The error in approximating the joint posterior $p(\theta, x_t \mid y_{1:t})$ enjoys the same rate.
For typical trade-offs, setting $M = N$ (balancing the two variance contributions) minimizes the error for a given computational budget, as shown below.
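A one-line check of this choice, assuming the two error terms have comparable constants: fix the per-step budget $C = NM$ and minimize the bound over the split between $N$ and $M$:

$$\min_{NM = C} \left( \frac{c_t}{\sqrt{N}} + \frac{\bar{c}_t}{\sqrt{M}} \right) \;\Longrightarrow\; M = N = \sqrt{C} \quad (\text{for } c_t \approx \bar{c}_t), \qquad \text{error} = O\big(C^{-1/4}\big).$$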
4. Connections and Comparisons to Other Nested SMC Methods
Relationship to SMC²: Both methods use a two-layer particle structure. SMC² performs a particle MCMC move on the parameters, assigning weights via an estimate of the full likelihood $p(y_{1:t} \mid \theta)$, and is not recursive in $t$, leading to $O(NMT^2)$ cumulative cost. The NPF is recursive, using only the latest filtering distributions and fresh rejuvenation, avoiding MCMC moves and resulting in $O(NMT)$ cost. SMC² achieves exact convergence as $N \to \infty$ for any fixed $M$, while the NPF's convergence requires both $N \to \infty$ and $M \to \infty$, but attains an overall $O(N^{-1/2} + M^{-1/2})$ rate.
Inside-Out Variants (for Risk-Sensitive and Experimental Design Applications):
- Inside-Out SMC² (IO-SMC²) forms the nested structure by propagating augmented state trajectories in the outer filter, with an inner IBIS filter tracking the parameter posterior, integrating design selection with posterior tracking. The algorithmic core relies on resampling/tempering and resample-move steps at both levels. The method is well suited to risk-sensitive policy optimization in experimental design with non-exchangeable data, at a computational cost that grows quadratically with the design horizon, since the resample-move steps revisit the full history (Iqbal et al., 2024).
- The Inside-Out Nested Particle Filter (IO-NPF) further improves efficiency by replacing costly inner MCMC moves with rejuvenation steps (random jitter kernels) and fully recursive updates, with empirically favorable scaling for online design in non-Markovian models and cost linear in the horizon for amortized Bayesian experimental design. IO-NPF also admits a backward-sampling smoother to address path degeneracy (Iqbal et al., 2024).
5. Backward Sampling and Smoothing
Degeneracy of genealogy tracking in sequential particle smoothing limits the recovery of joint trajectories as $T$ increases. IO-NPF (and related algorithms) utilize backward-sampling schemes of the “sparse MCMC” type, performing accept-reject passes in reverse time using Rao–Blackwellized transition probabilities:
- At $t = T$: sample a terminal index from the outer weights.
- For $t = T-1$ down to $0$: propose an ancestor index and compute the acceptance ratio from the ratio of forward weights and likelihoods along the trajectory, using Rao–Blackwellized marginals.
- This approach yields a smoother for the full trajectory with the correct invariant law.
The backward sampler corrects for trajectory degeneracy at negligible additional cost per outer iteration and is applicable in risk-sensitive, non-Markovian, and non-exchangeable settings, as demonstrated on challenging nonlinear dynamical examples.
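A generic accept-reject backward pass of this kind can be sketched as follows; this is the standard rejection-based backward-simulation mechanism, with the Rao–Blackwellized marginals of IO-NPF abstracted into a user-supplied `trans_density`, so it should be read as an illustration of the scheme rather than the exact IO-NPF routine:

```python
import numpy as np

def backward_sample(xs, ws, trans_density, rng, max_tries=100):
    """Draw one smoothed trajectory from stored forward particle clouds.

    xs[t]: array of N particle values at time t; ws[t]: normalized weights.
    trans_density(x_next, x_prev): forward transition density, vectorized
    over x_prev (stands in for the Rao-Blackwellized marginals).
    """
    T = len(xs) - 1
    path = [None] * (T + 1)
    k = rng.choice(len(ws[T]), p=ws[T])             # sample the terminal index
    path[T] = xs[T][k]
    for t in range(T - 1, -1, -1):
        dens = trans_density(path[t + 1], xs[t])    # backward reweighting terms
        bound = dens.max()                          # rejection envelope
        j = rng.choice(len(ws[t]), p=ws[t])         # fallback if nothing accepts
        for _ in range(max_tries):
            cand = rng.choice(len(ws[t]), p=ws[t])  # propose from forward weights
            if rng.uniform() * bound <= dens[cand]:
                j = cand
                break
        path[t] = xs[t][j]
    return np.array(path)
```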
6. Empirical Performance and Practical Implementation
In nonlinear and partially observed dynamical systems (e.g., Lorenz-63, the stochastic pendulum), NPFs with $M = N$ exhibit parameter estimation error scaling as $O(N^{-1/2})$ over tens of thousands of steps, with stable long-run online performance. For Bayesian experimental design, IO-NPF matches or outperforms IO-SMC² in EIG-optimal policy estimation while running an order of magnitude faster (vs $5.7$s per amortization iteration for IO-SMC²). Adding the backward-sampling pass further improves efficiency and effective information gain, with nearly optimal performance compared to exact and implicit baselines at a fraction of the computational cost (Iqbal et al., 2024, Iqbal et al., 2024).
Implementation requires:
- A mechanism for sampling the transition density and evaluating the observation density $p(y_t \mid x_t, \theta)$ in closed form at both the inner and outer steps.
- Tuning of the jitter-kernel variance (typically decaying as $1/N$ or $1/M$) to provide sufficient exploration while controlling bias.
- Choice of $M$ to control variance at the trajectory level and of $N$ for parameter posterior accuracy (with sufficiently large $M$ in strongly nonlinear cases).
The algorithm's recursive structure ensures scalability and makes it straightforwardly parallelizable over the outer particle population.
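As a concrete instance of the jitter tuning discussed above, one common choice (an assumption here, not the only admissible kernel) is an isotropic Gaussian whose scale shrinks with the outer population size:

```python
import numpy as np

def jitter(theta, rng, c=0.1):
    """Gaussian jittering kernel: per-particle variance O(1/N), so the bias
    introduced by perturbing the parameters stays below the O(1/sqrt(N))
    Monte Carlo error while still restoring diversity after resampling."""
    N = len(theta)
    return theta + (c / np.sqrt(N)) * rng.normal(size=np.shape(theta))
```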
7. Practical Limitations and Extensions
NPFs require that the system transition and observation densities ($p(x_t \mid x_{t-1}, \theta)$ and $p(y_t \mid x_t, \theta)$) be computable in closed form at each parameter/state value. Their sequential structure and error control make them well suited to online parameter learning, high-frequency experimental design, and inference in large-scale or non-exchangeable dynamical systems.
A major limitation is the necessity of closed-form tractability at both particle levels; models without tractable densities are not directly amenable to NPF schemes. For very high-dimensional latent or parameter spaces, the method inherits the usual limitations of particle filters; variance- and degeneracy-adaptive schemes can ameliorate but not eliminate the curse of dimensionality. Further, in non-Markovian Feynman-Kac models and long design horizons, forward particle genealogies may still collapse, motivating further research into scalable smoothing and adaptive resampling mechanisms.
NPFs, as well as their Inside-Out and risk-sensitive extensions, provide a general framework for recursive, online, and amortized Bayesian inference and design, with strong theoretical and empirical support for their accuracy and efficiency in high-velocity, high-dimensional, and non-exchangeable sequential learning problems (Crisan et al., 2013, Iqbal et al., 2024, Iqbal et al., 2024).