Recursive Learning of Asymptotic Variational Objectives (2411.02217v1)

Published 4 Nov 2024 in stat.ML, cs.LG, and stat.CO

Abstract: General state-space models (SSMs) are widely used in statistical machine learning and are among the most classical generative models for sequential time-series data. SSMs, comprising latent Markovian states, can be subjected to variational inference (VI), but standard VI methods like the importance-weighted autoencoder (IWAE) lack functionality for streaming data. To enable online VI in SSMs when the observations are received in real time, we propose maximising an IWAE-type variational lower bound on the asymptotic contrast function, rather than the standard IWAE ELBO, using stochastic approximation. Unlike the recursive maximum likelihood method, which directly maximises the asymptotic contrast, our approach, called online sequential IWAE (OSIWAE), allows for online learning of both model parameters and a Markovian recognition model for inferring latent states. By approximating filter state posteriors and their derivatives using sequential Monte Carlo (SMC) methods, we create a particle-based framework for online VI in SSMs. This approach is more theoretically well-founded than recently proposed online variational SMC methods. We provide rigorous theoretical results on the learning objective and a numerical study demonstrating the method's efficiency in learning model parameters and particle proposal kernels.
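
To make the kind of objective discussed above concrete, here is a minimal, hypothetical sketch of a bootstrap particle filter for a toy linear-Gaussian SSM. It is not the paper's OSIWAE algorithm; the model, proposal, parameter values, and the function name are illustrative assumptions. The filter's running estimate of log p(y_{1:T}) is unbiased on the likelihood scale, so by Jensen's inequality its expectation is a lower bound on the true log-likelihood, which is the sort of SMC-based quantity that IWAE-type and filtering variational objectives build on.

```python
import numpy as np

def bootstrap_filter_log_bound(ys, n_particles=200, phi=0.9,
                               sigma_x=1.0, sigma_y=1.0, seed=0):
    """Bootstrap particle filter for the toy SSM
        x_t = phi * x_{t-1} + sigma_x * eps_t,   y_t = x_t + sigma_y * nu_t,
    returning the SMC estimate of log p(y_{1:T}).  In expectation this
    estimator lower-bounds the true log-likelihood (Jensen's inequality)."""
    rng = np.random.default_rng(seed)
    log_z = 0.0
    # initialise particles from the stationary distribution of the AR(1) state
    x = rng.normal(0.0, sigma_x / np.sqrt(1.0 - phi**2), size=n_particles)
    for y in ys:
        # bootstrap proposal: propagate particles through the transition kernel
        x = phi * x + sigma_x * rng.normal(size=n_particles)
        # weight particles by the Gaussian observation density
        log_w = -0.5 * ((y - x) / sigma_y) ** 2 - 0.5 * np.log(2.0 * np.pi * sigma_y**2)
        # accumulate the log-normalising-constant increment (log-mean-exp of weights)
        log_z += np.logaddexp.reduce(log_w) - np.log(n_particles)
        # multinomial resampling
        w = np.exp(log_w - log_w.max())
        x = x[rng.choice(n_particles, size=n_particles, p=w / w.sum())]
    return log_z

if __name__ == "__main__":
    # simulate data from the same toy model and evaluate the bound
    rng = np.random.default_rng(1)
    T, phi, sx, sy = 100, 0.9, 1.0, 1.0
    x = rng.normal(0.0, sx / np.sqrt(1 - phi**2))
    ys = []
    for _ in range(T):
        x = phi * x + sx * rng.normal()
        ys.append(x + sy * rng.normal())
    print("SMC log-likelihood lower-bound estimate:", bootstrap_filter_log_bound(ys))
```

The online methods the abstract refers to update model and proposal parameters by stochastic approximation as each observation arrives, rather than recomputing such a bound over the full sequence; the sketch above only illustrates the underlying particle estimator.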
