Measure transport with kernel mean embeddings (2401.12967v2)

Published 23 Jan 2024 in math.ST, cs.NA, math.NA, stat.ME, and stat.TH

Abstract: Kalman filters constitute a scalable and robust methodology for approximate Bayesian inference, matching first and second order moments of the target posterior. To improve the accuracy in nonlinear and non-Gaussian settings, we extend this principle to include more or different characteristics, based on kernel mean embeddings (KMEs) of probability measures into reproducing kernel Hilbert spaces. Focusing on the continuous-time setting, we develop a family of interacting particle systems (termed $\textit{KME-dynamics}$) that bridge between prior and posterior, and that include the Kalman-Bucy filter as a special case. KME-dynamics does not require the score of the target, but rather estimates the score implicitly and intrinsically, and we develop links to score-based generative modeling and importance reweighting. A variant of KME-dynamics has recently been derived from an optimal transport and Fisher-Rao gradient flow perspective by Maurais and Marzouk, and we expose further connections to (kernelised) diffusion maps, leading to a variational formulation of regression type. Finally, we conduct numerical experiments on toy examples and the Lorenz 63 and 96 models, comparing our results against the ensemble Kalman filter and the mapping particle filter (Pulido and van Leeuwen, 2019, J. Comput. Phys.). Our experiments show particular promise for a hybrid modification (called Kalman-adjusted KME-dynamics).
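
For readers less familiar with the central object, the kernel mean embedding of a probability measure $\pi$ is $\mu_\pi = \int k(\cdot, x)\,\pi(\mathrm{d}x)$, which for a particle ensemble reduces to an average of kernel sections; closeness of embeddings is measured by the RKHS norm, i.e. the maximum mean discrepancy (MMD). The sketch below illustrates only this standard machinery, not the paper's KME-dynamics itself; the Gaussian kernel, the bandwidth, and all function names are illustrative assumptions rather than choices taken from the paper.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||X_i - Y_j||^2 / (2 h^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(X, Y, bandwidth=1.0):
    """Squared RKHS distance between the empirical KMEs of two ensembles.

    The KME of (1/n) sum_i delta_{X_i} is mu_X = (1/n) sum_i k(X_i, .),
    and ||mu_X - mu_Y||^2 expands into the three kernel averages below.
    """
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    return Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()

# Toy check: an ensemble drawn from a reference law versus a shifted ensemble.
rng = np.random.default_rng(0)
prior = rng.normal(size=(200, 2))
shifted = rng.normal(loc=0.5, size=(200, 2))
print(mmd_squared(prior, shifted))  # positive; near zero for equal laws
```

In the abstract's terms, KME-dynamics evolves such a particle ensemble so that its embedding is transported from that of the prior to that of the posterior; the MMD computed here is the RKHS notion of distance underlying that bridge.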

References (58)
  1. A. Bain and D. Crisan. Fundamentals of stochastic filtering, volume 3. Springer, 2009.
  2. K. Bergemann and S. Reich. A localization technique for ensemble Kalman filters. Quarterly Journal of the Royal Meteorological Society, 136(648):701–707, 2010.
  3. K. Bergemann and S. Reich. A mollified ensemble Kalman filter. Quarterly Journal of the Royal Meteorological Society, 136(651):1636–1643, 2010.
  4. Hilbert space embeddings of predictive state representations. In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, UAI 2013, Bellevue, WA, USA, August 11-15, 2013. AUAI Press, 2013.
  5. Ensemble Kalman methods: A mean field perspective. arXiv preprint arXiv:2209.11371, 2022.
  6. Gradient flows for sampling: Mean-field models, Gaussian approximations and affine invariance. arXiv preprint arXiv:2302.11024, 2023.
  7. Rough McKean–Vlasov dynamics for robust ensemble Kalman filtering. The Annals of Applied Probability, 33(6B):5693–5752, 2023.
  8. R. R. Coifman and S. Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006.
  9. On the geometry of Stein variational gradient descent. Journal of Machine Learning Research, 24:1–39, 2023.
  10. Data assimilation fundamentals: A unified formulation of the state and parameter estimation problem. Springer Nature, 2022.
  11. A. Figalli and F. Glaudo. An invitation to optimal transport, Wasserstein distances, and gradient flows. EMS Press, 2021.
  12. Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. Journal of Machine Learning Research, 5(Jan):73–99, 2004.
  13. Kernel measures of conditional dependence. Advances in Neural Information Processing Systems, 20, 2007.
  14. Kernel Bayes’ rule: Bayesian inference with positive definite kernels. The Journal of Machine Learning Research, 14(1):3753–3783, 2013.
  15. Affine invariant interacting Langevin dynamics for Bayesian inference. SIAM Journal on Applied Dynamical Systems, 19(3):1633–1658, 2020.
  16. The kernel Kalman rule—efficient nonparametric inference with recursive least squares. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31, 2017.
  17. The kernel Kalman rule: Efficient nonparametric inference by recursive least-squares and subspace projections. Machine Learning, 108(12):2113–2157, 2019.
  18. J. Goodman and J. Weare. Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science, 5(1):65–80, 2010.
  19. Conditional mean embeddings as regressors. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012, 2012.
  20. Gibbs flow for approximate transport with applications to Bayesian computation. Journal of the Royal Statistical Society Series B: Statistical Methodology, 83(1):156–187, 2021.
  21. Gaussian processes and kernel methods: A review on connections and equivalences. arXiv preprint arXiv:1807.02582, 2018.
  22. A. Kirsch. An introduction to the mathematical theory of inverse problems, volume 120. Springer, 2011.
  23. A rigorous theory of conditional mean embeddings. SIAM Journal on Mathematics of Data Science, 2(3):583–606, 2020.
  24. M. Kuang and E. G. Tabak. Sample-based optimal transport and barycenter problems. Communications on Pure and Applied Mathematics, 72(8):1581–1630, 2019.
  25. Data assimilation: A mathematical introduction. Springer, 2015.
  26. Q. Liu and D. Wang. Stein variational gradient descent: A general purpose Bayesian inference algorithm. Advances in Neural Information Processing Systems, 29, 2016.
  27. S. Livingstone and M. Girolami. Information-geometric Markov chain Monte Carlo methods using diffusions. Entropy, 16(6):3074–3102, 2014.
  28. A. Maurais and Y. Marzouk. Adaptive algorithms for continuous-time transport: Homotopy-driven sampling and a new interacting particle system. In NeurIPS 2023 Workshop Optimal Transport and Machine Learning, 2023.
  29. A. Maurais and Y. Marzouk. Sampling in unit time with kernel Fisher-Rao flow. arXiv preprint arXiv:2401.03892, 2024.
  30. An Isserlis' theorem for mixed Gaussian variables: Application to the auto-bispectral density. Journal of Statistical Physics, 136:89–102, 2009.
  31. Kernel mean embedding of distributions: A review and beyond. Foundations and Trends® in Machine Learning, 10(1-2):1–141, 2017.
  32. Hilbert space embeddings of POMDPs. In N. de Freitas and K. P. Murphy, editors, Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, August 14-18, 2012, pages 644–653. AUAI Press, 2012.
  33. N. Nüsken and D. M. Renger. Stein variational gradient descent: Many-particle and long-time asymptotics. Foundations of Data Science, 5(3):286–320, 2023.
  34. McKean–Vlasov SDEs in nonlinear filtering. SIAM Journal on Control and Optimization, 59(6):4188–4215, 2021.
  35. S. Pathiraja and W. Stannat. Analysis of the feedback particle filter with diffusion map based approximation of the gain. arXiv preprint arXiv:2109.02761, 2021.
  36. G. A. Pavliotis. Stochastic processes and applications. Springer, 2016.
  37. L. Pillaud-Vivien and F. Bach. Kernelized diffusion maps. arXiv preprint arXiv:2302.06757, 2023.
  38. M. Pulido and P. J. van Leeuwen. Sequential Monte Carlo with kernel embedded mappings: The mapping particle filter. Journal of Computational Physics, 396:400–415, 2019.
  39. S. Reich. A dynamical systems framework for intermittent data assimilation. BIT Numerical Mathematics, 51:235–249, 2011.
  40. S. Reich and C. Cotter. Probabilistic forecasting and Bayesian data assimilation. Cambridge University Press, 2015.
  41. S. Reich and C. J. Cotter. Ensemble filter techniques for intermittent data assimilation. Large Scale Inverse Problems: Computational Methods and Applications in the Earth Sciences, 13:91–134, 2013.
  42. A Hilbert space embedding for distributions. In International Conference on Algorithmic Learning Theory, pages 13–31. Springer, 2007.
  43. A. J. Smola and B. Schölkopf. Learning with kernels, volume 4. Citeseer, 1998.
  44. Hilbert space embeddings of hidden Markov models. In J. Fürnkranz and T. Joachims, editors, Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel, pages 991–998. Omnipress, 2010.
  45. Kernel embeddings of conditional distributions: A unified kernel framework for nonparametric inference in graphical models. IEEE Signal Processing Magazine, 30(4):98–111, 2013.
  46. Hilbert space embeddings of conditional distributions with applications to dynamical systems. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 961–968, 2009.
  47. Universality, characteristic kernels and RKHS embedding of measures. Journal of Machine Learning Research, 12(7), 2011.
  48. Hilbert space embeddings and metrics on probability measures. The Journal of Machine Learning Research, 11:1517–1561, 2010.
  49. I. Steinwart and A. Christmann. Support vector machines. Springer Science & Business Media, 2008.
  50. p-kernel Stein variational gradient descent for data assimilation and history matching. Mathematical Geosciences, 53(3):375–393, 2021.
  51. Adaptive kernel Kalman filter. IEEE Transactions on Signal Processing, 71:713–726, 2023.
  52. S. Syed. Non-reversible parallel tempering on optimized paths. PhD thesis, University of British Columbia, 2022.
  53. Parallel tempering on optimized paths. In International Conference on Machine Learning, pages 10033–10042. PMLR, 2021.
  54. The Kalman filter and its modern extensions for the continuous-time nonlinear filtering problem. Journal of Dynamic Systems, Measurement, and Control, 140(3):030904, 2018.
  55. Diffusion map-based algorithm for gain function approximation in the feedback particle filter. SIAM/ASA Journal on Uncertainty Quantification, 8(3):1090–1117, 2020.
  56. M. J. Wainwright. High-dimensional statistics: A non-asymptotic viewpoint, volume 48. Cambridge University Press, 2019.
  57. Importance weighting approach in kernel Bayes’ rule. arXiv preprint arXiv:2202.02474, 2022.
  58. Feedback particle filter. IEEE Transactions on Automatic Control, 58(10):2465–2480, 2013.