Adaptive joint distribution learning (2110.04829v5)
Abstract: We develop a new framework for estimating joint probability distributions using tensor product reproducing kernel Hilbert spaces (RKHS). Our framework accommodates a low-dimensional, normalized and positive model of a Radon--Nikodym derivative, which we estimate from sample sizes of up to several million, alleviating the inherent limitations of RKHS modeling. Well-defined normalized and positive conditional distributions are natural by-products of our approach. Our proposal is fast to compute and applies to learning problems ranging from prediction to classification. Our theoretical findings are supplemented by favorable numerical results.
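The sketch below is a minimal illustration, not the paper's estimator: it builds a tensor-product Gaussian kernel on paired samples and a Nyström-style low-rank feature map, the kind of scalable structure on which a positive, normalized RKHS model of a Radon--Nikodym derivative could then be fit. The Gaussian kernel, bandwidths, landmark count, and toy data are assumptions made for illustration.

```python
import numpy as np

def gauss_gram(A, B, bw):
    """Gaussian kernel Gram matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bw**2))

rng = np.random.default_rng(0)
n, m = 5000, 200                                   # sample size, number of landmarks
X = rng.normal(size=(n, 2))                        # toy covariates
Y = X[:, :1] + 0.5 * rng.normal(size=(n, 1))       # toy dependent response

# Tensor-product kernel on the joint space: k((x,y),(x',y')) = k_X(x,x') * k_Y(y,y'),
# realized as an entrywise product of the two Gram matrices.
idx = rng.choice(n, size=m, replace=False)         # landmark subsample
K_nm = gauss_gram(X, X[idx], 1.0) * gauss_gram(Y, Y[idx], 1.0)
K_mm = gauss_gram(X[idx], X[idx], 1.0) * gauss_gram(Y[idx], Y[idx], 1.0)

# Nystrom-style low-rank feature map Phi with K ~ Phi @ Phi.T, so that any
# downstream fit (e.g. a positive, normalized expansion modeling a
# Radon-Nikodym derivative) scales as O(n m^2) rather than O(n^2) or worse.
evals, U = np.linalg.eigh(K_mm + 1e-8 * np.eye(m)) # jitter for numerical stability
K_mm_inv_sqrt = (U / np.sqrt(np.maximum(evals, 1e-12))) @ U.T
Phi = K_nm @ K_mm_inv_sqrt                         # n x m feature matrix

print(Phi.shape)                                   # (5000, 200)
```

With low-rank features of this kind, positivity and normalization constraints on the fitted expansion can be imposed directly on its finite coefficient vector, which is one common way to keep RKHS models tractable at sample sizes in the millions.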