On the best approximation by finite Gaussian mixtures (2404.08913v2)

Published 13 Apr 2024 in math.ST, cs.IT, cs.LG, math.IT, stat.ML, and stat.TH

Abstract: We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of compactly supported mixing distributions, as well as for mixing distributions satisfying appropriate tail assumptions, including subgaussian and subexponential. While the upper bound is achieved using the technique of local moment matching, the lower bound is established by relating the best approximation error to the low-rank approximation of certain trigonometric moment matrices, followed by a refined spectral analysis of their minimum eigenvalue. In the case of Gaussian mixing distributions, this result corrects a previous lower bound in [Allerton Conference 48 (2010) 620-628].
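
As a concrete illustration of the setup, the sketch below (Python, assuming NumPy and SciPy are available) approximates the Gaussian location mixture whose mixing distribution is Uniform[-1, 1] by a k-component Gaussian mixture: the discrete mixing distribution is taken to be the k-point Gauss-Legendre quadrature rule, which matches the first 2k - 1 moments of Uniform[-1, 1], and the approximation error is measured in total variation, one of the f-divergences covered by the results. This is only an illustrative sketch under those assumptions, using a single global quadrature rule rather than the paper's local moment matching construction.

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def target_density(y, a=1.0):
    # Gaussian location mixture with Uniform[-a, a] mixing distribution:
    # f(y) = (1/(2a)) * int_{-a}^{a} phi(y - x) dx = (Phi(y + a) - Phi(y - a)) / (2a).
    return (norm.cdf(y + a) - norm.cdf(y - a)) / (2.0 * a)

def quadrature_mixture(k, a=1.0):
    # k-point Gauss-Legendre rule on [-a, a]; after normalization the discrete
    # mixing distribution matches the first 2k - 1 moments of Uniform[-a, a].
    nodes, weights = np.polynomial.legendre.leggauss(k)
    nodes, weights = a * nodes, weights / weights.sum()
    # Density of the resulting k-component Gaussian location mixture.
    return lambda y: float(np.dot(weights, norm.pdf(y - nodes)))

def tv_distance(f, g, lo=-10.0, hi=10.0):
    # Total variation distance (an f-divergence) by numerical integration.
    val, _ = quad(lambda y: abs(f(y) - g(y)), lo, hi, limit=200)
    return 0.5 * val

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        approx = quadrature_mixture(k)
        print(f"k = {k}: TV error ~ {tv_distance(target_density, approx):.2e}")

Increasing k in the loop shows the total variation error falling off quickly with the number of components, which is the qualitative behavior quantified by the paper's bounds for compactly supported mixing distributions.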

References (67)
  1. A.R. Barron. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 39(3):930–945, 1993.
  2. Small eigenvalues of large Hankel matrices: the indeterminate case. Mathematica Scandinavica, 91:67–81, 1999.
  3. Rate of convergence of the smoothed empirical Wasserstein distance. arXiv:2205.02128, 2022.
  4. Shape of a distribution through the $l_2$-Wasserstein distance. Distributions with Given Marginals and Statistical Modelling, pages 51–61, 2002.
  5. Small eigenvalues of large Hankel matrices. Journal of Physics A, 32:7305–7315, 1999.
  6. Y. Chen and D.S. Lubinsky. Smallest eigenvalues of Hankel matrices for exponential weights. Journal of Mathematical Analysis and Applications, 293(2):476–495, 2004.
  7. Asymptotics of smoothed Wasserstein distances. Potential Analysis, 56(4):571–595, 2022.
  8. Imre Csiszár. $I$-divergence geometry of probability distributions and minimization problems. The Annals of Probability, pages 146–158, 1975.
  9. Elements of information theory. John Wiley & Sons, 2 edition, 2006.
  10. A look at Gaussian mixture reduction algorithms. In 14th International Conference on Information Fusion, pages 1–8, 2011.
  11. NIST Digital Library of Mathematical Functions. http://dlmf.nist.gov/, Release 1.1.8 of 2022-12-15. F. W. J. Olver, eds.
  12. Adaptive mixture model reduction based on the composite transportation dissimilarity. In 2023 26th International Conference on Information Fusion (FUSION), pages 1–8, 2023.
  13. Optimal estimation of high-dimensional Gaussian location mixtures. The Annals of Statistics, 51(1):62 – 95, 2023.
  14. Paulo Jorge SG Ferreira. Neural networks and approximation by superposition of Gaussians. In 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 4, pages 3197–3200. IEEE, 1997.
  15. Convergence of smoothed empirical measures with applications to entropy estimation. IEEE Transactions on Information Theory, 66(7):4368–4391, 2020.
  16. Table of integrals, series, and products. Academic Press, New York, 2007.
  17. Subhashis Ghosal and Aad W. van der Vaart. Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. The Annals of Statistics, 29(5):1233 – 1263, 2001.
  18. Subhashis Ghosal and Aad van der Vaart. Posterior convergence rates of Dirichlet mixtures at smooth densities. The Annals of Statistics, 35(2):697 – 723, 2007.
  19. Subhashis Ghosal and Aad van der Vaart. Posterior convergence rates of Dirichlet mixtures at smooth densities. The Annals of Statistics, pages 697–723, 2007.
  20. Calculation of Gauss quadrature rules. Mathematics of Computation, 23(106):221–230, 1969.
  21. Rates of convergence for the Gaussian mixture sieve. The Annals of Statistics, 28(4):1105 – 1127, 2000.
  22. Wenhua Jiang. On general maximum likelihood empirical Bayes estimation of heteroscedastic IID normal means. Electronic Journal of Statistics, 14:2272–2297, 01 2020.
  23. Entropic characterization of optimal rates for learning Gaussian mixtures. In Proceedings of Thirty Sixth Conference on Learning Theory, volume 195 of Proceedings of Machine Learning Research, pages 4296–4335. PMLR, 12–15 Jul 2023.
  24. General maximum likelihood empirical Bayes estimation of normal means. The Annals of Statistics, 37(4):1647 – 1684, 2009.
  25. Moving beyond sub-Gaussianity in high-dimensional statistics: applications in covariance estimation and linear regression. Information and Inference: A Journal of the IMA, 11:1389–1456, 06 2022.
  26. On the eigen-values of certain Hermitian forms. Journal of Rational Mechanics and Analysis, 2:767–800, 1953.
  27. Convex Functions and Orlicz Spaces. Noordhoff, Groningen, 1961.
  28. I. V. Krasovsky. Some computable Wiener-Hopf determinants and polynomials orthogonal on an arc of the unit circle. arXiv:0310172, 2003.
  29. J. Kiefer and J. Wolfowitz. Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. The Annals of Mathematical Statistics, 27(4):887–906, 12 1956.
  30. B. Laurent and P. Massart. Adaptive estimation of a quadratic functional by model selection. The Annals of Statistics, 28(5):1302 – 1338, 2000.
  31. L. Mirsky. Symmetric gauge functions and unitarily invariant norms. The Quarterly Journal of Mathematics, 11(1):50–59, 01 1960.
  32. On the best approximation by finite Gaussian mixtures. In 2023 IEEE International Symposium on Information Theory (ISIT), pages 2619–2624, 2023.
  33. Approximation of probability density functions via location-scale finite mixtures in Lebesgue spaces. Communications in Statistics - Theory and Methods, pages 1–12, May 2022.
  34. On approximations via convolution-defined mixture models. Communications in Statistics - Theory and Methods, 48(16):3945–3955, 2019.
  35. Approximation by finite mixtures of continuous density functions that vanish at infinity. Cogent Mathematics & Statistics, 7(1):1750861, 2020.
  36. J. Pasupathy and R.A. Damodar. The Gaussian Toeplitz matrix. Linear Algebra and its Applications, 171:133–147, 1992.
  37. J. Pestana. Preconditioners for symmetrized Toeplitz and multilevel Toeplitz matrices. SIAM Journal on Matrix Analysis and Applications, 40(3):870–887, 2019.
  38. Self-regularizing property of nonparametric maximum likelihood estimator in mixture models. arXiv:2008.08244, 2020.
  39. Sharp regret bounds for empirical Bayes and compound decision problems. arXiv:2211.12692, 2021.
  40. Information theory: from coding to statistical learning. Cambridge University Press, 2024. Available at http://www.stat.yale.edu/~yw562/teaching/itbook-export.pdf.
  41. J. Stoer and R. Bulirsch. Introduction to numerical analysis. Springer-Verlag, New York, 3 edition, 2002.
  42. On the nonparametric maximum likelihood estimator for Gaussian location mixture densities with application to Gaussian denoising. The Annals of Statistics, 48(2):738 – 762, 2020.
  43. Multivariate, heteroscedastic empirical Bayes via nonparametric maximum likelihood. arXiv:2109.03466, 2021.
  44. Barry Simon. Orthogonal polynomials on the unit circle. Part 1: classical theory, volume 54. AMS Colloquium Publications, 01 2005.
  45. A new probabilistic distance metric with application in Gaussian mixture reduction. arXiv:2306.07309, 2023.
  46. Spectral representation of some weighted Hankel matrices and orthogonal polynomials from the Askey scheme. Journal of Mathematical Analysis and Applications, 472(1):483–509, 2019.
  47. Convergence rate of sieve estimates. The Annals of Statistics, 22(2):580 – 615, 1994.
  48. Empirical Bayes estimation: when does $g$-modeling beat $f$-modeling in theory (and in practice)? arXiv:2211.12692, 2023.
  49. G. Szegö. On some Hermitian forms associated with two given curves of the complex plane. Transactions of the American Mathematical Society, 40:450–461, 1936.
  50. G. Szegö. Orthogonal polynomials. American Mathematical Society, Providence, RI, 4 edition, 1975.
  51. Spectra of multilevel Toeplitz matrices: advanced theory via simple matrix relationships. Linear Algebra and its Applications, 270(1):15–27, 1998.
  52. James Victor Uspensky. Introduction to mathematical probability. McGraw-Hill, 1937.
  53. Aad van der Vaart and Jon A. Wellner. Weak convergence and empirical processes: with applications to statistics. Springer Series in Statistics. Springer-Verlag, New York, 1996.
  54. Roman Vershynin. High-dimensional probability: an introduction with applications in data science. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2018.
  55. Sub-Weibull distributions: generalizing sub-Gaussian and sub-exponential properties to heavier tailed distributions. Stat, 9(1):e318, 2020.
  56. G. N. Watson. The final problem: an account of the mock theta functions. Journal of the London Mathematical Society, s1-11(1):55–80, 1936.
  57. Norbert Wiener. Tauberian theorems. Annals of Mathematics, 33(1):1–100, 1932.
  58. Probability inequalities for likelihood ratios and convergence rates of sieve MLEs. The Annals of Statistics, 23(2):339 – 362, 1995.
  59. Functional properties of MMSE. 2010 IEEE International Symposium on Information Theory, pages 1453–1457, 2010.
  60. The impact of constellation cardinality on Gaussian channel capacity. In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 620–628, 2010.
  61. Small eigenvalues of large Hankel matrices. Proceedings of the American Mathematical Society, 17(2):338–344, 1966.
  62. Optimal estimation of Gaussian mixtures via denoised method of moments. The Annals of Statistics, 48(4):1981 – 2007, 2020.
  63. Information-theoretic determination of minimax rates of convergence. The Annals of Statistics, 27(5):1564 – 1599, 1999.
  64. Cun-Hui Zhang. Generalized maximum likelihood estimation of normal mixture densities. Statistica Sinica, 19(3):1297–1318, 2009.
  65. Alexei Zhedanov. On some classes of polynomials orthogonal on arcs of the unit circle connected with symmetric orthogonal polynomials on an interval. Journal of Approximation Theory, 94(1):73–106, 1998.
  66. Sharper sub-Weibull concentrations. Mathematics, 10(13), 2022.
  67. Gaussian mixture reduction with composite transportation divergence. IEEE Transactions on Information Theory, 2023.