
Weighted least-squares approximation with determinantal point processes and generalized volume sampling (2312.14057v3)

Published 21 Dec 2023 in math.NA, cs.LG, cs.NA, math.ST, and stat.TH

Abstract: We consider the problem of approximating a function from $L^2$ by an element of a given $m$-dimensional space $V_m$, associated with some feature map $\varphi$, using evaluations of the function at random points $x_1,\dots,x_n$. After recalling some results on optimal weighted least-squares using independent and identically distributed points, we consider weighted least-squares using projection determinantal point processes (DPP) or volume sampling. These distributions introduce dependence between the points that promotes diversity in the selected features $\varphi(x_i)$. We first provide a generalized version of volume-rescaled sampling yielding quasi-optimality results in expectation with a number of samples $n = O(m\log(m))$, meaning that the expected $L^2$ error is bounded by a constant times the best approximation error in $L^2$. Further assuming that the function is in some normed vector space $H$ continuously embedded in $L^2$, we prove that the approximation error is almost surely bounded by the best approximation error measured in the $H$-norm. This includes the cases of functions from $L^\infty$ or reproducing kernel Hilbert spaces. Finally, we present an alternative strategy that uses independent repetitions of projection DPP (or volume sampling), yielding error bounds similar to those obtained with i.i.d. or volume sampling, but in practice with a much lower number of samples. Numerical experiments illustrate the performance of the different strategies.
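
The last strategy in the abstract (independent repetitions of a projection DPP followed by a weighted least-squares fit) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the reference measure is replaced by the uniform measure on a finite grid, $V_m$ is taken to be polynomials of degree below $m$, the DPP is drawn with the standard sequential (chain-rule) sampler for projection kernels, and the weights are $w(x) = m / \|\varphi(x)\|^2$, the usual choice in optimal weighted least-squares. The function names `orthonormal_features`, `sample_projection_dpp` and `weighted_least_squares` are hypothetical.

```python
import numpy as np

def orthonormal_features(x, m):
    """Feature matrix Phi (N x m), orthonormal for the uniform measure on the grid x."""
    V = np.polynomial.legendre.legvander(x, m - 1)  # polynomial basis, shape (N, m)
    Q, _ = np.linalg.qr(V)                          # Q has orthonormal columns
    return np.sqrt(len(x)) * Q                      # so that (1/N) Phi^T Phi = I

def sample_projection_dpp(Phi, rng):
    """Draw m distinct grid indices from the projection DPP with kernel Phi Phi^T / N."""
    N, m = Phi.shape
    R = Phi / np.sqrt(N)                            # rows are the residual features
    chosen = []
    for _ in range(m):
        p = np.einsum("ij,ij->i", R, R)             # squared residual norms
        p = np.clip(p, 0.0, None)
        if chosen:
            p[chosen] = 0.0                         # remove round-off on chosen points
        p /= p.sum()
        i = int(rng.choice(N, p=p))
        chosen.append(i)
        v = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ v, v)                  # project out the selected feature
    return np.array(chosen)

def weighted_least_squares(Phi, idx, y):
    """Fit coefficients with weights w = m / ||phi(x_i)||^2 (assumed weight choice)."""
    m = Phi.shape[1]
    w = m / np.einsum("ij,ij->i", Phi[idx], Phi[idx])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * Phi[idx], sw * y, rcond=None)
    return coef

# Usage: approximate exp on [-1, 1] with m = 6 features and 3 DPP repetitions.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 2001)
m = 6
Phi = orthonormal_features(x, m)
idx = np.concatenate([sample_projection_dpp(Phi, rng) for _ in range(3)])
f = np.exp(x)
coef = weighted_least_squares(Phi, idx, f[idx])
err = np.sqrt(np.mean((Phi @ coef - f) ** 2))       # discrete L^2 error on the grid
print(f"n = {len(idx)} samples, discrete L2 error = {err:.2e}")
```

Each call to `sample_projection_dpp` returns exactly $m$ distinct points, so three repetitions give $n = 3m$ evaluations. The projection step after each selection is what discourages nearly collinear features $\varphi(x_i)$ from being picked together.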
