Simple, unified analysis of Johnson-Lindenstrauss with applications (2402.10232v4)

Published 10 Feb 2024 in stat.ML, cs.DS, math.PR, and cs.LG

Abstract: We present a simplified and unified analysis of the Johnson-Lindenstrauss (JL) lemma, a cornerstone of dimensionality reduction for managing high-dimensional data. Our approach simplifies understanding and unifies various constructions under the JL framework, including spherical, binary-coin, sparse JL, Gaussian, and sub-Gaussian models. This unification preserves the intrinsic geometry of data, essential for applications from streaming algorithms to reinforcement learning. We provide the first rigorous proof of the spherical construction's effectiveness and introduce a general class of sub-Gaussian constructions within this simplified framework. Central to our contribution is an innovative extension of the Hanson-Wright inequality to high dimensions, complete with explicit constants. By using simple yet powerful probabilistic tools and analytical techniques, such as an enhanced diagonalization process, our analysis solidifies the theoretical foundation of the JL lemma by removing an independence assumption and extends its practical applicability to contemporary algorithms.
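
For context, the classical Hanson-Wright inequality that the paper extends (here in the form of Rudelson and Vershynin, 2013) states that for a fixed matrix $A \in \mathbb{R}^{n \times n}$ and a random vector $x$ with independent, mean-zero, $K$-sub-Gaussian coordinates, there is an absolute constant $c > 0$ such that for all $t \ge 0$:

$$\Pr\left(\left|x^\top A x - \mathbb{E}\,x^\top A x\right| \ge t\right) \le 2\exp\left(-c\,\min\left(\frac{t^2}{K^4 \|A\|_F^2},\ \frac{t}{K^2 \|A\|}\right)\right),$$

where $\|A\|_F$ is the Frobenius norm and $\|A\|$ the operator norm.

The constructions the abstract lists all instantiate the same recipe: draw a random matrix with suitably normalized sub-Gaussian entries and map $x \mapsto Ax$, which with high probability preserves all pairwise distances up to a factor $1 \pm \varepsilon$ once the target dimension is $k = O(\varepsilon^{-2} \log n)$. The following minimal NumPy sketch (our own illustration, not code from the paper; the function names are ours) demonstrates the Gaussian and binary-coin variants:

```python
import numpy as np

def jl_project(X, k, construction="gaussian", seed=None):
    """Map rows of X from dimension d down to k with a random JL matrix.

    'gaussian': i.i.d. N(0, 1/k) entries.
    'binary':   Achlioptas-style Rademacher entries, +-1/sqrt(k).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    if construction == "gaussian":
        A = rng.standard_normal((d, k)) / np.sqrt(k)
    elif construction == "binary":
        A = rng.choice([-1.0, 1.0], size=(d, k)) / np.sqrt(k)
    else:
        raise ValueError(f"unknown construction: {construction}")
    return X @ A

def pairwise_distances(Z):
    """All n*(n-1)/2 pairwise Euclidean distances between rows of Z."""
    sq = (Z ** 2).sum(axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0)
    return np.sqrt(D2[np.triu_indices(len(Z), k=1)])

# Embed n points from d = 5000 dimensions into k = O(eps^-2 log n) dimensions
# and check that every pairwise distance is distorted by at most ~eps.
rng = np.random.default_rng(0)
n, d, eps = 40, 5000, 0.25
k = int(np.ceil(8 * np.log(n) / eps ** 2))
X = rng.standard_normal((n, d))
Y = jl_project(X, k, construction="binary", seed=1)

ratios = pairwise_distances(Y) / pairwise_distances(X)
print(f"k = {k}, distance ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")
```

The constant 8 in the choice of $k$ is a common textbook value, not a constant from this paper. The point of the paper's unified analysis is that a single argument with explicit constants covers all of these entry distributions at once, including the spherical construction, whose matrix entries are not independent.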

