On Outer Bi-Lipschitz Extensions of Linear Johnson-Lindenstrauss Embeddings of Subsets of $\mathbb{R}^N$ (2403.03969v1)
Abstract: The celebrated Johnson-Lindenstrauss lemma states that for all $\varepsilon \in (0,1)$ and finite sets $X \subseteq \mathbb{R}^N$ with $n>1$ elements, there exists a matrix $\Phi \in \mathbb{R}^{m \times N}$ with $m=\mathcal{O}(\varepsilon^{-2}\log n)$ such that $$(1 - \varepsilon) \|x-y\|_2 \leq \|\Phi x-\Phi y\|_2 \leq (1+\varepsilon) \|x-y\|_2 \quad \forall\, x, y \in X.$$ Herein we consider terminal embedding results, which have recently been introduced in the computer science literature as stronger extensions of the Johnson-Lindenstrauss lemma for finite sets. After a short survey of this relatively recent line of work, we extend the theory of terminal embeddings to hold for arbitrary (e.g., infinite) subsets $X \subseteq \mathbb{R}^N$, and then specialize our generalized results to the case where $X$ is a low-dimensional compact submanifold of $\mathbb{R}^N$. In particular, we prove the following generalization of the Johnson-Lindenstrauss lemma: For all $\varepsilon \in (0,1)$ and $X\subseteq\mathbb{R}^N$, there exists a terminal embedding $f: \mathbb{R}^N \longrightarrow \mathbb{R}^{m}$ such that $$(1 - \varepsilon) \| x - y \|_2 \leq \left\| f(x) - f(y) \right\|_2 \leq (1 + \varepsilon) \| x - y \|_2 \quad \forall \, x \in X ~\text{and}~ \forall \, y \in \mathbb{R}^N.$$ Crucially, we show that the dimension $m$ of the range of $f$ above is optimal up to multiplicative constants, satisfying $m=\mathcal{O}(\varepsilon^{-2} \omega^2(S_X))$, where $\omega(S_X)$ is the Gaussian width of the set of unit secants of $X$, $S_X=\overline{\{(x-y)/\|x-y\|_2 \colon x \neq y \in X\}}$. Furthermore, our proofs are constructive and yield algorithms for computing a general class of terminal embeddings $f$, an instance of which is demonstrated herein to allow for more accurate compressive nearest neighbor classification than standard linear Johnson-Lindenstrauss embeddings do in practice.
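The linear Johnson-Lindenstrauss guarantee stated in the abstract can be checked numerically with a minimal sketch like the one below. It is not the paper's terminal embedding construction: it only draws a scaled Gaussian matrix $\Phi$ (one standard choice of random linear map) and measures the worst pairwise distortion over a finite point set. The function names `jl_matrix` and `max_pairwise_distortion`, the constant `C = 8.0`, and the test data are all assumptions made for illustration.

```python
import numpy as np


def jl_matrix(m, N, rng):
    """Draw an m x N matrix with i.i.d. N(0, 1/m) entries (hypothetical helper)."""
    return rng.standard_normal((m, N)) / np.sqrt(m)


def max_pairwise_distortion(X, Phi):
    """Return max over pairs x != y in X of | ||Phi x - Phi y|| / ||x - y|| - 1 |."""
    Y = X @ Phi.T                              # embed every row of X
    i, j = np.triu_indices(X.shape[0], k=1)    # all index pairs with i < j
    orig = np.linalg.norm(X[i] - X[j], axis=1)
    emb = np.linalg.norm(Y[i] - Y[j], axis=1)
    return np.max(np.abs(emb / orig - 1.0))


rng = np.random.default_rng(0)
n, N, eps = 50, 1000, 0.5

# m = O(eps^-2 log n); the constant C is an illustrative assumption,
# not a value taken from the paper.
C = 8.0
m = int(np.ceil(C * np.log(n) / eps**2))

X = rng.standard_normal((n, N))    # synthetic finite point set in R^N
Phi = jl_matrix(m, N, rng)
worst = max_pairwise_distortion(X, Phi)

print(f"embedding dimension m = {m}, worst pairwise distortion = {worst:.3f}")
```

Note that this check only covers pairs with both points in $X$. The terminal embedding guarantee is stronger, since it also controls distances from points of $X$ to arbitrary $y \in \mathbb{R}^N$, which a single linear map into a lower dimension cannot do for all $y$ because it has a nontrivial kernel.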