Compressive Mahalanobis Metric Learning Adapts to Intrinsic Dimension (2309.05751v3)
Abstract: Metric learning aims to find a suitable distance metric over the input space, in order to improve the performance of distance-based learning algorithms. In high-dimensional settings, it can also serve as dimensionality reduction by imposing a low-rank restriction on the learnt metric. In this paper, we consider the problem of learning a Mahalanobis metric, and instead of training a low-rank metric on the high-dimensional data, we use a randomly compressed version of the data to train a full-rank metric in the reduced feature space. We give theoretical guarantees on the error of Mahalanobis metric learning that depend on the stable dimension of the data support, but not on the ambient dimension. Our bounds make no assumptions beyond i.i.d. data sampling from a bounded support, and they automatically tighten when benign geometric structure is present. A key ingredient is an extension of Gordon's theorem, which may be of independent interest. We also corroborate our findings with numerical experiments.
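Below is a minimal sketch of the compressive approach described in the abstract: the high-dimensional data are first compressed with a random Gaussian projection, and a full-rank Mahalanobis metric is then learned in the compressed space. This is not the authors' implementation; the pairwise hinge loss, the projected-gradient solver, the function names (`compress`, `learn_mahalanobis`), and all hyperparameters are illustrative assumptions.

```python
# Sketch of compressive Mahalanobis metric learning (illustrative, not the paper's code):
# 1) compress data from R^d to R^k with a random Gaussian matrix,
# 2) learn a full-rank PSD metric M in R^k from similar/dissimilar pairs.
import numpy as np

def compress(X, k, rng):
    """Random Gaussian compression R^d -> R^k (Johnson-Lindenstrauss style)."""
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R, R

def learn_mahalanobis(Z, pairs, labels, n_iter=200, lr=0.05):
    """Learn a PSD matrix M with a simple pairwise hinge loss: similar pairs
    are pulled together, dissimilar pairs pushed beyond a unit margin."""
    k = Z.shape[1]
    M = np.eye(k)
    for _ in range(n_iter):
        grad = np.zeros_like(M)
        for (i, j), same in zip(pairs, labels):
            diff = Z[i] - Z[j]
            dist2 = diff @ M @ diff
            if same:                      # pull similar pairs together
                grad += np.outer(diff, diff)
            elif dist2 < 1.0:             # push dissimilar pairs past the margin
                grad -= np.outer(diff, diff)
        M -= lr * grad / len(pairs)
        # project back onto the PSD cone via eigenvalue clipping
        w, V = np.linalg.eigh(M)
        M = (V * np.clip(w, 0.0, None)) @ V.T
    return M

# Toy usage: two classes in d = 1000 ambient dimensions, compressed to k = 20.
rng = np.random.default_rng(0)
d, n, k = 1000, 200, 20
y = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, d)) + 3.0 * y[:, None] * rng.standard_normal(d)
Z, _ = compress(X, k, rng)
pairs = [(rng.integers(n), rng.integers(n)) for _ in range(500)]
labels = [y[i] == y[j] for i, j in pairs]
M = learn_mahalanobis(Z, pairs, labels)
```

In this sketch the intrinsic structure (a single discriminative direction) is low-dimensional, so the metric learned on the 20-dimensional compressed data can still separate the classes, which is the regime the paper's bounds are meant to capture.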
- Sample complexity of learning Mahalanobis distance metrics. Advances in Neural Information Processing Systems, 28, 2015.
- Distance metric learning with application to clustering with side-information. Advances in Neural Information Processing Systems, 15, 2002.
- Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(2), 2009.
- Information-theoretic metric learning. In Proceedings of the 24th International Conference on Machine Learning, pages 209–216, 2007.
- The curse of dimensionality in data mining and time series prediction. In International Work-Conference on Artificial Neural Networks, pages 758–770. Springer, 2005.
- The intrinsic dimension of images and its impact on learning. In International Conference on Learning Representations, 2020.
- Dimitris Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4):671–687, 2003.
- Structure discovery in PAC-learning by random projections. Machine Learning, 2024, to appear.
- Bounds on the number of measurements for reliable compressive classification. IEEE Transactions on Signal Processing, 64(22):5778–5793, 2016.
- Random projections with asymmetric quantization. Advances in Neural Information Processing Systems, 32, 2019.
- Yehoram Gordon. On Milman's inequality and random subspaces which escape through a mesh in ℝⁿ. In Geometric Aspects of Functional Analysis: Israel Seminar (GAFA) 1986–87, pages 84–106. Springer, 1988.
- Brian Kulis et al. Metric learning: A survey. Foundations and Trends in Machine Learning, 5(4):287–364, 2013.
- Survey on distance metric learning and dimensionality reduction in data mining. Data Mining and Knowledge Discovery, 29(2):534–564, 2015.
- Non-linear metric learning. Advances in Neural Information Processing Systems, 25, 2012.
- Curvilinear distance metric learning. Advances in Neural Information Processing Systems, 32, 2019.
- Deep metric learning: A survey. Symmetry, 11(9):1066, 2019.
- Collaborative metric learning. In Proceedings of the 26th International Conference on World Wide Web, pages 193–201, 2017.
- Is that you? Metric learning approaches for face identification. In IEEE International Conference on Computer Vision, pages 498–505. IEEE, 2009.
- Orthogonality-promoting distance metric learning: Convex relaxation and theoretical analysis. In International Conference on Machine Learning, pages 5403–5412. PMLR, 2018.
- The effect of the intrinsic dimension on the generalization of quadratic classifiers. Advances in Neural Information Processing Systems, 34:21138–21149, 2021.
- Taking compressive sensing to the hardware level: Breaking fundamental radio-frequency hardware performance tradeoffs. IEEE Signal Processing Magazine, 36(2):81–100, 2019.
- Approximate nearest neighbors: towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pages 604–613, 1998.
- An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms, 22(1):60–65, 2003.
- Roman Vershynin. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge University Press, 2018.
- Afonso S Bandeira. Ten lectures and forty-two open problems in the mathematics of data science, 2015.
- Martin J Wainwright. High-dimensional statistics: A non-asymptotic viewpoint, volume 48. Cambridge University Press, 2019.
- Statistical inference. Cengage Learning, 2nd edition, 2002.
- Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3(Nov):463–482, 2002.
- UCI Machine Learning Repository, 2017.
- Robust structural metric learning. In International Conference on Machine Learning, pages 615–623. PMLR, 2013.