
Dimension-Accuracy Tradeoffs in Contrastive Embeddings for Triplets, Terminals & Top-k Nearest Neighbors (2312.13490v2)

Published 20 Dec 2023 in cs.DS

Abstract: Metric embeddings traditionally study how to map $n$ items to a target metric space such that distance lengths are not heavily distorted; but what if we only care to preserve the relative order of the distances (and not their length)? In this paper, we are motivated by the following basic question: given triplet comparisons of the form "item $i$ is closer to item $j$ than to item $k$," can we find low-dimensional Euclidean representations for the $n$ items that respect those distance comparisons? Such order-preserving embeddings naturally arise in important applications and have been studied since the 1950s, under the name of ordinal or non-metric embeddings. Our main results are:

  1. Nearly-Tight Bounds on Triplet Dimension: We introduce the natural concept of triplet dimension of a dataset, and surprisingly, we show that in order for an ordinal embedding to be triplet-preserving, its dimension needs to grow as $\frac{n}{2}$ in the worst case. This is optimal (up to constant) as $n-1$ dimensions always suffice.
  2. Tradeoffs for Dimension vs (Ordinal) Relaxation: We then relax the requirement that every triplet should be exactly preserved and present almost tight lower bounds for the maximum ratio between distances whose relative order was inverted by the embedding; this ratio is known as (ordinal) relaxation in the literature and serves as a counterpart to (metric) distortion.
  3. New Bounds on Terminal and Top-$k$-NNs Embeddings: Going beyond triplets, we then study two well-motivated scenarios where we care about preserving specific sets of distances (not necessarily triplets). The first scenario is Terminal Ordinal Embeddings and the second scenario is top-$k$-NNs Ordinal Embeddings.

To the best of our knowledge, these are some of the first tradeoffs on triplet-preserving ordinal embeddings and the first study of Terminal and Top-$k$-NNs Ordinal Embeddings.
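
To make the two central notions concrete, the following is a minimal sketch and not the paper's method. It checks which triplet comparisons a candidate Euclidean embedding violates, and computes one reading of the ordinal relaxation: the largest ratio between two original distances whose relative order the embedding fails to preserve. The function names `violated_triplets` and `ordinal_relaxation`, the toy distance matrix `D`, and the embedding `emb` are all illustrative assumptions.

```python
from itertools import combinations
from math import dist

def violated_triplets(points, triplets):
    """Return triplets (i, j, k) where the embedding does NOT place
    item i strictly closer to item j than to item k."""
    return [(i, j, k) for i, j, k in triplets
            if dist(points[i], points[j]) >= dist(points[i], points[k])]

def ordinal_relaxation(orig, points):
    """Largest ratio between two original distances whose strict order
    is not preserved by the embedding (1.0 if every order is kept).
    Assumes all off-diagonal entries of `orig` are positive."""
    pairs = list(combinations(range(len(points)), 2))
    worst = 1.0
    for (a, b), (c, d) in combinations(pairs, 2):
        o1, o2 = orig[a][b], orig[c][d]
        e1 = dist(points[a], points[b])
        e2 = dist(points[c], points[d])
        if o1 > o2 and e1 <= e2:
            worst = max(worst, o1 / o2)
        elif o2 > o1 and e2 <= e1:
            worst = max(worst, o2 / o1)
    return worst

# Hypothetical toy data: three items originally at positions 0, 1, 3 on a line.
D = [[0, 1, 3],
     [1, 0, 2],
     [3, 2, 0]]
# A 1-dimensional candidate embedding that moves item 1 to position 2.
emb = [(0.0,), (2.0,), (3.0,)]

bad = violated_triplets(emb, [(0, 1, 2), (1, 0, 2)])
alpha = ordinal_relaxation(D, emb)
```

Here the embedding keeps "item 0 is closer to item 1 than to item 2" but breaks "item 1 is closer to item 0 than to item 2", and the two original distances whose order is swapped have ratio 2, so `alpha` is 2.0. The brute-force loop over all pairs of pairs is only meant for small examples.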
