
Contrastive Loss is All You Need to Recover Analogies as Parallel Lines (2306.08221v1)

Published 14 Jun 2023 in cs.CL

Abstract: While static word embedding models are known to represent linguistic analogies as parallel lines in high-dimensional space, the mechanism by which they give rise to such geometric structures remains obscure. We find that an elementary contrastive-style method applied to distributional information performs competitively with popular word embedding models on analogy recovery tasks, while achieving dramatic speedups in training time. Further, we demonstrate that a contrastive loss is sufficient to create these parallel structures in word embeddings, and establish a precise relationship between the co-occurrence statistics and the geometric structure of the resulting word embeddings.
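The abstract does not spell out the objective, so the sketch below is only a rough illustration, not the authors' method: a hinge-style contrastive loss over (word, context) co-occurrence pairs in PyTorch, plus the parallel-offset test behind the "analogies as parallel lines" claim. All names, shapes, and hyperparameters here (emb, contrastive_loss, margin, tol) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Illustrative setup; sizes are arbitrary, not from the paper.
vocab_size, dim = 10_000, 100
emb = torch.nn.Embedding(vocab_size, dim)

def contrastive_loss(word_ids, ctx_ids, neg_ids, margin=1.0):
    """Hinge-style contrastive loss (an assumption, not the paper's exact
    formulation): pull co-occurring (word, context) pairs together and push
    randomly sampled negatives at least `margin` farther away."""
    w = emb(word_ids)             # (B, d) target words
    c = emb(ctx_ids)              # (B, d) observed contexts (positives)
    n = emb(neg_ids)              # (B, d) sampled negatives
    pos = (w - c).pow(2).sum(-1)  # squared distance to positives
    neg = (w - n).pow(2).sum(-1)  # squared distance to negatives
    return F.relu(pos - neg + margin).mean()

def offsets_parallel(a, b, c, d, tol=0.95):
    """For an analogy a:b :: c:d, test whether the offset vectors
    b - a and d - c are near-parallel (cosine similarity near 1)."""
    va, vb, vc, vd = emb(torch.tensor([a, b, c, d]))
    u, v = vb - va, vd - vc
    return F.cosine_similarity(u, v, dim=0) > tol

# One illustrative training step on random indices:
word_ids = torch.randint(0, vocab_size, (32,))
ctx_ids = torch.randint(0, vocab_size, (32,))
neg_ids = torch.randint(0, vocab_size, (32,))
contrastive_loss(word_ids, ctx_ids, neg_ids).backward()
```

The parallel-offset test is the geometric structure at issue: for an analogy like man:king :: woman:queen, the paper's claim concerns the offsets king − man and queen − woman forming (near-)parallel lines, and ties this structure to the co-occurrence statistics the contrastive loss is trained on.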
