Contrastive Loss is All You Need to Recover Analogies as Parallel Lines (2306.08221v1)
Published 14 Jun 2023 in cs.CL
Abstract: While static word embedding models are known to represent linguistic analogies as parallel lines in high-dimensional space, the mechanism by which they give rise to such geometric structures remains obscure. We find that an elementary contrastive-style method applied to distributional information performs competitively with popular word embedding models on analogy recovery tasks while achieving dramatic speedups in training time. Further, we demonstrate that a contrastive loss is sufficient to create these parallel structures in word embeddings, and we establish a precise relationship between co-occurrence statistics and the geometric structure of the resulting word embeddings.
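The abstract describes the approach only at a high level: a contrastive-style objective trained directly on co-occurrence statistics, with analogies then recovered via vector offsets. The sketch below is a minimal illustration of what such a pipeline could look like, not the authors' method; the toy corpus, window size, embedding dimension, margin, and the specific margin-based loss are all illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F
from collections import Counter

# Toy corpus standing in for the distributional information a real corpus
# would supply; everything below is illustrative only.
corpus = [
    "the king rules the kingdom".split(),
    "the queen rules the kingdom".split(),
    "a man walks in the city".split(),
    "a woman walks in the city".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# 1. Co-occurrence statistics within a fixed context window (the window
#    size here is an assumption, not a value reported in the paper).
window = 2
pair_counts = Counter()
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                pair_counts[(idx[w], idx[sent[j]])] += 1

positives = torch.tensor(list(pair_counts.keys()))               # (P, 2)
weights = torch.tensor([float(c) for c in pair_counts.values()])  # (P,)

# 2. A simple margin-based contrastive loss (an assumption; the excerpt does
#    not give the paper's exact objective): co-occurring words are pulled
#    together, randomly sampled words are pushed past a margin.
dim, margin = 16, 1.0
emb = torch.nn.Embedding(len(vocab), dim)
opt = torch.optim.Adam(emb.parameters(), lr=0.05)

for step in range(500):
    a = emb(positives[:, 0])
    b = emb(positives[:, 1])
    neg = emb(torch.randint(len(vocab), (len(positives),)))
    pos_dist = F.pairwise_distance(a, b)
    neg_dist = F.pairwise_distance(a, neg)
    loss = (weights * (pos_dist + F.relu(margin - neg_dist))).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# 3. Standard vector-offset analogy test: if parallel-line structure has
#    emerged, king - man + woman should land closest to queen (input words
#    are excluded, as is conventional). With a corpus this small the result
#    is not guaranteed to be meaningful.
with torch.no_grad():
    v = emb.weight
    query = v[idx["king"]] - v[idx["man"]] + v[idx["woman"]]
    sims = F.cosine_similarity(query.unsqueeze(0), v)
    sims[[idx["king"], idx["man"], idx["woman"]]] = -1.0
    print(vocab[int(sims.argmax())])
```

The vector-offset test in step 3 is the conventional way analogy recovery is evaluated; the paper's claim is that a contrastive objective over co-occurrence pairs alone is enough for this parallel structure to emerge.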