On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models (2310.07818v3)

Published 11 Oct 2023 in cs.CL and cs.AI

Abstract: The ability of LLMs to encode syntactic and semantic structures of language is well examined in NLP. In addition, analogy identification, in the form of word analogies, has been studied extensively in the language modeling literature over the last decade. In this work we look specifically at how LLMs' ability to capture sentence analogies (sentences that convey analogous meaning to each other) varies with their ability to encode syntactic and semantic structures of sentences. Through our analysis, we find that LLMs' ability to identify sentence analogies is positively correlated with their ability to encode syntactic and semantic structures of sentences. In particular, models that better capture syntactic structure are also better at identifying sentence analogies.
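The abstract's central claim is a correlation between two per-model abilities: structure encoding (e.g., a syntactic probe score) and sentence analogy identification. As an illustration only, not the paper's code, the sketch below shows how such a correlation could be computed; the model names and all accuracy values are hypothetical placeholders, not results from the paper.

```python
# Minimal sketch: correlate per-model structure-probe scores with
# per-model sentence analogy identification scores.
# All numbers are placeholders, not reported results.
from scipy.stats import spearmanr

# Hypothetical syntactic-probe accuracies per model.
syntax_probe_acc = {"bert": 0.81, "roberta": 0.85, "albert": 0.74, "xlnet": 0.79}
# Hypothetical sentence analogy identification accuracies per model.
analogy_id_acc = {"bert": 0.62, "roberta": 0.68, "albert": 0.55, "xlnet": 0.60}

models = sorted(syntax_probe_acc)
rho, p = spearmanr(
    [syntax_probe_acc[m] for m in models],
    [analogy_id_acc[m] for m in models],
)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```

A positive rank correlation here would mirror the abstract's finding that models which encode sentence structure better also tend to identify sentence analogies better.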

Authors (7)
  1. Thilini Wijesiriwardene (7 papers)
  2. Ruwan Wickramarachchi (12 papers)
  3. Aishwarya Naresh Reganti (4 papers)
  4. Vinija Jain (43 papers)
  5. Aman Chadha (110 papers)
  6. Amit Sheth (127 papers)
  7. Amitava Das (45 papers)