Smoothing Entailment Graphs with Language Models (2208.00318v2)
Abstract: The diversity and Zipfian frequency distribution of natural language predicates in corpora lead to sparsity in Entailment Graphs (EGs) built by Open Relation Extraction (ORE). EGs are computationally efficient and explainable models of natural language inference, but as symbolic models they fail if a novel premise or hypothesis vertex is missing at test time. We present theory and methodology for overcoming such sparsity in symbolic models. First, we introduce a theory of optimal smoothing of EGs by constructing transitive chains. We then demonstrate an efficient, open-domain, and unsupervised smoothing method that uses an off-the-shelf LLM to find approximations of missing premise predicates. This improves recall by 25.1 and 16.3 percentage points on two difficult directional entailment datasets, while raising average precision and maintaining model explainability. Further, in a QA task we show that EG smoothing is most useful for answering questions with less supporting text, where missing premise predicates are more costly. Finally, controlled experiments with WordNet confirm our theory and show that hypothesis smoothing is difficult, but possible in principle.
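The core idea can be sketched in a few lines: entailment queries follow transitive chains through the graph, and a missing premise vertex is "smoothed" by substituting its nearest in-graph predicate. The sketch below is a hypothetical toy, not the paper's implementation: the predicates and graph are invented for illustration, and a simple token-overlap similarity stands in for the paper's LLM-based approximation of missing premise predicates.

```python
# Toy entailment graph (EG): directed entailment edges between predicates.
# Invented example data, not from the paper.
EG = {
    "buy stake in": {"acquire stake in"},
    "acquire stake in": {"own stake in"},
    "own stake in": set(),
}

def entails(premise, hypothesis, graph):
    """True if hypothesis is reachable from premise via a transitive chain."""
    frontier, seen = {premise}, set()
    while frontier:
        node = frontier.pop()
        if node == hypothesis:
            return True
        seen.add(node)
        frontier |= graph.get(node, set()) - seen
    return False

def similarity(p, q):
    """Stand-in scorer (token Jaccard overlap); the paper uses an LLM here."""
    a, b = set(p.split()), set(q.split())
    return len(a & b) / len(a | b)

def smooth_premise(premise, graph):
    """If the premise vertex is missing, substitute its most similar in-graph vertex."""
    if premise in graph:
        return premise
    return max(graph, key=lambda v: similarity(premise, v))

# A query whose premise is absent from the EG would fail in the raw symbolic
# model; after smoothing, it is answered via the transitive chain.
premise = smooth_premise("buy a stake in", EG)  # -> "buy stake in"
print(entails(premise, "own stake in", EG))     # True
```

Note that only the premise is smoothed here; as the abstract observes, smoothing the hypothesis side is substantially harder.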