
Deductive Additivity for Planning of Natural Language Proofs (2307.02472v2)

Published 5 Jul 2023 in cs.CL and cs.AI

Abstract: Current natural language systems designed for multi-step claim validation typically operate in two phases: retrieve a set of relevant premise statements using heuristics (planning), then generate novel conclusions from those statements using an LLM (deduction). The planning step often requires expensive Transformer operations and does not scale to arbitrary numbers of premise statements. In this paper, we investigate whether an efficient planning heuristic is possible via embedding spaces compatible with deductive reasoning. Specifically, we evaluate whether embedding spaces exhibit a property we call deductive additivity: the sum of premise statement embeddings should be close to the embeddings of conclusions based on those premises. We explore multiple sources of off-the-shelf dense embeddings in addition to fine-tuned embeddings from GPT-3 and sparse embeddings from BM25. We study embedding models both intrinsically, evaluating whether the property of deductive additivity holds, and extrinsically, using them to assist planning in natural language proof generation. Lastly, we create a dataset, Single-Step Reasoning Contrast (SSRC), to further probe performance on various reasoning types. Our findings suggest that while standard embedding methods frequently embed conclusions near the sums of their premises, they fall short of being effective heuristics and lack the ability to model certain categories of reasoning.
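The deductive additivity property described in the abstract can be checked intrinsically by comparing the sum of premise embeddings against a candidate conclusion embedding. The sketch below is illustrative only (it is not the paper's implementation): `deductive_additivity_score` and the toy 3-d vectors are hypothetical; a real evaluation would substitute sentence embeddings from a model such as SimCSE or GPT-3.

```python
import numpy as np

def deductive_additivity_score(premise_embs, conclusion_emb):
    """Cosine similarity between the sum of premise embeddings and the
    conclusion embedding; values near 1 indicate the embedding space is
    approximately additive for this deduction."""
    summed = np.sum(premise_embs, axis=0)
    return float(
        np.dot(summed, conclusion_emb)
        / (np.linalg.norm(summed) * np.linalg.norm(conclusion_emb))
    )

# Toy illustration with made-up 3-d vectors standing in for real
# sentence embeddings of two premises and their conclusion.
p1 = np.array([1.0, 0.0, 0.0])
p2 = np.array([0.0, 1.0, 0.0])
conclusion = np.array([1.0, 1.0, 0.0])

score = deductive_additivity_score([p1, p2], conclusion)
```

As a planning heuristic, this score could rank candidate premise pairs by how close their summed embedding lies to a goal embedding, avoiding a Transformer forward pass per pair.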

Authors (4)
  1. Zayne Sprague (10 papers)
  2. Kaj Bostrom (7 papers)
  3. Swarat Chaudhuri (61 papers)
  4. Greg Durrett (117 papers)
Citations (3)