APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning (2212.09282v2)

Published 19 Dec 2022 in cs.CL, cs.AI, and cs.LG

Abstract: Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning over them to infer new conclusions. Prior works on improving the logical reasoning ability of language models require complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss where only specific part-of-speech words that would likely require more reasoning than basic language understanding are masked, and a sentence-level classification loss that teaches the model to distinguish between entailment and contradiction types of sentences. The proposed training paradigm is both simple and independent of task formats. We demonstrate the effectiveness of APOLLO by comparing it with prior baselines on two logical reasoning datasets. APOLLO performs comparably on ReClor and outperforms baselines on LogiQA. The code base has been made publicly available.
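The two self-supervised objectives described in the abstract lend themselves to a short illustration. The sketch below is an approximation under stated assumptions, not the authors' released implementation: the roberta-base backbone, the Penn Treebank POS set treated as "reasoning-heavy", the entailment/contradiction keyword lists, the 15% masking rate, and the helper names (selective_mlm_loss, weak_sentence_label) are all hypothetical choices made for this example; the paper's exact selection rules live in its public code base.

    # Illustrative sketch of APOLLO-style objectives (NOT the authors' code).
    # Assumptions: the POS set, keyword lists, backbone, and masking rate
    # below are hypothetical stand-ins chosen for demonstration.
    import random

    import nltk   # needs: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    # add_prefix_space=True is required to feed RoBERTa pre-tokenized words.
    tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
    mlm_model = AutoModelForMaskedLM.from_pretrained("roberta-base")

    # Hypothetical "reasoning-heavy" Penn Treebank tags: verbs, adverbs,
    # and subordinating conjunctions/prepositions.
    REASONING_POS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "RB", "IN"}

    def selective_mlm_loss(text: str, mask_prob: float = 0.15) -> torch.Tensor:
        """Modified MLM loss: only subword tokens belonging to reasoning-heavy
        POS words are masking candidates; loss covers masked positions only."""
        words = nltk.word_tokenize(text)
        tags = [tag for _, tag in nltk.pos_tag(words)]
        enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
        input_ids = enc["input_ids"].clone()
        labels = enc["input_ids"].clone()
        word_ids = enc.word_ids(0)  # maps each subword to its NLTK word index
        for i, wid in enumerate(word_ids):
            if wid is not None and tags[wid] in REASONING_POS \
                    and random.random() < mask_prob:
                input_ids[0, i] = tokenizer.mask_token_id
        labels[input_ids != tokenizer.mask_token_id] = -100  # ignore unmasked
        out = mlm_model(input_ids=input_ids,
                        attention_mask=enc["attention_mask"], labels=labels)
        return out.loss

    # Hypothetical keyword lists for the sentence-level classification loss:
    # sentences are weakly labeled as entailment- or contradiction-style.
    ENTAILMENT_KWS = {"therefore", "thus", "hence", "consequently", "accordingly"}
    CONTRADICTION_KWS = {"but", "although", "however", "yet", "whereas"}

    def weak_sentence_label(sentence: str):
        """Return 1 (entailment-style), 0 (contradiction-style), or None (skip)."""
        toks = {w.lower() for w in nltk.word_tokenize(sentence)}
        if toks & ENTAILMENT_KWS:
            return 1
        if toks & CONTRADICTION_KWS:
            return 0
        return None

In an actual continued-pretraining loop, the MLM loss and a classification loss computed from the weak sentence labels would be combined, for example as a weighted sum; the exact keyword lists, POS choices, and loss weighting should be taken from the released code rather than this sketch.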

Authors (8)
  1. Soumya Sanyal
  2. Yichong Xu
  3. Shuohang Wang
  4. Ziyi Yang
  5. Reid Pryzant
  6. Wenhao Yu
  7. Chenguang Zhu
  8. Xiang Ren