In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search (2311.07237v3)

Published 13 Nov 2023 in cs.CL and cs.AI

Abstract: To effectively use LLMs for real-world queries, it is imperative that they generalize to the long-tail distribution, i.e., rare examples where models exhibit low confidence. In this work, we take the first step towards evaluating LLMs in the long-tail distribution of inferential knowledge, exemplifying long-tail evaluation on the Natural Language Inference task. First, we introduce Logic-Induced-Knowledge-Search (LINK), a systematic long-tail data generation framework for obtaining factually correct yet long-tail inferential statements. LINK uses variable-wise prompting grounded in symbolic rules to seek low-confidence statements while ensuring factual correctness. We then use LINK to curate Logic-Induced-Long-Tail (LINT), a large-scale long-tail inferential knowledge dataset containing 108K statements spanning four domains. Evaluating popular LLMs on LINT, we find that state-of-the-art models show a significant performance drop on long-tail data compared to head-distribution data (a 21% relative drop for GPT-4), and that smaller models show even greater generalization weakness. These results further underscore the necessity of long-tail evaluation in developing generalizable LLMs.
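As a rough illustration of the variable-wise, rule-guided search the abstract describes (this is a toy sketch, not the authors' implementation; the rule, the candidate lists, and both scoring and verification stubs below are hypothetical stand-ins for real LLM calls):

```python
# Toy sketch of LINK-style variable-wise search. Variables in a symbolic
# rule are instantiated one at a time, preferring the candidate the
# "model" scores with the LOWEST confidence that still passes a
# factuality check -- steering generation toward the long tail.

RULE = "If {x} buys {y} at {z}, then {x} visits {z}."

def stub_logprob(statement: str) -> float:
    """Stand-in for an LLM log-probability. This toy version just
    penalizes length, so rarer (longer) fillers score lower."""
    return -0.1 * len(statement)

def stub_is_factual(statement: str) -> bool:
    """Stand-in for a critic model verifying factual correctness."""
    return True

def link_search(rule: str, candidates_per_var: dict) -> str:
    """Greedily bind each variable to its lowest-confidence candidate
    that passes the factuality check, then emit the full statement."""
    binding = {}
    for var, candidates in candidates_per_var.items():
        best_score, best_cand = None, None
        for cand in candidates:
            trial = dict(binding, **{var: cand})
            # Fill unbound variables with their own names for scoring.
            filled = rule
            for v in candidates_per_var:
                filled = filled.replace("{" + v + "}", trial.get(v, v))
            score = stub_logprob(filled)
            if stub_is_factual(filled) and (best_score is None
                                            or score < best_score):
                best_score, best_cand = score, cand
        binding[var] = best_cand
    out = rule
    for v, c in binding.items():
        out = out.replace("{" + v + "}", c)
    return out
```

With candidates like `{"x": ["Alice", "Bartholomew"], "y": ["a pen", "an antique astrolabe"], "z": ["a store", "a maritime flea market"]}`, the length-penalizing stub selects the rarer fillers, mimicking how a low-confidence preference pushes instantiations away from the head of the distribution; the real framework would replace both stubs with model queries.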

Authors (10)
  1. Huihan Li
  2. Yuting Ning
  3. Zeyi Liao
  4. Siyuan Wang
  5. Xiang Lorraine Li
  6. Ximing Lu
  7. Faeze Brahman
  8. Wenting Zhao
  9. Yejin Choi
  10. Xiang Ren
Citations (1)