
A Systematic Investigation of Commonsense Knowledge in Large Language Models (2111.00607v3)

Published 31 Oct 2021 in cs.CL

Abstract: Language models (LMs) trained on large amounts of data have shown impressive performance on many NLP tasks under the zero-shot and few-shot setup. Here we aim to better understand the extent to which such models learn commonsense knowledge -- a critical component of many NLP applications. We conduct a systematic and rigorous zero-shot and few-shot commonsense evaluation of large pre-trained LMs, where we: (i) carefully control for the LMs' ability to exploit potential surface cues and annotation artefacts, and (ii) account for variations in performance that arise from factors that are not related to commonsense knowledge. Our findings highlight the limitations of pre-trained LMs in acquiring commonsense knowledge without task-specific supervision; furthermore, using larger models or few-shot evaluation is insufficient to achieve human-level commonsense performance.
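
The abstract does not spell out the scoring procedure, but zero-shot multiple-choice commonsense evaluations of this kind are commonly run by comparing the log-likelihood a pre-trained LM assigns to each answer option and predicting the highest-scoring one. The sketch below illustrates that general setup only; the model name, example question, and length normalisation are assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch: zero-shot multiple-choice scoring with a causal LM.
# Each option is scored by the length-normalised log-likelihood of its tokens
# given the question context; the highest-scoring option is the prediction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumed model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def option_logprob(context: str, option: str) -> float:
    """Mean log-probability of the option tokens, conditioned on the context."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    full_ids = tokenizer(context + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probability of each token given its prefix.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions belonging to the option continuation.
    option_len = full_ids.shape[1] - ctx_ids.shape[1]
    option_lp = token_lp[:, -option_len:]
    # Length-normalise so longer options are not penalised by default.
    return option_lp.mean().item()

question = "You poured water into the glass until it overflowed because"
options = [" the glass was already full.", " the glass was empty."]
scores = {o: option_logprob(question, o) for o in options}
print(scores, "->", max(scores, key=scores.get))
```

A variant of the same scoring loop with the question removed (answer-only) is one common way to probe the surface cues and annotation artefacts the abstract refers to: if the LM still picks the correct option without the question, the benchmark item may be solvable without commonsense knowledge.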

Authors (6)
  1. Xiang Lorraine Li (20 papers)
  2. Adhiguna Kuncoro (18 papers)
  3. Jordan Hoffmann (14 papers)
  4. Cyprien de Masson d'Autume (14 papers)
  5. Phil Blunsom (87 papers)
  6. Aida Nematzadeh (24 papers)
Citations (50)