
Unsupervised Commonsense Question Answering with Self-Talk (2004.05483v2)

Published 11 Apr 2020 in cs.CL

Abstract: Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained LLMs as the sole implicit source of world knowledge, or resort to external knowledge bases (KBs) to incorporate additional relevant knowledge. We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks. Inspired by inquiry-based discovery learning (Bruner, 1961), our approach inquires LLMs with a number of information seeking questions such as "$\textit{what is the definition of ...}$" to discover additional background knowledge. Empirical results demonstrate that the self-talk procedure substantially improves the performance of zero-shot LLM baselines on four out of six commonsense benchmarks, and competes with models that obtain knowledge from external KBs. While our approach improves performance on several benchmarks, the self-talk induced knowledge even when leading to correct answers is not always seen as useful by human judges, raising interesting questions about the inner-workings of pre-trained LLMs for commonsense reasoning.

Unsupervised Commonsense Question Answering with Self-Talk

The paper presents an innovative approach to commonsense question answering that relies on pre-trained language models (LMs). Unlike conventional methods that depend heavily on external knowledge bases (KBs) or task-specific supervised learning, the authors propose an unsupervised framework built around a "self-talk" mechanism. This method queries an LM with various information-seeking questions to generate the background knowledge needed to answer multiple-choice commonsense questions.

Methodology

The self-talk approach is designed to tap into the implicit knowledge captured within LMs. It involves querying the LM with a series of generated "clarification questions" tailored to elicit background knowledge relevant to a given question-answering instance. The LM's answers to these questions, in turn, provide additional context that helps it discern the correct answer. This method is rooted in inquiry-based learning principles, where asking relevant questions leads to deeper understanding.
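
To make the clarification step concrete, the following is a minimal sketch, not the authors' released code, assuming GPT-2 through the Hugging Face transformers pipeline. The prefix pairs, sample counts, and generation hyperparameters are illustrative placeholders; the paper defines its own per-task question and answer prefixes.

```python
# Hypothetical sketch of self-talk clarification generation (not the paper's exact code).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# (question prefix, answer prefix) pairs; illustrative examples of the kind used per task.
PREFIX_PAIRS = [
    ("What is the definition of", "The definition of"),
    ("What is the purpose of", "The purpose of"),
]

def generate_clarifications(context, num_samples=3):
    """Ask the LM information-seeking questions about the context, then have it answer them."""
    clarifications = []
    for q_prefix, a_prefix in PREFIX_PAIRS:
        # 1) Let the LM complete the clarification question from the prefix.
        q_outputs = generator(f"{context} {q_prefix}", max_new_tokens=10,
                              do_sample=True, top_p=0.9,
                              num_return_sequences=num_samples)
        for q in q_outputs:
            question = q["generated_text"][len(context):].strip()
            # 2) Let the LM answer its own question; the answer is the clarification.
            a_output = generator(f"{context} {question} {a_prefix}",
                                 max_new_tokens=15, do_sample=True, top_p=0.9)
            answer = a_output[0]["generated_text"][len(f"{context} {question}"):].strip()
            clarifications.append(answer)
    return clarifications
```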

Crucially, this process is executed in a zero-shot manner, maximizing the utility of pre-trained LMs without additional task-specific fine-tuning or supervision. Candidate answers are scored by the LM conditioned on the original context combined with the acquired clarifications, the highest-scoring candidate is selected, and the procedure is evaluated against multiple commonsense benchmarks.
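
A minimal sketch of the zero-shot scoring step follows, under the same assumptions as above: each answer choice is scored by the LM's log-likelihood given the context plus (optionally) one generated clarification, and the choice with the best-scoring prompt wins. The function names and the exact scoring normalization are illustrative, not the paper's precise formulation.

```python
# Hypothetical sketch of zero-shot answer scoring with clarifications (not the paper's exact code).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def answer_log_likelihood(prompt, choice):
    """Sum of token log-probabilities of `choice` conditioned on `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + " " + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # next-token distributions
    targets = full_ids[0, 1:]
    start = prompt_ids.shape[1] - 1  # first target position belonging to the choice
    return log_probs[start:].gather(1, targets[start:, None]).sum().item()

def predict(context, choices, clarifications):
    """Pick the choice whose best (context [+ clarification]) prompt yields the highest score."""
    scores = {
        choice: max(answer_log_likelihood(f"{context} {clar}".strip(), choice)
                    for clar in [""] + clarifications)
        for choice in choices
    }
    return max(scores, key=scores.get)
```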

Results

The empirical evaluation demonstrates that the self-talk method improves zero-shot performance on four out of six commonsense reasoning benchmarks. Notably, it competes favorably with approaches that integrate external KBs, suggesting that LMs already hold significant latent knowledge. The paper reports substantial improvements on benchmarks such as PIQA and COPA when self-generated knowledge is used instead of external resources.

Despite the improvements, the paper acknowledges that the utility of the generated clarifications can be inconsistent when judged by human evaluators. This inconsistency raises intriguing questions about the reliability and nature of the reasoning exhibited by LMs when handling commonsense tasks.

Implications and Future Directions

The implications of this research are twofold. Practically, it offers a scalable alternative to resource-intensive supervised approaches, providing a methodologically economical route to enhance the performance of LMs on commonsense reasoning tasks. Theoretically, it challenges the necessity of external KBs for certain language-comprehension tasks, advocating for a closer examination of the capabilities of LMs.

The paper invites further exploration into improving the reliability and factual correctness of generated clarifications. Future work could delve into structured approaches to enhance multi-step reasoning capabilities within LMs, possibly integrating mechanisms for self-evaluation of generated content. Additionally, the paper hints at a potential exploration of conversational strategies to refine clarification generation dynamically.

Overall, this work lays a foundational framework for leveraging latent LM capabilities in commonsense reasoning, driving forward both the practical application and theoretical understanding of artificial intelligence.

Authors (5)
  1. Vered Shwartz (49 papers)
  2. Peter West (76 papers)
  3. Ronan Le Bras (56 papers)
  4. Chandra Bhagavatula (46 papers)
  5. Yejin Choi (287 papers)
Citations (242)