
Advancing Transformer's Capabilities in Commonsense Reasoning (2310.06803v1)

Published 10 Oct 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Recent advances in general-purpose pre-trained LLMs have shown great potential in commonsense reasoning. However, current works still perform poorly on standard commonsense reasoning benchmarks including the Com2Sense Dataset. We argue that this is due to a disconnect with current cutting-edge machine learning methods. In this work, we aim to bridge the gap by introducing current ML-based methods to improve general-purpose pre-trained LLMs in the task of commonsense reasoning. Specifically, we experiment with and systematically evaluate methods including knowledge transfer, model ensemble, and introducing an additional pairwise contrastive objective. Our best model outperforms the strongest previous works by ~15% absolute gains in Pairwise Accuracy and ~8.7% absolute gains in Standard Accuracy.
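The abstract reports gains in both Pairwise Accuracy and Standard Accuracy. On Com2Sense-style benchmarks, examples come in complementary pairs, and pairwise scoring credits a pair only when both statements are classified correctly. The sketch below illustrates the distinction under that assumption; the paper's exact scoring details may differ.

```python
def standard_accuracy(preds, labels):
    """Fraction of individual statements classified correctly."""
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)

def pairwise_accuracy(preds, labels):
    """Fraction of complementary pairs where BOTH statements are correct.

    Assumes a flat layout where items 2i and 2i+1 form one pair
    (an illustrative convention, not necessarily the paper's).
    """
    n_pairs = len(labels) // 2
    correct = sum(
        preds[2 * i] == labels[2 * i] and preds[2 * i + 1] == labels[2 * i + 1]
        for i in range(n_pairs)
    )
    return correct / n_pairs

# Toy example: 3 of 4 statements correct, but only 1 of 2 pairs fully correct.
preds = [1, 0, 1, 1]
labels = [1, 0, 0, 1]
print(standard_accuracy(preds, labels))  # 0.75
print(pairwise_accuracy(preds, labels))  # 0.5
```

This illustrates why pairwise accuracy is the stricter metric: a model can score well per-statement while failing to be consistent across complementary pairs.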

Authors (4)
  1. Yu Zhou
  2. Yunqiu Han
  3. Hanyu Zhou
  4. Yulun Wu