Go Beyond Plain Fine-tuning: Improving Pretrained Models for Social Commonsense (2105.05913v1)

Published 12 May 2021 in cs.CL

Abstract: Pretrained LLMs have demonstrated outstanding performance in many NLP tasks recently. However, their social intelligence, which requires commonsense reasoning about the current situation and the mental states of others, is still developing. Towards improving LLMs' social intelligence, we focus on the Social IQA dataset, a task requiring social and emotional commonsense reasoning. Building on top of the pretrained RoBERTa and GPT2 models, we propose several architecture variations and extensions, and leverage external commonsense corpora, to optimize the model for Social IQA. Our proposed system achieves results competitive with the top-ranking models on the leaderboard. This work demonstrates the strengths of pretrained LLMs and provides viable ways to improve their performance on a particular task.
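For illustration only, here is a minimal sketch of the kind of baseline the paper builds on: framing Social IQA as a three-way multiple-choice task with a pretrained RoBERTa encoder. It assumes the Hugging Face `transformers` library; the model size, the example instance, and the encoding scheme are assumptions for the sketch, not the paper's exact configuration or its proposed architecture variations.

```python
# Illustrative sketch: Social IQA as 3-way multiple choice with RoBERTa.
# Not the paper's method; model size and example are placeholders.
import torch
from transformers import RobertaTokenizer, RobertaForMultipleChoice

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMultipleChoice.from_pretrained("roberta-large")

# A Social IQA-style instance: context + question paired with each candidate answer.
context = "Tracy didn't go home that evening and resisted Riley's attacks."
question = "What does Tracy need to do before this?"
answers = ["make a new plan", "go home and see Riley", "find somewhere to go"]

prompt = f"{context} {question}"
encoding = tokenizer(
    [prompt] * len(answers),  # repeat the prompt once per candidate
    answers,                  # candidate answer as the second segment
    return_tensors="pt",
    padding=True,
    truncation=True,
)
# Multiple-choice heads expect inputs of shape (batch_size, num_choices, seq_len).
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 3): one score per candidate
predicted = logits.argmax(dim=-1).item()
print(answers[predicted])
```

On top of a setup like this, the paper explores architecture variations and extensions and additional training on external commonsense corpora to improve Social IQA performance.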

Authors (6)
  1. Ting-Yun Chang (10 papers)
  2. Yang Liu (2253 papers)
  3. Karthik Gopalakrishnan (34 papers)
  4. Behnam Hedayatnia (27 papers)
  5. Pei Zhou (30 papers)
  6. Dilek Hakkani-Tur (94 papers)