Mitigating Shortcuts in Language Models with Soft Label Encoding (2309.09380v1)

Published 17 Sep 2023 in cs.CL and cs.LG

Abstract: Recent research has shown that LLMs rely on spurious correlations in the data for natural language understanding (NLU) tasks. In this work, we aim to answer the following research question: Can we reduce spurious correlations by modifying the ground truth labels of the training data? Specifically, we propose a simple yet effective debiasing framework, named Soft Label Encoding (SoftLE). We first train a teacher model with hard labels to determine each sample's degree of relying on shortcuts. We then add one dummy class to encode the shortcut degree, which is used to smooth other dimensions in the ground truth label to generate soft labels. This new ground truth label is used to train a more robust student model. Extensive experiments on two NLU benchmark tasks demonstrate that SoftLE significantly improves out-of-distribution generalization while maintaining satisfactory in-distribution accuracy.
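The abstract outlines the SoftLE pipeline but not its exact formulas, so the sketch below shows one plausible reading in PyTorch. The shortcut degree is approximated here by the teacher's softmax confidence on the ground-truth class, which is only a hypothetical proxy; the function and variable names (`soft_label_encode`, `teacher_logits`, `hard_labels`) are illustrative and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def soft_label_encode(teacher_logits: torch.Tensor,
                      hard_labels: torch.Tensor) -> torch.Tensor:
    """Build (K+1)-dim soft labels from teacher predictions.

    teacher_logits: (batch, K) raw scores from the teacher model.
    hard_labels:    (batch,) integer ground-truth classes in [0, K).
    Returns:        (batch, K+1) soft targets; index K is the dummy
                    class that encodes each sample's shortcut degree.
    """
    probs = F.softmax(teacher_logits, dim=-1)  # (batch, K)
    batch, num_classes = probs.shape

    # Hypothetical proxy for the shortcut degree: the teacher's
    # confidence on the true class (shortcut-reliant samples tend
    # to be predicted with high confidence).
    shortcut_degree = probs.gather(1, hard_labels.unsqueeze(1)).squeeze(1)

    soft = torch.zeros(batch, num_classes + 1)
    # The dummy class absorbs the shortcut degree ...
    soft[:, num_classes] = shortcut_degree
    # ... smoothing the mass left on the ground-truth class.
    soft[torch.arange(batch), hard_labels] = 1.0 - shortcut_degree
    return soft
```

Under this reading, the student is trained with a soft cross-entropy over K+1 outputs against these targets, so heavily shortcut-reliant samples contribute a weaker supervisory signal on the original classes.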

Authors (5)
  1. Zirui He (3 papers)
  2. Huiqi Deng (12 papers)
  3. Haiyan Zhao (42 papers)
  4. Ninghao Liu (98 papers)
  5. Mengnan Du (90 papers)
Citations (1)
