Accurate Use of Label Dependency in Multi-Label Text Classification Through the Lens of Causality (2310.07588v1)

Published 11 Oct 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Multi-Label Text Classification (MLTC) aims to assign the most relevant labels to each given text. Existing methods demonstrate that label dependency can help to improve the model's performance. However, the introduction of label dependency may cause the model to suffer from unwanted prediction bias. In this study, we attribute the bias to the model's misuse of label dependency, i.e., the model tends to utilize the correlation shortcut in label dependency rather than fusing text information and label dependency for prediction. Motivated by causal inference, we propose a CounterFactual Text Classifier (CFTC) to eliminate the correlation bias, and make causality-based predictions. Specifically, our CFTC first adopts the predict-then-modify backbone to extract precise label information embedded in label dependency, then blocks the correlation shortcut through the counterfactual de-bias technique with the help of the human causal graph. Experimental results on three datasets demonstrate that our CFTC significantly outperforms the baselines and effectively eliminates the correlation bias in datasets.
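The counterfactual de-biasing idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' CFTC implementation: the additive fusion of a text branch and a label-dependency branch, the function name `debiased_scores`, the weight `alpha`, and the toy logits are all assumptions made for illustration. The sketch subtracts the prediction the model would make from label dependency alone (the counterfactual, with the text contribution blanked out) from the prediction made with both inputs.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def debiased_scores(text_logits, dep_logits, alpha=1.0):
    """Hypothetical counterfactual de-biasing over per-label logits.

    text_logits: per-label scores from the text branch (assumed).
    dep_logits:  per-label scores from the label-dependency branch (assumed).
    alpha:       strength of the subtraction (assumed hyperparameter).
    """
    text_logits = np.asarray(text_logits, dtype=float)
    dep_logits = np.asarray(dep_logits, dtype=float)
    # Factual prediction: both branches contribute (additive fusion is an assumption).
    factual = text_logits + dep_logits
    # Counterfactual prediction: the text branch is blanked out, so only the
    # correlation shortcut through label dependency remains.
    counterfactual = dep_logits
    # Remove the shortcut's direct effect from the factual prediction.
    return factual - alpha * counterfactual


# Toy example with three labels; the second label is pushed up only by
# label co-occurrence, not by the text.
text_logits = np.array([2.0, -1.5, 0.3])
dep_logits = np.array([0.5, 3.0, -0.2])

biased = sigmoid(text_logits + dep_logits) > 0.5
debiased = sigmoid(debiased_scores(text_logits, dep_logits)) > 0.5
print("biased:  ", biased)
print("debiased:", debiased)
```

With additive fusion and `alpha=1.0` the subtraction simply recovers the text branch, which makes the effect easy to see. A real model would likely use a learned, nonlinear fusion, in which case the two predictions would have to be computed by separate forward passes.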

Authors (6)
  1. Caoyun Fan (8 papers)
  2. Wenqing Chen (16 papers)
  3. Jidong Tian (13 papers)
  4. Yitian Li (9 papers)
  5. Hao He (99 papers)
  6. Yaohui Jin (40 papers)
Citations (5)