Mitigating Biases of Large Language Models in Stance Detection with Counterfactual Augmented Calibration (2402.14296v3)
Abstract: Stance detection is critical for understanding the underlying position or attitude expressed toward a topic. Large language models (LLMs) have demonstrated significant advancements across various natural language processing tasks, including stance detection; however, their performance in stance detection is limited by biases and spurious correlations that stem from their data-driven nature. Our statistical experiment reveals that LLMs are prone to generating biased stances due to sentiment-stance spurious correlations and preferences toward certain individuals and topics. Furthermore, the results demonstrate a strong negative correlation between stance bias and stance detection performance, underscoring the importance of mitigating bias to enhance the utility of LLMs in stance detection. Therefore, in this paper, we propose a Counterfactual Augmented Calibration Network (FACTUAL), in which a novel calibration network is devised to calibrate potential bias in the stance predictions of LLMs. Further, to address the challenge of effectively learning bias representations and the difficulty of generalizing debiasing, we construct counterfactual augmented data. This approach enhances the calibration network, facilitating debiasing and out-of-domain generalization. Experimental results on in-target and zero-shot stance detection tasks show that the proposed FACTUAL can effectively mitigate the biases of LLMs, achieving state-of-the-art results.
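The abstract reports a strong negative correlation between stance bias and stance detection performance. As a minimal sketch of how such a relationship could be checked, the snippet below computes a Pearson correlation coefficient between a per-model bias score and a per-model F1 score. The function name `pearson`, the variables `bias_scores` and `f1_scores`, and all numeric values are hypothetical placeholders chosen for illustration; they are not the paper's bias metric or its results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT measurements from the paper:
# one bias score and one F1 score per (imaginary) model configuration.
bias_scores = [0.10, 0.20, 0.30, 0.40]
f1_scores = [0.80, 0.75, 0.65, 0.60]

r = pearson(bias_scores, f1_scores)
print(f"Pearson r = {r:.3f}")
```

A value of `r` close to -1 would indicate that higher bias scores go together with lower performance, which is the kind of relationship the abstract describes. How the bias score itself is defined is left open here.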
- Ang Li
- Jingqian Zhao
- Bin Liang
- Lin Gui
- Hui Wang
- Xi Zeng
- Kam-Fai Wong
- Ruifeng Xu
- Xingwei Liang