Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases (2307.01595v1)
Abstract: As the representation capability of Pre-trained Language Models (PLMs) improves, there is growing concern that they will inherit social biases from their unprocessed training corpora. Most previous debiasing techniques use Counterfactual Data Augmentation (CDA) to balance the training corpus. However, CDA only slightly modifies the original corpus, which keeps the representation distance between different demographic groups within a narrow range. As a result, the debiasing model easily overfits the differences between counterfactual pairs, which hurts its debiasing performance when text resources are limited. In this paper, we propose an adversarial-training-inspired two-stage debiasing model using Contrastive learning with Continuous Prompt Augmentation (named CCPA) to mitigate social biases in PLMs' encoding. In the first stage, we propose a data augmentation method based on continuous prompt tuning that pushes the representations of sample pairs from different demographic groups farther apart. In the second stage, we use contrastive learning to pull the representations of the augmented sample pairs closer together, fine-tuning the PLM's parameters to obtain a debiased encoding. Our approach guides the model toward stronger debiasing performance by adding difficulty to the training process. Extensive experiments show that CCPA outperforms baselines in debiasing performance. Meanwhile, results on the GLUE benchmark show that CCPA retains the language modeling capability of PLMs.
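To make the two-stage recipe concrete, below is a minimal PyTorch sketch of the push-then-pull idea described in the abstract. It is not the authors' implementation: the encoder choice (`bert-base-uncased`), the prefix length, the cosine objective in stage 1, the InfoNCE-style loss and temperature in stage 2, and helper names such as `encode_with_prefix` are all illustrative assumptions.

```python
# Minimal sketch of the CCPA two-stage idea (assumptions noted inline;
# the actual losses and hyperparameters are specified in the paper).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed backbone
enc = AutoModel.from_pretrained("bert-base-uncased")
enc.train()

# Continuous prompt: trainable prefix embeddings prepended to the input.
prefix_len = 8  # illustrative length
prefix = torch.nn.Parameter(0.02 * torch.randn(prefix_len, enc.config.hidden_size))

def encode_with_prefix(sentences):
    """Prepend the continuous prompt and return the [CLS] representation."""
    batch = tok(sentences, return_tensors="pt", padding=True, truncation=True)
    tok_emb = enc.embeddings.word_embeddings(batch["input_ids"])
    n = len(sentences)
    inputs = torch.cat([prefix.unsqueeze(0).expand(n, -1, -1), tok_emb], dim=1)
    mask = torch.cat(
        [torch.ones(n, prefix_len, dtype=batch["attention_mask"].dtype),
         batch["attention_mask"]], dim=1)
    return enc(inputs_embeds=inputs, attention_mask=mask).last_hidden_state[:, 0]

# CDA-style counterfactual pairs differing only in demographic terms.
group_a = ["the doctor said he would help", "my father works as a nurse"]
group_b = ["the doctor said she would help", "my mother works as a nurse"]

# Stage 1: freeze the PLM and train ONLY the prompt to push each pair
# apart (minimizing cosine similarity = maximizing representation distance).
enc.requires_grad_(False)
opt1 = torch.optim.Adam([prefix], lr=1e-3)
for _ in range(10):
    za, zb = encode_with_prefix(group_a), encode_with_prefix(group_b)
    push_loss = F.cosine_similarity(za, zb).mean()
    opt1.zero_grad()
    push_loss.backward()
    opt1.step()

# Stage 2: freeze the prompt and fine-tune the PLM with an InfoNCE-style
# contrastive loss that pulls each augmented pair back together.
prefix.requires_grad_(False)
enc.requires_grad_(True)
opt2 = torch.optim.Adam(enc.parameters(), lr=2e-5)
za, zb = encode_with_prefix(group_a), encode_with_prefix(group_b)
logits = F.normalize(za, dim=-1) @ F.normalize(zb, dim=-1).T / 0.05  # temp assumed
pull_loss = F.cross_entropy(logits, torch.arange(len(group_a)))
opt2.zero_grad()
pull_loss.backward()
opt2.step()
```

In a real run, stage 1 would iterate over a CDA-balanced corpus and stage 2 would fine-tune the PLM over many batches; the single toy batch above only illustrates the two-stage structure in which the prompt first widens the gap and contrastive fine-tuning then closes it.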
Authors: Yingji Li, Mengnan Du, Xin Wang, Ying Wang