DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules (2305.13406v3)
Abstract: Existing large language models (LLMs) that mainly focus on Standard American English (SAE) often perform significantly worse when applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters that handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single-task and instruction-finetuned LLMs, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.
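To make the "dynamic aggregation" idea concrete, below is a minimal PyTorch sketch, not the authors' implementation: it assumes standard bottleneck adapters, one per linguistic feature, whose outputs are fused per token with an AdapterFusion-style attention layer. The class names, hyperparameters, and placement inside the backbone are all illustrative assumptions.

```python
# Hypothetical sketch of feature-adapter composition via attention.
# Shapes, bottleneck size, and layer placement are assumptions, not the
# paper's exact configuration.

import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Standard down-project / nonlinearity / up-project adapter with a residual."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(self.act(self.down(hidden)))


class DynamicAdapterAggregation(nn.Module):
    """Per-token attention over the outputs of several feature-specific adapters."""

    def __init__(self, hidden_size: int, num_adapters: int, bottleneck: int = 64):
        super().__init__()
        self.adapters = nn.ModuleList(
            [BottleneckAdapter(hidden_size, bottleneck) for _ in range(num_adapters)]
        )
        # Query from the layer input; keys/values from each adapter's output.
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_size)
        outs = torch.stack([a(hidden) for a in self.adapters], dim=2)  # (B, T, A, H)
        q = self.query(hidden).unsqueeze(2)                            # (B, T, 1, H)
        k = self.key(outs)                                             # (B, T, A, H)
        v = self.value(outs)                                           # (B, T, A, H)
        scores = (q * k).sum(-1) / hidden.size(-1) ** 0.5              # (B, T, A)
        weights = scores.softmax(dim=-1).unsqueeze(-1)                 # (B, T, A, 1)
        return hidden + (weights * v).sum(dim=2)                       # (B, T, H)


if __name__ == "__main__":
    layer = DynamicAdapterAggregation(hidden_size=768, num_adapters=5)
    x = torch.randn(2, 16, 768)   # e.g. hidden states from a frozen SAE-trained encoder
    print(layer(x).shape)         # torch.Size([2, 16, 768])
```

In the setting the abstract describes, a module like this would be inserted into a frozen SAE-trained backbone (single-task or instruction-finetuned), letting the attention weights decide, token by token, which linguistic-feature adapters to apply; the snippet only illustrates that composition mechanism.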