MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning (2306.15826v1)
Abstract: Fine-tuning large-scale pre-trained language models has proven effective for a wide range of NLP tasks. Previous studies have established that incorporating adversarial training during the fine-tuning stage can significantly enhance model generalization and robustness. From the perspective of game theory, however, such uses of adversarial training correspond to pure-strategy games, which are inherently limited in the scope of strategies they admit and therefore leave room for improvement. To push the performance boundary, we propose a novel Mixed-strategy Adversarial Training algorithm (MAT). Methodologically, we derive the Nash equilibrium of a mixed-strategy game for adversarial training using Entropy Mirror Descent, and establish MAT as a sampling-based method. To verify the effectiveness of MAT, we conducted extensive benchmark experiments on large-scale pre-trained models such as BERT and RoBERTa. MAT significantly outperforms state-of-the-art methods on both the GLUE and ANLI benchmarks in terms of generalization and robustness.
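The abstract describes MAT only at a high level, so the following is a minimal, hypothetical PyTorch-style sketch of the general idea rather than the authors' algorithm: instead of computing a single worst-case perturbation (a pure strategy), several perturbations are sampled and refined with noisy gradient ascent (a Langevin-style sampler standing in for the entropy-mirror-descent-derived sampling the abstract refers to), and the model is updated on the clean loss plus the average adversarial loss. The function name, the assumption that `model` maps input embeddings to logits, and all hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F


def mixed_strategy_adv_step(model, embeds, labels, optimizer,
                            n_samples=4, ascent_steps=2,
                            step_size=1e-3, noise_scale=1e-4, adv_weight=1.0):
    """One fine-tuning step on precomputed input embeddings `embeds`.

    `model` is assumed to map embeddings of shape (batch, seq, dim) to logits.
    All names and hyperparameters here are illustrative, not the paper's.
    """
    model.train()

    # Clean loss on the unperturbed embeddings.
    clean_loss = F.cross_entropy(model(embeds), labels)

    # Draw several perturbations; each is refined by a few noisy ascent steps.
    adv_losses = []
    for _ in range(n_samples):
        delta = torch.zeros_like(embeds).normal_(0, noise_scale).requires_grad_(True)
        for _ in range(ascent_steps):
            loss = F.cross_entropy(model(embeds + delta), labels)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                # Gradient ascent on the perturbation plus Gaussian noise:
                # the noise keeps the perturbations spread out as samples from
                # a distribution rather than collapsing to a single
                # (pure-strategy) maximizer.
                delta += step_size * grad / (grad.norm() + 1e-12)
                delta += noise_scale * torch.randn_like(delta)
            delta.requires_grad_(True)
        adv_losses.append(F.cross_entropy(model(embeds + delta.detach()), labels))

    # Model update: clean loss plus the average loss over sampled perturbations.
    total_loss = clean_loss + adv_weight * torch.stack(adv_losses).mean()
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```

Averaging the loss over several sampled perturbations is what distinguishes this mixed-strategy reading from pure-strategy adversarial fine-tuning (e.g. PGD-style methods), which optimizes against only the single strongest perturbation.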
- Better fine-tuning by reducing representational collapse. In ICLR, 2021.
- Invariant risk minimization games. In ICML, 2020.
- Generalization and equilibrium in generative adversarial nets (GANs). In ICML, 2017.
- The second PASCAL recognising textual entailment challenge. In Proceedings of the Second PASCAL Challenges Workshop on Recognising Textual Entailment, 2006.
- Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 2003.
- The fifth PASCAL recognizing textual entailment challenge. In Proceedings of the Second Text Analysis Conference, 2009.
- A large annotated corpus for learning natural language inference. In EMNLP, 2015.
- Language models are few-shot learners. In NeurIPS, 2020.
- SemEval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017.
- The PASCAL recognising textual entailment challenge. In First PASCAL Machine Learning Challenges Workshop, 2005.
- Training GANs with optimism. In ICLR, 2018.
- BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.
- Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing, 2005.
- The third PASCAL recognizing textual entailment challenge. In Proceedings of the ACL-PASCAL@ACL 2007 Workshop on Textual Entailment and Paraphrasing, 2007.
- Generative adversarial nets. In NeurIPS, 2014.
- Explaining and harnessing adversarial examples. In ICLR, 2015.
- RMSProp: Divide the gradient by a running average of its recent magnitude. Coursera, 2012.
- Finding mixed Nash equilibria of generative adversarial networks. In ICML, 2019.
- First quora dataset release: Question pairs. Technical report, Quora, 2017.
- SMART: robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. In ACL, 2020.
- Is BERT really robust? A strong baseline for natural language attack on text classification and entailment. In AAAI, 2020.
- Adam: A method for stochastic optimization. In ICLR, 2015.
- Hector J. Levesque. The Winograd schema challenge. In AAAI Spring Symposium, 2011.
- Datasets: A community library for natural language processing. In EMNLP, 2021.
- Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In AAAI, 2016.
- Deep text classification can be fooled. In IJCAI, 2018.
- RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692, 2019.
- Adversarial training for large neural language models. CoRR, abs/2004.08994, 2020.
- Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
- Adversarial training methods for semi-supervised text classification. In ICLR, 2017.
- John Nash. Non-cooperative games. Annals of Mathematics, 54(2):286–295, 1951.
- Arkadi Nemirovski and D. Yudin. Problem complexity and method efficiency in optimization. Wiley, 1983.
- Adversarial NLI: A new benchmark for natural language understanding. In ACL, 2020.
- PyTorch: An imperative style, high-performance deep learning library. In NeurIPS, 2019.
- Improving language understanding by generative pre-training. OpenAI blog, 2018.
- Language models are unsupervised multitask learners. OpenAI blog, 2019.
- Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 2020.
- SQuAD: 100,000+ questions for machine comprehension of text. In EMNLP, 2016.
- Adversarial training for free! In NeurIPS, 2019.
- Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP, 2013.
- Intriguing properties of neural networks. In ICLR, 2014.
- FEVER: a large-scale dataset for fact extraction and verification. In NAACL-HLT, 2018.
- GLUE: A multi-task benchmark and analysis platform for natural language understanding. In ICLR, 2019.
- Transferable adversarial examples can efficiently fool topic models. Computers & Security, 118:102749, 2022.
- Neural network acceptability judgments. Transactions of the Association for Computational Linguistics, 2019.
- Bayesian learning via stochastic gradient Langevin dynamics. In ICML, 2011.
- A broad-coverage challenge corpus for sentence understanding through inference. In NAACL-HLT, 2018.
- Transformers: State-of-the-art natural language processing. In EMNLP, 2020.
- You only propagate once: Accelerating adversarial training via maximal principle. In NeurIPS, 2019.
- EvaLDA: Efficient evasion attacks towards latent Dirichlet allocation. In AAAI, 2021.
- FreeLB: Enhanced adversarial training for natural language understanding. In ICLR, 2020.
- Adversarial regularization as stackelberg game: An unrolled optimization approach. In EMNLP, 2021.