Adapters Mixup: Mixing Parameter-Efficient Adapters to Enhance the Adversarial Robustness of Fine-tuned Pre-trained Text Classifiers (2401.10111v2)
Abstract: Prior work shows that augmenting the training data of pre-trained language models (PLMs) fine-tuned for classification via parameter-efficient fine-tuning (PEFT) methods with both clean and adversarial examples can enhance their robustness under adversarial attacks. However, this adversarial-training paradigm often degrades performance on clean inputs and requires frequent re-training on the entire data to account for new, unknown attacks. To overcome these challenges while still harnessing the benefits of adversarial training and the efficiency of PEFT, this work proposes a novel approach, called AdpMixup, that combines two paradigms: (1) fine-tuning through adapters and (2) adversarial augmentation via mixup, to dynamically leverage existing knowledge from a set of pre-known attacks for robust inference. Intuitively, AdpMixup fine-tunes PLMs with multiple adapters on both clean and pre-known adversarial examples and intelligently mixes them in different ratios during prediction. Our experiments show that AdpMixup achieves the best trade-off between training efficiency and robustness under both pre-known and unknown attacks, compared to existing baselines on five downstream tasks across six varied black-box attacks and two PLMs. All source code will be made available.
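The core mechanism the abstract describes, interpolating the parameters of a clean adapter and several attack-specific adapters at inference time, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the plain-list parameter representation, and the particular mixing ratios are all assumptions for the sake of the example.

```python
# Illustrative sketch of mixup-style adapter interpolation (assumed API,
# not the paper's actual code): each adapter is a dict mapping a parameter
# name to a flat list of weights, and mixing is a convex combination.

def mix_adapters(adapters, ratios):
    """Return a parameter-wise weighted average of several adapters.

    adapters: list of dicts, parameter name -> list of floats
    ratios:   mixing coefficients, assumed to sum to 1
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "mixing ratios should sum to 1"
    mixed = {}
    for name in adapters[0]:
        params = [adp[name] for adp in adapters]
        mixed[name] = [
            sum(r * p[i] for r, p in zip(ratios, params))
            for i in range(len(params[0]))
        ]
    return mixed

# One adapter fine-tuned on clean data, two on pre-known adversarial attacks
# (toy two-weight adapters, purely for illustration).
clean = {"down_proj": [1.0, 2.0]}
adv_a = {"down_proj": [3.0, 2.0]}
adv_b = {"down_proj": [5.0, 6.0]}

# Weight the clean adapter most heavily; the result would be loaded into
# the frozen PLM before prediction.
mixed = mix_adapters([clean, adv_a, adv_b], ratios=[0.5, 0.25, 0.25])
# mixed == {"down_proj": [2.5, 3.0]}
```

In practice the mixed parameters would be loaded into the adapter slots of a frozen PLM before the forward pass, so robustness to a suspected attack can be dialed in per input without any re-training.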
Authors: Tuc Nguyen, Thai Le