
Adversarial Training for Large Neural Language Models (2004.08994v2)

Published 20 Apr 2020 in cs.CL

Abstract: Generalization and robustness are both key desiderata for designing machine learning methods. Adversarial training can enhance robustness, but past work often finds it hurts generalization. In NLP, pre-training large neural language models such as BERT has demonstrated impressive gains in generalization for a variety of tasks, with further improvement from adversarial fine-tuning. However, these models are still vulnerable to adversarial attacks. In this paper, we show that adversarial pre-training can improve both generalization and robustness. We propose a general algorithm ALUM (Adversarial training for large neural LangUage Models), which regularizes the training objective by applying perturbations in the embedding space that maximize the adversarial loss. We present the first comprehensive study of adversarial training in all stages, including pre-training from scratch, continual pre-training on a well-trained model, and task-specific fine-tuning. ALUM obtains substantial gains over BERT on a wide range of NLP tasks, in both regular and adversarial scenarios. Even for models that have been well trained on extremely large text corpora, such as RoBERTa, ALUM can still produce significant gains from continual pre-training, whereas conventional non-adversarial methods cannot. ALUM can be further combined with task-specific fine-tuning to attain additional gains. The ALUM code is publicly available at https://github.com/namisan/mt-dnn.

Authors (7)
  1. Xiaodong Liu (162 papers)
  2. Hao Cheng (190 papers)
  3. Pengcheng He (60 papers)
  4. Weizhu Chen (128 papers)
  5. Yu Wang (939 papers)
  6. Hoifung Poon (61 papers)
  7. Jianfeng Gao (344 papers)
Citations (177)

Summary

Adversarial Training for Large Neural Language Models

The paper "Adversarial Training for Large Neural Language Models" explores the application of adversarial training to large neural language models, covering both the pre-training and fine-tuning stages. The authors introduce the ALUM (Adversarial training for large neural LangUage Models) algorithm, claiming it can significantly enhance both generalization and robustness across various NLP tasks.
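The core idea can be stated in virtual-adversarial-training form. The following is a sketch of the objective in our own notation, not the paper's verbatim equation:

\[
\min_{\theta}\; \mathbb{E}_{(x,y)}\Big[\, \ell\big(f(x;\theta),\, y\big) \;+\; \alpha \max_{\|\delta\| \le \epsilon} \ell\big(f(x+\delta;\theta),\, f(x;\theta)\big) \Big]
\]

Here \(x\) is the embedded input, \(\delta\) is a perturbation applied in the embedding space, \(\ell\) is a divergence such as KL between output distributions, and the inner maximization is approximated by a few projected gradient ascent steps on \(\delta\). Crucially, the regularizer compares the perturbed prediction to the model's own clean prediction rather than to a gold label, which is what allows it to be applied during self-supervised pre-training.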

Core Contributions

The paper makes several key contributions:

  1. ALUM Algorithm: ALUM augments standard training by implementing adversarial perturbations in the embedding space, effectively tackling robustness without sacrificing generalization. Unlike previous adversarial techniques that limit their application to task-specific fine-tuning, ALUM encompasses all training stages, including pre-training.
  2. Comprehensive Evaluation: ALUM is evaluated across prominent NLP benchmarks, such as GLUE, ANLI, and SQuAD, demonstrating superior performance over the corresponding BERT and RoBERTa baselines. Notably, it yields significant improvements even for RoBERTa, which typically shows diminishing returns from additional pre-training without an adversarial component.
  3. Integration with Fine-Tuning: The paper highlights that combining ALUM with task-specific adversarial fine-tuning leads to further gains, showcasing its utility in robustifying and optimizing model performance in adversarial settings.
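The inner maximization described above can be illustrated with a toy numpy sketch. This is not the authors' implementation (which operates on transformer embeddings with a symmetrized KL term); the model here is a single softmax layer, and all names and hyperparameters are illustrative. For this toy model the gradient of KL(p || q) with respect to the perturbation has the closed form W^T (q - p), so no autodiff is needed.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def alum_perturbation(W, x, steps=3, eta=0.1, eps=0.5, sigma=1e-3, seed=0):
    """ALUM-style inner maximization (toy sketch): find a small delta in
    embedding space that maximizes KL(p || q_delta), where
    p = softmax(W x) is the fixed clean prediction and
    q_delta = softmax(W (x + delta)) is the perturbed prediction."""
    rng = np.random.default_rng(seed)
    p = softmax(W @ x)                             # clean prediction (held fixed)
    delta = sigma * rng.standard_normal(x.shape)   # small random initialization
    for _ in range(steps):
        q = softmax(W @ (x + delta))
        # d KL(p || q) / d delta = W^T (q - p) for this single-layer model
        grad = W.T @ (q - p)
        delta = delta + eta * grad / (np.linalg.norm(grad) + 1e-8)  # normalized ascent step
        norm = np.linalg.norm(delta)
        if norm > eps:                             # project back onto the eps-ball
            delta = delta * (eps / norm)
    return delta
```

In the outer loop (not shown), the resulting KL(p || q_delta) would be scaled by a coefficient alpha and added to the task loss before the usual gradient step on the model parameters, matching the min-max structure of the objective.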

Strong Results

ALUM's effectiveness is illustrated by consistent performance improvements across multiple tasks. For instance, the model shows substantial gains over BERT on standard datasets such as MNLI and SQuAD, as well as on adversarial challenges like HellaSwag, ANLI, and Adversarial SQuAD. These improvements in both adversarial and regular settings underscore ALUM's ability to reconcile the commonly observed tension between generalization and robustness.

Implications and Future Directions

The dual focus on enhancing generalization and robustness has practical implications for deploying NLP systems in real-world scenarios where resistance to adversarial attacks can be critical. The approach also paves the way for broader adoption of adversarial training in language pre-training, suggesting that applying adversarial techniques in self-supervised settings may avoid the generalization-robustness conflict often observed in supervised learning.

Future research directions include reducing the computational overhead of adversarial training, which stems from the extra gradient computations of the inner maximization, and developing further acceleration techniques. Extending ALUM to other domains or to architectures beyond transformer-based models may also prove beneficial.

Overall, this paper contributes valuable insights into leveraging adversarial training within large-scale language models, emphasizing the method's potential to broadly enhance NLP applications. The public release of the ALUM code facilitates further advances by the research community, fostering continued exploration and innovation in the field.
