
On Robustness of Finetuned Transformer-based NLP Models (2305.14453v2)

Published 23 May 2023 in cs.CL

Abstract: Transformer-based pretrained models like BERT, GPT-2 and T5 have been finetuned for a large number of NLP tasks, and have been shown to be very effective. However, while finetuning, what changes across layers in these models with respect to pretrained checkpoints is under-studied. Further, how robust are these models to perturbations in input text? Does the robustness vary depending on the NLP task for which the models have been finetuned? While there exists some work on studying the robustness of BERT finetuned for a few NLP tasks, there is no rigorous study that compares this robustness across encoder only, decoder only and encoder-decoder models. In this paper, we characterize changes between pretrained and finetuned LLM representations across layers using two metrics: CKA and STIR. Further, we study the robustness of three LLMs (BERT, GPT-2 and T5) with eight different text perturbations on classification tasks from the General Language Understanding Evaluation (GLUE) benchmark, and generation tasks like summarization, free-form generation and question generation. GPT-2 representations are more robust than BERT and T5 across multiple types of input perturbation. Although models exhibit good robustness broadly, dropping nouns, verbs or changing characters are the most impactful. Overall, this study provides valuable insights into perturbation-specific weaknesses of popular Transformer-based models, which should be kept in mind when passing inputs. We make the code and models publicly available [https://github.com/PavanNeerudu/Robustness-of-Transformers-models].
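The abstract's layer-wise comparison relies on CKA (Centered Kernel Alignment), which scores the similarity of two sets of activations for the same inputs. As a rough illustration (not the authors' code; function name and shapes are my own), linear CKA can be sketched as:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices.

    X: (n_samples, d1) activations, e.g. a pretrained model's layer outputs
    Y: (n_samples, d2) activations, e.g. the finetuned model's same layer
    Returns a similarity score in [0, 1]; 1 means representations are
    identical up to an orthogonal transform and isotropic scaling.
    """
    # Center each feature dimension across samples
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)
```

A low CKA score at some layer indicates that finetuning has substantially changed that layer's representation relative to the pretrained checkpoint, which is the kind of per-layer change the paper characterizes.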

Authors (5)
  1. Pavan Kalyan Reddy Neerudu (2 papers)
  2. Subba Reddy Oota (21 papers)
  3. Mounika Marreddy (9 papers)
  4. Venkateswara Rao Kagita (11 papers)
  5. Manish Gupta (67 papers)
Citations (6)