Interpreting the Robustness of Neural NLP Models to Textual Perturbations (2110.07159v2)

Published 14 Oct 2021 in cs.CL

Abstract: Modern NLP models are known to be sensitive to input perturbations and their performance can decrease when applied to real-world, noisy data. However, it is still unclear why models are less robust to some perturbations than others. In this work, we test the hypothesis that the extent to which a model is affected by an unseen textual perturbation (robustness) can be explained by the learnability of the perturbation (defined as how well the model learns to identify the perturbation with a small amount of evidence). We further give a causal justification for the learnability metric. We conduct extensive experiments with four prominent NLP models -- TextRNN, BERT, RoBERTa and XLNet -- over eight types of textual perturbations on three datasets. We show that a model which is better at identifying a perturbation (higher learnability) becomes worse at ignoring such a perturbation at test time (lower robustness), providing empirical support for our hypothesis.

Authors: Yunxiang Zhang, Liangming Pan, Samson Tan, Min-Yen Kan
