Syntactic Data Augmentation Increases Robustness to Inference Heuristics (2004.11999v1)

Published 24 Apr 2020 in cs.CL

Abstract: Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets. We hypothesize that this issue is not primarily caused by the pretrained model's limitations, but rather by the paucity of crowdsourced NLI examples that might convey the importance of syntactic structure at the fine-tuning stage. We explore several methods to augment standard training sets with syntactically informative examples, generated by applying syntactic transformations to sentences from the MNLI corpus. The best-performing augmentation method, subject/object inversion, improved BERT's accuracy on controlled examples that diagnose sensitivity to word order from 0.28 to 0.73, without affecting performance on the MNLI test set. This improvement generalized beyond the particular construction used for data augmentation, suggesting that augmentation causes BERT to recruit abstract syntactic representations.

Authors (5)
  1. Junghyun Min (6 papers)
  2. R. Thomas McCoy (33 papers)
  3. Dipanjan Das (42 papers)
  4. Emily Pitler (11 papers)
  5. Tal Linzen (73 papers)
Citations (170)

Summary

Syntactic Data Augmentation and Robustness in Natural Language Inference

The paper "Syntactic Data Augmentation Increases Robustness to Inference Heuristics" addresses a prominent issue in the field of NLP, specifically related to Natural Language Inference (NLI). Neural models like BERT, when fine-tuned for NLI tasks, often exhibit high accuracy on conventional datasets yet lack sensitivity to syntactic structures in controlled challenge sets. This paper investigates the root of this discrepancy and proposes syntactic data augmentation as an effective intervention.

Hypotheses: Representational Inadequacy and Missed Connection

The authors posit two primary explanations for BERT's unsatisfactory performance on syntactic sensitivity tasks:

  1. Representational Inadequacy Hypothesis: BERT's pretrained representations may simply lack the syntactic information needed for NLI.
  2. Missed Connection Hypothesis: BERT may have acquired syntactic features during pretraining, but fine-tuning fails to connect them to the NLI task because the crowdsourced training data contains too few examples in which syntactic structure is decisive.

Methodology: Syntactic Augmentation Approaches

The researchers explore syntactic data augmentation by supplementing standard NLI datasets with syntactically informative examples derived from the MNLI corpus. These examples are generated through syntactic transformations, particularly inversion and passivization. The most promising augmentation method is subject/object inversion, whereby subjects and objects in sentences are swapped to create new training examples.
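For concreteness, the following is a minimal sketch of how inversion-style examples could be generated with an off-the-shelf dependency parser. It assumes spaCy and its en_core_web_sm model, handles only simple transitive sentences, and is an illustration of the general idea rather than the authors' implementation.

```python
# Illustrative sketch only: not the authors' code. Assumes spaCy's English
# model ("en_core_web_sm") and covers only simple transitive sentences.
import spacy

nlp = spacy.load("en_core_web_sm")

def invert_subject_object(sentence: str):
    """Swap the subject and direct object of a simple transitive sentence.

    Returns the inverted sentence, or None when no clear nsubj/dobj pair is
    found (a production system would need far more robust handling).
    """
    doc = nlp(sentence)
    root = next((t for t in doc if t.dep_ == "ROOT"), None)
    if root is None:
        return None
    subj = next((t for t in root.children if t.dep_ == "nsubj"), None)
    obj = next((t for t in root.children if t.dep_ == "dobj"), None)
    if subj is None or obj is None:
        return None

    # Take the full subtrees so modifiers move with their heads.
    subj_span = doc[subj.left_edge.i : subj.right_edge.i + 1]
    obj_span = doc[obj.left_edge.i : obj.right_edge.i + 1]
    if subj_span.start >= obj_span.start:
        return None  # only handle the common subject-before-object order

    tokens = [t.text for t in doc]
    tokens[subj_span.start : subj_span.end] = [obj_span.text]
    shift = 1 - len(subj_span)  # subject slot collapsed to a single element
    tokens[obj_span.start + shift : obj_span.end + shift] = [subj_span.text]
    return " ".join(tokens)

# e.g. "The lawyer saw the actor." -> "the actor saw The lawyer ."
# (the sketch does not fix casing or detokenization). The original and the
# inverted sentence can then be paired as an augmentation example whose label
# reflects that the inverted sentence generally does not follow from the original.
print(invert_subject_object("The lawyer saw the actor."))
```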

Results: Enhanced Performance and Generalization

The paper demonstrates significant improvements in syntactic sensitivity when the BERT training set is augmented with approximately 400 inversion-generated examples, a mere 0.1% increase in the size of the MNLI training set. Accuracy on challenging syntactic examples rose from 0.28 to 0.73, showing that even a small amount of syntactic augmentation can yield a substantially more robust model. Furthermore, the improvement generalized beyond the specific construction used for augmentation, implying that BERT recruits abstract syntactic representations in the process.
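As a rough sanity check on the stated scale (assuming the commonly cited MNLI training-set size of roughly 393,000 pairs):

\[
\frac{400}{392{,}702} \approx 1.0 \times 10^{-3} \approx 0.1\%
\]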

Implications and Future Directions

The research suggests that small syntactic augmentations can induce substantial changes in model behavior, helping the model move past entrenched heuristics such as inferring entailment from lexical overlap. Challenges remain for constructions like passives, where limited improvement points to possible representational inadequacy. Future work could extend syntactic augmentation beyond MNLI-derived examples to broader corpora, testing scalability and domain transfer, and could address construction-specific limitations through more diverse augmentation strategies.

Understanding the interface between pretraining and fine-tuning in pretrained language models remains vital for improving sensitivity to syntactic nuance. By shaping model behavior through syntactic data augmentation, this paper paves the way for more linguistically grounded applications in NLI and in NLP more broadly.