An Analysis of SSMix: A Saliency-Based Span Mixup Approach for Text Classification
This paper proposes SSMix, a novel data augmentation technique tailored for text classification tasks in NLP. Mixup-based data augmentation has been highly successful in computer vision, but its direct application to NLP is challenging due to the discrete nature and variable lengths of text sequences. SSMix departs from prior hidden-level mixup variants by performing the mixup operation directly on the input text rather than on hidden representations.
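For contrast, the hidden-level baseline that SSMix moves away from can be sketched in a few lines: representations and soft labels are linearly interpolated with a mixing weight. This is a schematic of the TMix/EmbedMix family only; the function and tensor names are illustrative, not from the paper.

```python
import torch

def hidden_level_mixup(h_a: torch.Tensor, h_b: torch.Tensor,
                       y_a: torch.Tensor, y_b: torch.Tensor, lam: float):
    """Interpolate hidden states and one-hot labels (TMix/EmbedMix style)."""
    h_mixed = lam * h_a + (1 - lam) * h_b  # blend intermediate representations
    y_mixed = lam * y_a + (1 - lam) * y_b  # blend soft label distributions
    return h_mixed, y_mixed
```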
SSMix leverages saliency information to decide which spans to replace, aiming to preserve the most semantically significant tokens. The method combines two original texts by replacing a span of one with a span from the other, where spans are chosen by their contribution to the model's prediction as measured by gradient-based saliency. This attention to salient features both maintains the locality of the source texts and improves the semantic consistency of the synthesized examples.
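A minimal sketch of this idea follows, assuming a Hugging Face-style PyTorch classifier that accepts `inputs_embeds`. The function names (`token_saliency`, `ssmix_pair`) and the exact span-selection details are illustrative simplifications of the paper's procedure, not its reference implementation.

```python
import torch
import torch.nn.functional as F

def token_saliency(model, input_ids, attention_mask, labels):
    """Gradient-based saliency: L2 norm of d(loss)/d(token embedding)."""
    embeds = model.get_input_embeddings()(input_ids).detach().requires_grad_(True)
    logits = model(inputs_embeds=embeds, attention_mask=attention_mask).logits
    F.cross_entropy(logits, labels).backward()
    return embeds.grad.norm(dim=-1)  # shape: (batch, seq_len)

def ssmix_pair(ids_a, ids_b, sal_a, sal_b, span_len):
    """Replace the least salient span of A with the most salient span of B.

    ids_* and sal_* are 1-D tensors for a single example.
    """
    # Sum saliency over every candidate span of length span_len.
    win_a = sal_a.unfold(0, span_len, 1).sum(-1)
    win_b = sal_b.unfold(0, span_len, 1).sum(-1)
    start_a = win_a.argmin().item()  # span in A contributing least to its prediction
    start_b = win_b.argmax().item()  # span in B contributing most to its prediction
    mixed = ids_a.clone()
    mixed[start_a:start_a + span_len] = ids_b[start_b:start_b + span_len]
    lam = 1.0 - span_len / ids_a.size(0)  # weight kept for A's original label
    return mixed, lam
```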
The authors evaluate SSMix across a suite of established text classification benchmarks, including sentiment classification, textual entailment, and question-type classification. The experiments show that SSMix consistently outperforms existing mixup techniques that operate on hidden representations, improving generalization and robustness, with particular strength on datasets with larger label sets and on sentence-pair tasks.
Key empirical findings indicate that SSMix's input-level mixup covers a broader synthetic data space than linear-interpolation methods such as EmbedMix and TMix. The results suggest that SSMix is most effective when sufficient training data is available and when tasks involve many class labels, since cross-label augmentation increases diversity. Furthermore, saliency-based span selection lets SSMix splice in the text segments most relevant to model predictions.
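On the label side, mixed examples are trained with the standard mixup objective: the loss is a weighted sum of the cross-entropy against the two source labels, with the weight set by the mixing ratio (for span mixup, the fraction of tokens each source contributes, consistent with the `lam` returned in the sketch above). A minimal version under the same illustrative naming:

```python
import torch.nn.functional as F

def mixup_loss(logits, y_a, y_b, lam):
    """Cross-entropy against both source labels, weighted by the mix ratio.

    lam is the fraction of the mixed input still drawn from example A,
    e.g. the value returned by the hypothetical ssmix_pair above.
    """
    return lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)
```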
SSMix's design is further examined through an ablation study that deconstructs the contributions of its two core components: saliency-based selection and the span restriction. The findings confirm that both elements, independently and together, improve performance, underscoring the value of saliency information for span selection and of maintaining span-level integrity in the mixup operation.
In summary, SSMix represents a promising advance in NLP data augmentation, successfully adapting input-level mixup to the discrete, structured nature of text. By focusing on token saliency, SSMix produces meaningful augmented samples that respect the intricacies of text classification. This work invites further exploration of saliency-based augmentation in other NLP settings, including text generation and semi-supervised learning, and future research can extend SSMix to a broader range of models and architectures, paving the way for refined regularization techniques in NLP.