An Overview of "Negative Data Augmentation"
The paper "Negative Data Augmentation" explores a novel approach to data augmentation in machine learning by introducing the concept of Negative Data Augmentation (NDA). Unlike the traditional data augmentation techniques that focus on generating in-distribution samples to expand the dataset, NDA involves intentionally creating out-of-distribution samples. These negative samples are crucial because they deliver information regarding the boundaries of the data distribution, and the paper suggests that this knowledge can be instrumental in enhancing generative modeling and representation learning.
Core Contributions
The paper presents several key advancements:
- Negative Data Augmentation for GANs: The authors integrate NDA into the Generative Adversarial Network (GAN) setting by formulating a new training objective in which negative samples serve as an additional source of "fake" data for the discriminator. They show, both theoretically and empirically, that under suitable conditions this biases the generator away from undesirable samples while still allowing it to recover the true data distribution (a minimal sketch of such a loss appears after this list). Empirically, the method improves conditional and unconditional image generation as well as anomaly detection.
- Contrastive Learning with NDA: Extending the idea to self-supervised learning, the paper also introduces NDA into a contrastive learning framework. The modified contrastive predictive coding (CPC) objective uses negative data to push representations of in-support samples away from those of NDA-transformed samples (a sketch of this loss also follows the list). This improves performance on downstream tasks such as image classification, object detection, and action recognition.
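As a rough illustration of the GAN variant above, here is a minimal PyTorch-style sketch of a discriminator loss in which NDA samples are treated as additional fakes. The function name, the `nda_transform` hook, and the mixing weight `lambda_` are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def nda_discriminator_loss(D, real, fake, nda_transform, lambda_=0.5):
    """GAN discriminator loss with Negative Data Augmentation.

    NDA transforms of real images are treated as extra "fake" data, so the
    discriminator learns to assign low probability to out-of-support
    regions. Hypothetical sketch: `lambda_` and `nda_transform` are
    assumptions, not the paper's reference implementation.
    """
    negatives = nda_transform(real)      # e.g. Jigsaw-shuffled real images
    logits_real = D(real)
    logits_fake = D(fake.detach())
    logits_nda = D(negatives)
    loss_real = F.binary_cross_entropy_with_logits(
        logits_real, torch.ones_like(logits_real))
    # The fake side is a mixture: lambda_ * p_G + (1 - lambda_) * p_NDA.
    loss_fake = lambda_ * F.binary_cross_entropy_with_logits(
        logits_fake, torch.zeros_like(logits_fake))
    loss_nda = (1 - lambda_) * F.binary_cross_entropy_with_logits(
        logits_nda, torch.zeros_like(logits_nda))
    return loss_real + loss_fake + loss_nda
```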
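Likewise, a hedged sketch of the contrastive idea: an InfoNCE-style loss in which embeddings of NDA-transformed images act as extra negatives. All names and the temperature value are illustrative assumptions; the paper's actual objective is a CPC variant and may differ in detail.

```python
import torch
import torch.nn.functional as F

def nda_info_nce(anchor, positive, nda_negatives, temperature=0.1):
    """InfoNCE-style contrastive loss with NDA samples as extra negatives.

    `anchor`/`positive` are (N, d) embeddings of two views of the same
    images; `nda_negatives` are (M, d) embeddings of NDA-transformed
    images. A sketch under stated assumptions, not the paper's exact loss.
    """
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    nda_negatives = F.normalize(nda_negatives, dim=1)
    # Positive logits: similarity between matched views.
    pos = (anchor * positive).sum(dim=1, keepdim=True)        # (N, 1)
    # Negative logits: other in-batch samples plus NDA samples.
    neg_batch = anchor @ positive.t()                         # (N, N)
    mask = torch.eye(len(anchor), dtype=torch.bool, device=anchor.device)
    neg_batch = neg_batch.masked_fill(mask, float('-inf'))    # drop self-pairs
    neg_nda = anchor @ nda_negatives.t()                      # (N, M)
    logits = torch.cat([pos, neg_batch, neg_nda], dim=1) / temperature
    # The correct "class" is always column 0, the matched positive.
    labels = torch.zeros(len(anchor), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```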
Results and Implications
The empirical studies span a range of tasks and datasets, including CIFAR-10, CIFAR-100, and ImageNet-100, among others. Results consistently improve across these benchmarks when NDA is applied. For instance, in GAN-based image generation, using spatially corruptive transformations as NDA led to markedly lower Fréchet Inception Distance (FID) scores, indicating better sample quality.
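To make "spatially corruptive transformation" concrete, below is a small sketch of a Jigsaw-style NDA transform that shuffles image patches, preserving local texture while destroying global structure. The function name and grid size are assumptions; the paper evaluates several transformations of this kind.

```python
import torch

def jigsaw_nda(images, grid=2):
    """Spatially corruptive NDA transform: shuffle image patches.

    Splits each image in a (B, C, H, W) batch into a grid x grid jigsaw
    and permutes the tiles. Illustrative sketch; this shows only one
    Jigsaw-style variant of the transforms the paper studies.
    """
    b, c, h, w = images.shape
    ph, pw = h // grid, w // grid
    # Cut H and W into tiles: (B, C, grid, grid, ph, pw).
    tiles = images.unfold(2, ph, ph).unfold(3, pw, pw)
    tiles = tiles.reshape(b, c, grid * grid, ph, pw)
    # Shuffle tile order with one random permutation per batch.
    perm = torch.randperm(grid * grid, device=images.device)
    tiles = tiles[:, :, perm]
    # Reassemble the shuffled tiles into full images.
    tiles = tiles.reshape(b, c, grid, grid, ph, pw)
    return tiles.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)
```

A batch produced this way, e.g. `negatives = jigsaw_nda(real_batch)`, could feed the NDA side of the discriminator loss sketched earlier.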
For contrastive learning, integrating NDA yielded notable gains in learned representations for both image and video data. The accompanying theory explains why: NDA samples encode prior knowledge about what the model should avoid, steering the learning algorithm toward better solutions.
Theoretical Foundation and Analysis
The paper provides rigorous theoretical justification for the proposed methods. It shows that adding NDA to the GAN objective does not change the generator's optimal solution in the infinite-data regime, while effectively filtering out suboptimal solutions in practical finite-data settings (a reconstruction of the objective follows). Similarly, for contrastive learning, a theorem shows that NDA encourages the learned representations to separate in-support samples from out-of-support samples, improving robustness.
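For reference, here is a reconstruction of the modified GAN objective (notation is assumed rather than quoted from the paper): the discriminator is trained against a mixture of the generator distribution and the negative distribution.

```latex
% NDA-GAN objective (reconstruction; notation is an assumption):
% the fake distribution is the mixture \lambda p_G + (1-\lambda)\bar{p},
% where \bar{p} is the NDA (negative) distribution and 0 < \lambda \le 1.
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{x \sim \lambda p_G + (1-\lambda)\bar{p}}\big[\log\big(1 - D(x)\big)\big]
```

When the negative distribution has support disjoint from the data distribution, the mixture attains the same optimum as the standard objective at p_G = p_data, which is the sense in which, per the paper's analysis, the optimal generator is unchanged.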
Future Directions
The research opens up various future avenues in AI and machine learning:
- Exploration of Novel NDA Strategies: Future research could explore a broader range of tasks and domains for NDA, including natural language processing and other non-image data modalities.
- Enhancing Representation Learning: Beyond GANs, further work could apply NDA to other representation learning frameworks, sharpening their ability to distinguish in-distribution from out-of-distribution inputs.
- Weak Supervision and Inductive Bias: By formalizing how knowledge of invalid data informs learning algorithms, future work could adapt these methods to semi-supervised and weakly supervised settings where labeled data is scarce or costly to obtain.
In sum, the paper establishes both the practicality and the theoretical soundness of leveraging negative, out-of-distribution samples to improve generative modeling and representation learning. It contributes a novel perspective on data augmentation that opens new directions for research and application in AI.