AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

Published 24 Mar 2021 in cs.CV and cs.AI | (2103.13027v6)

Abstract: Data mixing augmentation has proved effective in improving the generalization ability of deep neural networks. While early methods mix samples by hand-crafted policies (e.g., linear interpolation), recent methods utilize saliency information to match the mixed samples and labels via complex offline optimization. However, there arises a trade-off between precise mixing policies and optimization complexity. To address this challenge, we propose a novel automatic mixup (AutoMix) framework, where the mixup policy is parameterized and serves the ultimate classification goal directly. Specifically, AutoMix reformulates the mixup classification into two sub-tasks (i.e., mixed sample generation and mixup classification) with corresponding sub-networks and solves them in a bi-level optimization framework. For the generation, a learnable lightweight mixup generator, Mix Block, is designed to generate mixed samples by modeling patch-wise relationships under the direct supervision of the corresponding mixed labels. To prevent the degradation and instability of bi-level optimization, we further introduce a momentum pipeline to train AutoMix in an end-to-end manner. Extensive experiments on nine image benchmarks prove the superiority of AutoMix over state-of-the-art methods in various classification scenarios and downstream tasks.

Citations (63)

Summary

  • The paper presents AutoMix, a framework that reformulates mixup training into interconnected tasks to enhance label consistency and computational efficiency.
  • It integrates a lightweight Mix Block with cross-attention and a Momentum Pipeline to stabilize optimization and accelerate convergence.
  • Extensive experiments demonstrate significant improvements in accuracy and robustness on datasets like CIFAR, ImageNet, and fine-grained classification benchmarks.

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

AutoMix introduces an innovative approach to address the persistent challenge in data augmentation for deep neural networks (DNNs)—the trade-off between optimal mixup policies and computational efficiency. This paper proposes a novel framework, AutoMix, that reformulates mixup-based training into two interconnected sub-tasks: mixed sample generation and mixup classification. The framework integrates these tasks within a bi-level optimization setup, offering significant improvements over prior methods in terms of both accuracy and computational overhead.
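For context, the hand-crafted linear-interpolation policy that AutoMix moves beyond can be sketched in a few lines. This is a minimal NumPy sketch of the original mixup baseline, not the AutoMix method itself; the function name and the Beta-distribution sampling follow the standard mixup recipe.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Hand-crafted mixup baseline: linearly interpolate two inputs and
    their one-hot labels with a ratio lam drawn from Beta(alpha, alpha)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))
    x = lam * x1 + (1.0 - lam) * x2   # pixel-wise linear interpolation
    y = lam * y1 + (1.0 - lam) * y2   # matching soft label
    return x, y, lam
```

AutoMix keeps the interpolated label `y` but replaces the uniform pixel-wise interpolation with a learned, patch-wise mixing mask, which is where the label-mismatch problem of this baseline is addressed.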

Framework and Methodology

AutoMix employs a parametric mixup strategy through its core module, the Mix Block (MB). MB uses a cross-attention mechanism to generate adaptive masks that ensure mixed samples retain discriminative features aligned with their mixed labels. The model bypasses the complex offline optimizations typical of previous saliency-based methods, instead using a learnable lightweight mixup generator that operates under the direct supervision of the mixed labels. Incorporating feature maps into the generation process makes the patches selected for mixing more relevant, addressing the label-mismatch issues prevalent in traditional handcrafted methods.
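The patch-wise idea can be sketched as follows. This is illustrative only: the actual Mix Block uses learned projections and upsamples the mask to pixel resolution, neither of which is modeled here, and `patch_mix_mask` and its ratio-biasing scheme are assumptions for the sketch. Cross-attention scores between the two images' patch features are converted into a soft per-patch mask, so patches are weighted by relevance rather than mixed uniformly.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def patch_mix_mask(f1, f2, lam):
    """Illustrative cross-attention mask over P patches.
    f1, f2: (P, C) patch features of the two images; lam: target mix ratio.
    Returns a per-patch weight for image 1, each in (0, 1)."""
    d = f1.shape[1]
    sim = f1 @ f2.T / np.sqrt(d)                    # (P, P) patch affinities
    salience = softmax(sim, axis=-1).mean(axis=-1)  # (P,) attention-derived salience
    # Two-way soft assignment, biased toward the requested ratio lam
    logits = np.stack([salience + np.log(lam + 1e-8),
                       -salience + np.log(1.0 - lam + 1e-8)])
    return softmax(logits, axis=0)[0]               # (P,) weights for image 1
```

A mixed sample would then be assembled per patch as `mask * patches1 + (1 - mask) * patches2`, while the label remains the simple interpolation `lam * y1 + (1 - lam) * y2`, keeping sample content and label consistent.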

To stabilize the bi-level optimization process and prevent degradation, AutoMix introduces a Momentum Pipeline. This mechanism uses a momentum-based updating rule to decouple the training process, allowing the classification task and mixup generation to be optimized simultaneously without entanglement issues. This end-to-end training strategy enables AutoMix to achieve faster convergence and higher performance without substantial computational costs.
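The stabilizing mechanism resembles the exponential moving average used in momentum-based self-supervised pipelines: a slowly-updated copy of the encoder provides stable features for the Mix Block while the online encoder trains normally. A minimal sketch, assuming plain arrays for parameters and an illustrative coefficient of 0.999 (not a value quoted from the paper):

```python
import numpy as np

def momentum_update(online_params, target_params, m=0.999):
    """EMA update: the momentum (target) network slowly tracks the online
    encoder, so the mixup generator trains against stable features instead
    of a rapidly changing classifier. Both arguments are lists of arrays."""
    return [m * t + (1.0 - m) * o
            for o, t in zip(online_params, target_params)]
```

Applying this after every optimizer step decouples the two sub-tasks: gradients never flow through the target network, which is what prevents the entanglement the paper describes.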

Experimental Results

Extensive testing on nine image classification benchmarks underscores AutoMix's superiority. Across varied tasks and network architectures, AutoMix consistently outperforms state-of-the-art mixup techniques. Specifically, the framework demonstrates marked improvements in generalization and accuracy on standard benchmarks such as CIFAR-10/100, Tiny-ImageNet, and ImageNet-1k, as well as in fine-grained classification scenarios such as CUB-200 and FGVC-Aircraft. Robustness studies further validate AutoMix's ability to handle data perturbations, maintaining stability under common corruptions and adversarial attacks.

Implications and Future Directions

AutoMix represents a significant stride towards enhancing mixup strategies for DNNs. Its ability to generate label-consistent samples dynamically and efficiently broadens the scope for practical applications in AI. The successful fusion of generation and classification tasks within a single framework opens pathways for exploring mixup augmentation in unsupervised and semi-supervised learning contexts.

Future research could leverage the modularity of AutoMix to extend mixup strategies beyond conventional classification tasks. Investigating its applicability in other domains, such as object detection and semantic segmentation, or exploring its potential in multimodal tasks would be valuable directions. Additionally, further refinement of the cross-attention mechanism could yield even more precise mixing procedures, enhancing its utility in complex real-world datasets.

In essence, AutoMix provides a robust and efficient solution to the challenges of mixup-based sample augmentation, with promising applications that transcend traditional boundaries in deep learning methodologies.