
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective (2212.13381v5)

Published 27 Dec 2022 in cs.LG and cs.CV

Abstract: Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
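For readers unfamiliar with the baseline the paper improves on, the following is a minimal sketch of vanilla Mixup as described in the abstract: each training batch is convexly combined with a shuffled copy of itself, with the mixing weight drawn from a Beta distribution. This illustrates the interpolation step only, not the paper's MixupE method; the function name, the alpha default, and the one-hot label format are illustrative assumptions.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Vanilla Mixup: linearly interpolate a batch with a shuffled copy of itself.

    x: (batch, ...) array of inputs.
    y: (batch, num_classes) array of one-hot labels.
    alpha: Beta-distribution concentration parameter (illustrative default;
           the best value is task-dependent).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))      # random pairing of examples
    x_mix = lam * x + (1.0 - lam) * x[perm]   # interpolate inputs
    y_mix = lam * y + (1.0 - lam) * y[perm]   # interpolate labels identically
    return x_mix, y_mix
```

Training then proceeds on `(x_mix, y_mix)` in place of the raw batch. The paper's analysis shows this interpolation implicitly regularizes directional derivatives of all orders, and MixupE strengthens that implicit regularization.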

Authors (9)
  1. Yingtian Zou (12 papers)
  2. Vikas Verma (20 papers)
  3. Sarthak Mittal (21 papers)
  4. Wai Hoh Tang (3 papers)
  5. Hieu Pham (35 papers)
  6. Juho Kannala (108 papers)
  7. Yoshua Bengio (601 papers)
  8. Arno Solin (90 papers)
  9. Kenji Kawaguchi (147 papers)
Citations (7)
