Does Data Augmentation Lead to Positive Margin? (1905.03177v1)

Published 8 May 2019 in cs.LG and stat.ML

Abstract: Data augmentation (DA) is commonly used during model training, as it significantly improves test error and model robustness. DA artificially expands the training set by applying random noise, rotations, crops, or even adversarial perturbations to the input data. Although DA is widely used, its capacity to provably improve robustness is not fully understood. In this work, we analyze the robustness that DA begets by quantifying the margin that DA enforces on empirical risk minimizers. We first focus on linear separators, and then a class of nonlinear models whose labeling is constant within small convex hulls of data points. We present lower bounds on the number of augmented data points required for non-zero margin, and show that commonly used DA techniques may only introduce significant margin after adding exponentially many points to the data set.
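
To make the quantity the abstract refers to concrete, here is a minimal sketch (not the authors' code; `augment`, `perceptron`, and `margin` are hypothetical helper names, and the noise level and cluster geometry are illustrative): train an empirical risk minimizer, here a plain perceptron that stops at zero training error and so makes no effort to maximize margin on its own, with and without additive-noise augmentation, then measure the geometric margin it attains on the original points.

```python
# Minimal sketch, NumPy only. This illustrates the setup the paper studies,
# not its proofs: does noise DA force an ERM to have positive margin?
import numpy as np

rng = np.random.default_rng(0)

# Two linearly separable Gaussian clusters in the plane.
X = np.vstack([rng.normal([2.0, 2.0], 0.3, size=(20, 2)),
               rng.normal([-2.0, -2.0], 0.3, size=(20, 2))])
y = np.hstack([np.ones(20), -np.ones(20)])

def augment(X, y, copies=10, sigma=0.4):
    """Additive-Gaussian-noise DA: `copies` noisy copies of every point."""
    Xa = np.vstack([X] + [X + rng.normal(0.0, sigma, X.shape)
                          for _ in range(copies)])
    ya = np.tile(y, copies + 1)
    return Xa, ya

def perceptron(X, y, epochs=1000):
    """An empirical risk minimizer for the 0-1 loss: the perceptron updates
    only on mistakes and halts at zero training error, so any margin it
    attains on the original points is whatever the data forces on it."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:
                w, b, mistakes = w + yi * xi, b + yi, mistakes + 1
        if mistakes == 0:
            break
    return w, b

def margin(w, b, X, y):
    """Geometric margin of the separator (w, b) on the *original* points."""
    return float(np.min(y * (X @ w + b)) / np.linalg.norm(w))

w0, b0 = perceptron(X, y)
wa, ba = perceptron(*augment(X, y))
print(f"margin without DA: {margin(w0, b0, X, y):.3f}")
print(f"margin with noise DA:  {margin(wa, ba, X, y):.3f}")
```

On this toy example the augmented run typically reports a visibly larger margin on the original points. The paper's lower bounds concern exactly this gap: how many augmented points an ERM must see before a non-zero margin is guaranteed, and why common DA schemes may require exponentially many.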

Authors (5)
  1. Shashank Rajput (17 papers)
  2. Zhili Feng (22 papers)
  3. Zachary Charles (33 papers)
  4. Po-Ling Loh (43 papers)
  5. Dimitris Papailiopoulos (59 papers)
Citations (37)
