
Deflating Dataset Bias Using Synthetic Data Augmentation (2004.13866v1)

Published 28 Apr 2020 in cs.CV, cs.LG, and eess.IV

Abstract: Deep learning has seen an unprecedented increase in vision applications since the publication of large-scale object recognition datasets and the introduction of scalable compute hardware. State-of-the-art methods for most vision tasks for Autonomous Vehicles (AVs) rely on supervised learning and often fail to generalize to domain shifts and/or outliers. Dataset diversity is thus key to successful real-world deployment. No matter how large the dataset, capturing the long tails of the distribution pertaining to task-specific environmental factors is impractical. The goal of this paper is to investigate the use of targeted synthetic data augmentation - combining the benefits of gaming engine simulations and sim2real style transfer techniques - for filling gaps in real datasets for vision tasks. Empirical studies on three different computer vision tasks of practical use to AVs - parking slot detection, lane detection and monocular depth estimation - consistently show that having synthetic data in the training mix provides a significant boost in cross-dataset generalization performance compared to training on real data only, for the same training set size.
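
The abstract's comparison holds the training set size fixed while changing its composition. A minimal sketch of one way to build such a mix is shown below. The function name `build_training_mix`, the `synthetic_fraction` parameter, and the uniform random sampling are assumptions made for illustration; the abstract does not state how the authors selected or proportioned their samples.

```python
import random


def build_training_mix(real, synthetic, synthetic_fraction, seed=0):
    """Return a shuffled training list with the same size as `real`,
    in which a fraction of the real samples is swapped for synthetic ones.

    Hypothetical helper: the sampling strategy is an assumption,
    not the procedure used in the paper.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1]")

    rng = random.Random(seed)
    total = len(real)                      # fixed training set size
    n_synthetic = round(total * synthetic_fraction)
    if n_synthetic > len(synthetic):
        raise ValueError("not enough synthetic samples for this fraction")

    # Draw without replacement from each pool, then shuffle the union.
    mix = rng.sample(real, total - n_synthetic) + rng.sample(synthetic, n_synthetic)
    rng.shuffle(mix)
    return mix


if __name__ == "__main__":
    real = [("real", i) for i in range(100)]
    synthetic = [("syn", i) for i in range(50)]
    mix = build_training_mix(real, synthetic, synthetic_fraction=0.2)
    print(len(mix), sum(1 for source, _ in mix if source == "syn"))
```

Training once on `real` alone and once on the returned mix, then evaluating both models on a different dataset, would mirror the cross-dataset comparison the abstract describes.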

Authors (8)
  1. Nikita Jaipuria (7 papers)
  2. Xianling Zhang (6 papers)
  3. Rohan Bhasin (3 papers)
  4. Mayar Arafa (1 paper)
  5. Punarjay Chakravarty (27 papers)
  6. Shubham Shrivastava (15 papers)
  7. Sagar Manglani (4 papers)
  8. Vidya N. Murali (3 papers)
Citations (55)
