Deflating Dataset Bias Using Synthetic Data Augmentation (2004.13866v1)
Abstract: Deep Learning has seen an unprecedented increase in vision applications since the publication of large-scale object recognition datasets and the introduction of scalable compute hardware. State-of-the-art methods for most vision tasks for Autonomous Vehicles (AVs) rely on supervised learning and often fail to generalize to domain shifts and/or outliers. Dataset diversity is thus key to successful real-world deployment. No matter how large the dataset, capturing the long tails of the distribution pertaining to task-specific environmental factors is impractical. The goal of this paper is to investigate the use of targeted synthetic data augmentation - combining the benefits of gaming engine simulations and sim2real style transfer techniques - for filling gaps in real datasets for vision tasks. Empirical studies on three computer vision tasks of practical use to AVs - parking slot detection, lane detection and monocular depth estimation - consistently show that including synthetic data in the training mix provides a significant boost in cross-dataset generalization performance compared to training on real data only, for the same training set size.
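A key detail in the abstract's experimental setup is that synthetic samples replace, rather than add to, real samples, so the total training set size stays fixed. The sketch below shows one way such a mixed training set could be assembled in PyTorch; the helper name, the `synth_fraction` parameter, and the dataset objects are illustrative assumptions, not part of the paper.

```python
import random
from torch.utils.data import ConcatDataset, Subset

def make_mixed_training_set(real_ds, synth_ds, total_size, synth_fraction=0.5, seed=0):
    """Hypothetical helper: mix real and synthetic (sim2real style-transferred)
    samples into a training set of fixed total size."""
    rng = random.Random(seed)
    n_synth = min(int(total_size * synth_fraction), len(synth_ds))
    n_real = min(total_size - n_synth, len(real_ds))

    real_idx = rng.sample(range(len(real_ds)), n_real)
    synth_idx = rng.sample(range(len(synth_ds)), n_synth)

    # Keeping the overall size constant means any change in cross-dataset
    # generalization can be attributed to the synthetic samples themselves,
    # not to simply having more training data.
    return ConcatDataset([Subset(real_ds, real_idx), Subset(synth_ds, synth_idx)])
```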
- Nikita Jaipuria
- Xianling Zhang
- Rohan Bhasin
- Mayar Arafa
- Punarjay Chakravarty
- Shubham Shrivastava
- Sagar Manglani
- Vidya N. Murali