
Dataset Condensation with Gradient Matching (2006.05929v3)

Published 10 Jun 2020 in cs.CV and cs.LG

Abstract: As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive. This paper proposes a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense large dataset into a small set of informative synthetic samples for training deep neural networks from scratch. We formulate this goal as a gradient matching problem between the gradients of deep neural network weights that are trained on the original and our synthetic data. We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods. Finally we explore the use of our method in continual learning and neural architecture search and report promising gains when limited memory and computations are available.

Authors (3)
  1. Bo Zhao (242 papers)
  2. Konda Reddy Mopuri (19 papers)
  3. Hakan Bilen (62 papers)
Citations (394)

Summary

Insights on "Dataset Condensation with Gradient Matching"

The paper "Dataset Condensation with Gradient Matching" by Bo Zhao, Konda Reddy Mopuri, and Hakan Bilen introduces a novel approach aimed at addressing the burgeoning computational demands in the field of machine learning, particularly those posed by the necessity of large datasets for training state-of-the-art models. This work proposes a method called Dataset Condensation, which seeks to compress a large dataset into a significantly smaller one while retaining the informative essence of the original data. This condensation process, formulated as a gradient matching problem, enables training models from scratch using the condensed dataset with performance metrics comparable to training on the original dataset.

Key Contributions and Methodology

The foundation of the Dataset Condensation method is to cast training set synthesis as an optimization problem: the gradients of the network weights computed on the synthetic data are matched to those computed on the original data. The synthetic samples are refined iteratively so that the gradient signals they produce align with those of the real data, and the matching is performed at multiple points along the network's training trajectory and over multiple random initializations, so that the condensed set remains informative throughout training rather than only at a single snapshot of the weights.
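The following is a minimal, illustrative PyTorch sketch of a single gradient-matching update, not the authors' released implementation; the toy network, data shapes, and the simplified cosine distance are assumptions made for brevity.

```python
# Minimal sketch of one gradient-matching step (not the authors' official code).
# A toy network stands in for the CNN used in the paper; random tensors stand in
# for real MNIST-sized data.
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
params = [p for p in net.parameters() if p.requires_grad]

# Learnable synthetic set: here 10 images per class with fixed labels (assumed sizes).
syn_x = torch.randn(100, 1, 28, 28, requires_grad=True)
syn_y = torch.arange(10).repeat_interleave(10)
opt_syn = torch.optim.SGD([syn_x], lr=0.1)

# A real mini-batch (random placeholders for real data).
real_x, real_y = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))

def grad_distance(g_syn, g_real):
    # Sum of (1 - cosine similarity) over per-parameter gradients:
    # a simplified stand-in for the paper's layer-wise distance.
    d = 0.0
    for gs, gr in zip(g_syn, g_real):
        d = d + (1 - F.cosine_similarity(gs.flatten(), gr.flatten(), dim=0))
    return d

# Gradients on the real batch are targets, so no graph is kept.
loss_real = F.cross_entropy(net(real_x), real_y)
g_real = [g.detach() for g in torch.autograd.grad(loss_real, params)]

# Gradients on the synthetic batch keep the graph so the matching loss
# can be backpropagated into the synthetic pixels.
loss_syn = F.cross_entropy(net(syn_x), syn_y)
g_syn = torch.autograd.grad(loss_syn, params, create_graph=True)

opt_syn.zero_grad()
grad_distance(g_syn, g_real).backward()
opt_syn.step()  # updates syn_x
```

In the full algorithm this step alternates with training the network on the synthetic set, and the whole procedure is repeated over many random initializations of the network so the condensed images do not overfit to a single set of weights.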

In terms of numerical results, the paper evaluates the method across several computer vision benchmarks and reports consistently better performance than prior state-of-the-art techniques. With only a handful of synthetic images per class it comes close to full-dataset accuracy on MNIST, and it shows substantial gains on harder datasets such as CIFAR-10 and SVHN over classical coreset selection methods and the earlier Dataset Distillation technique of Wang et al. (2018).

Implications and Prospects

This research has several implications for efficiency and scalability in machine learning. By shrinking the dataset substantially without discarding the information needed for training, the method saves computational resources and storage and accelerates training, which is particularly valuable when compute is limited or expensive.

Moreover, the potential applications of Dataset Condensation extend beyond efficient training. The paper explores its use in continual learning, where memory constraints limit how much past data can be stored, and reports promising improvements when condensed samples serve as the rehearsal memory. It also applies the idea to neural architecture search, where training candidate architectures on a condensed proxy set allows many architectures to be screened quickly with reduced computational overhead.
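As a rough illustration of the continual-learning use case (a sketch under assumed names and interfaces, not code from the paper), a condensed set can simply take the place of a raw replay buffer:

```python
# Illustrative sketch: replaying a condensed memory while training on a new task.
# `net`, `opt`, and `task_loader` are assumed to be defined by the caller;
# `memory_x`/`memory_y` hold the small synthetic set produced by condensation
# instead of raw stored examples from past tasks.
import torch.nn.functional as F

def train_task(net, opt, task_loader, memory_x, memory_y, steps=100):
    data_iter = iter(task_loader)
    for _ in range(steps):
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(task_loader)
            x, y = next(data_iter)
        loss = F.cross_entropy(net(x), y)
        if memory_x is not None:  # replay the condensed memory alongside new data
            loss = loss + F.cross_entropy(net(memory_x), memory_y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The appeal is that the memory footprint per past task is fixed at a few synthetic images per class, rather than growing with the amount of stored real data.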

Future Directions

While the current method is effective on the benchmarks studied, applying it to more diverse data, such as high-resolution images with large intra-class variation, may reveal further insights and optimizations. Scaling to such datasets efficiently will require addressing the preservation of image structure and semantic consistency in the synthesized samples.

In conclusion, "Dataset Condensation with Gradient Matching" provides a substantial contribution to the ongoing discourse on efficient data utilization in machine learning. By reducing the dependency on large datasets while maintaining competitive performance standards, it opens up pathways for more sustainable and accessible machine learning practices. Future work building on this foundation might delve into extending the method’s robustness across broader applications and investigating the theoretical underpinnings of gradient matching in even more complex learning environments.
