Dataset Distillation in Latent Space (2311.15547v1)
Abstract: Dataset distillation (DD) is a newly emerging research area that aims to alleviate the heavy computational load of training models on large datasets. It seeks to distill a large dataset into a small, condensed one such that models trained on the distilled dataset perform comparably on downstream tasks to models trained on the full dataset. Three key problems hinder the performance and applicability of existing DD methods: high time complexity, high space complexity, and low info-compactness. In this work, we attempt to address all three problems simultaneously by moving the DD process from the conventionally used pixel space to latent space. Encoded by a pretrained generic autoencoder, latent codes are naturally info-compact representations of the original images at a much smaller size. After transferring three mainstream DD algorithms to latent space, we significantly reduce time and space consumption while achieving similar performance, which allows us to distill high-resolution datasets or target larger data ratios at which previous methods fail. Moreover, within the same storage budget we can deliver more latent codes than pixel-level images, which further boosts the performance of our methods.
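To make the latent-space workflow concrete, the sketch below is a minimal, hypothetical illustration (not the authors' released implementation) of transferring one representative DD algorithm, gradient matching, into latent space. The names `encoder`, `make_classifier`, and `real_loader`, as well as the latent shape and hyperparameters, are illustrative assumptions: real images are first mapped into the latent space of a frozen pretrained autoencoder, and the learnable synthetic set is a small tensor of latent codes rather than full-resolution pixels.

```python
# Minimal sketch of latent-space dataset distillation via gradient matching.
# Assumptions (not from the paper): `encoder` is any frozen, pretrained autoencoder
# encoder mapping images (B, 3, H, W) -> latents (B, C, h, w); `make_classifier` is a
# factory returning a randomly initialized classifier over latents; `real_loader`
# yields labelled image batches from the full dataset.
import torch
import torch.nn.functional as F

def match_gradients(model, syn_latents, syn_labels, real_latents, real_labels):
    """One gradient-matching step: align the classifier gradients computed on
    synthetic latents with those computed on real latents."""
    loss_real = F.cross_entropy(model(real_latents), real_labels)
    g_real = torch.autograd.grad(loss_real, model.parameters())
    loss_syn = F.cross_entropy(model(syn_latents), syn_labels)
    g_syn = torch.autograd.grad(loss_syn, model.parameters(), create_graph=True)
    # Cosine distance between flattened gradient pairs (a common matching loss).
    return sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
               for a, b in zip(g_syn, g_real))

@torch.no_grad()
def encode_batch(encoder, images):
    """Map a batch of images into the compact latent space of the frozen encoder."""
    return encoder(images)

def distill_in_latent_space(encoder, make_classifier, real_loader,
                            ipc=10, num_classes=10, latent_shape=(4, 8, 8),
                            outer_steps=1000, lr=0.1, device="cuda"):
    # Learnable synthetic latent codes, `ipc` per class -- far smaller than images.
    syn_latents = torch.randn(ipc * num_classes, *latent_shape,
                              device=device, requires_grad=True)
    syn_labels = torch.arange(num_classes, device=device).repeat_interleave(ipc)
    opt = torch.optim.SGD([syn_latents], lr=lr, momentum=0.5)

    for _ in range(outer_steps):
        model = make_classifier().to(device)   # fresh random classifier on latents
        images, labels = next(iter(real_loader))
        real_latents = encode_batch(encoder, images.to(device))
        loss = match_gradients(model, syn_latents, syn_labels,
                               real_latents, labels.to(device))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return syn_latents.detach(), syn_labels
```

This is a heavily simplified single-loop version; practical gradient-matching methods typically match per class and interleave inner training steps. The point it illustrates is the abstract's argument: because each latent code is several times smaller than its pixel-space counterpart, distillation is cheaper in time and memory, and the same storage budget admits more synthetic samples per class.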