Distilling Datasets Into Less Than One Image (2403.12040v1)
Abstract: Dataset distillation aims to compress a dataset into a much smaller one so that a model trained on the distilled dataset achieves high accuracy. Current methods frame this as maximizing the distilled classification accuracy for a budget of K distilled images-per-class, where K is a positive integer. In this paper, we push the boundaries of dataset distillation by compressing the dataset into less than an image-per-class. It is important to realize that the meaningful quantity is not the number of distilled images-per-class but the number of distilled pixels-per-dataset. We therefore propose Poster Dataset Distillation (PoDD), a new approach that distills the entire original dataset into a single poster. The poster approach motivates new technical solutions for creating training images and learnable labels. With less than an image-per-class, our method achieves comparable or better performance than existing methods that use one image-per-class. Specifically, it establishes a new state-of-the-art on CIFAR-10, CIFAR-100, and CUB200 using as few as 0.3 images-per-class.
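To make the pixels-per-dataset framing concrete, the sketch below illustrates one way a single learnable poster could yield per-class training images via overlapping crops paired with learnable soft labels. This is a minimal illustration, not the paper's implementation; the tensor sizes, the sliding-window crop scheme, and names such as `poster`, `soft_labels`, and `extract_class_crops` are assumptions made here for clarity.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: a 3-channel poster and CIFAR-style 32x32 crops.
num_classes, crop = 10, 32

# The entire distilled "dataset" is one learnable poster plus learnable soft labels.
# 64 x 128 = 8192 pixels < 10 * 32 * 32 = 10240, i.e. below one image-per-class.
poster = torch.randn(3, 64, 128, requires_grad=True)
soft_labels = torch.randn(num_classes, num_classes, requires_grad=True)

def extract_class_crops(poster, num_classes, crop):
    """Slide overlapping windows across the poster, one crop per class."""
    _, H, W = poster.shape
    stride = (W - crop) // max(num_classes - 1, 1)  # overlap is what saves pixels
    return torch.stack([poster[:, :crop, i * stride:i * stride + crop]
                        for i in range(num_classes)])

def distilled_batch():
    """Derive training images and labels from the poster on the fly."""
    images = extract_class_crops(poster, num_classes, crop)
    labels = F.softmax(soft_labels, dim=1)  # learnable soft labels per crop
    return images, labels

# A distillation loop would optimize `poster` and `soft_labels` so that a model
# trained on distilled_batch() approaches training on the full dataset.
images, labels = distilled_batch()
print(images.shape, labels.shape)  # torch.Size([10, 3, 32, 32]) torch.Size([10, 10])
```

Because adjacent crops share pixels, the total pixel budget shrinks smoothly as the overlap grows, which is what allows fractional images-per-class such as 0.3.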