Distilling Datasets Into Less Than One Image (2403.12040v1)

Published 18 Mar 2024 in cs.CV

Abstract: Dataset distillation aims to compress a dataset into a much smaller one so that a model trained on the distilled dataset achieves high accuracy. Current methods frame this as maximizing the distilled classification accuracy for a budget of K distilled images-per-class, where K is a positive integer. In this paper, we push the boundaries of dataset distillation, compressing the dataset into less than an image-per-class. It is important to realize that the meaningful quantity is not the number of distilled images-per-class but the number of distilled pixels-per-dataset. We therefore propose Poster Dataset Distillation (PoDD), a new approach that distills the entire original dataset into a single poster. The poster approach motivates new technical solutions for creating training images and learnable labels. Our method can achieve comparable or better performance with less than an image-per-class compared to existing methods that use one image-per-class. Specifically, our method establishes a new state-of-the-art performance on CIFAR-10, CIFAR-100, and CUB200 using as little as 0.3 images-per-class.


Summary

  • The paper presents PoDD, a novel approach that compresses full datasets into a single composite poster using less than one image per class.
  • It introduces two algorithms, PoCO for semantic class ordering and PoDDL for soft-label assignment, to make effective use of overlapping patches.
  • The method achieves state-of-the-art results on the CIFAR-10, CIFAR-100, and CUB200 benchmarks with as little as 0.3 images per class (IPC).

Exploring the Frontiers of Dataset Distillation: Beyond One Image Per Class

Introduction to Poster Dataset Distillation (PoDD)

Dataset distillation compresses a large dataset into a much smaller one that still trains accurate models, and the field has advanced considerably in recent years. The contribution discussed here, Poster Dataset Distillation (PoDD), pushes past the convention of keeping at least one distilled image per class (IPC): it distills an entire dataset into a single shared "poster" that represents all classes with less than 1 IPC. The underlying insight is that the meaningful budget is not the number of distilled images per class but the number of distilled pixels per dataset, so PoDD concentrates on using those pixels efficiently.
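To make the pixels-per-dataset framing concrete, a rough back-of-the-envelope budget comparison is sketched below. The 32x32 resolution matches CIFAR-style inputs, but the specific poster shape is an illustrative assumption, not the paper's exact configuration.

```python
# Rough pixel-budget comparison: images-per-class (IPC) vs. a single shared poster.
# Assumes CIFAR-100-style 32x32 images; the paper's actual poster sizes may differ.

num_classes = 100
pixels_per_image = 32 * 32                      # spatial pixels in one distilled image

# Conventional distillation at 1 IPC stores one image per class.
budget_1ipc = num_classes * pixels_per_image    # 102,400 pixels

# PoDD at an effective 0.3 IPC keeps roughly 30% of that budget in one poster.
poster_pixels = int(0.3 * budget_1ipc)          # 30,720 pixels

# One poster shape with exactly that many pixels (purely illustrative): 160 x 192.
assert 160 * 192 == poster_pixels
print(budget_1ipc, poster_pixels)
```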

Poster Dataset Distillation Defined

PoDD distills a dataset into a representation well under one IPC. Rather than optimizing discrete per-class images, it optimizes a single composite image, or "poster", that encapsulates the entire dataset. Because the poster is shared, pixels can serve multiple classes at once, which saves space and reduces redundancy.
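A minimal sketch of the basic mechanic is given below: training images are cropped from the shared poster with a stride smaller than the patch size, so neighboring crops overlap and reuse pixels. The patch size, stride, and poster shape here are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def extract_overlapping_patches(poster: np.ndarray, patch: int = 32, stride: int = 16):
    """Slide a patch-sized window over the poster; stride < patch means pixels are shared."""
    h, w, _ = poster.shape
    crops = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crops.append(poster[top:top + patch, left:left + patch])
    return np.stack(crops)

# Illustrative poster: a single 160x192 RGB image learned jointly for all classes.
poster = np.random.rand(160, 192, 3).astype(np.float32)
train_images = extract_overlapping_patches(poster)
print(train_images.shape)  # (99, 32, 32, 3) with these illustrative sizes
```

In PoDD the poster itself is the object being optimized; the crops above only show how training images would be derived from it.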

Theoretical Foundation and Methodology

Rather than optimizing a set of separate images, PoDD optimizes a single poster and trains models on overlapping patches cropped from it. The method introduces two key components:

  1. PoCO (Poster Class Ordering): An algorithm to semantically organize classes within the poster, optimizing the shared pixel space for closely related classes.
  2. PoDDL (Poster Dataset Distillation Labeling): A strategy for assigning soft labels to overlapping patches, ensuring each patch carries a meaningful learning signal.

These innovations allow PoDD to maintain, and in some instances surpass, the classification accuracy achieved by traditional methods using significantly fewer pixels.
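As a rough illustration of the labeling idea behind PoDDL (not the paper's exact formulation, since PoDD's labels are learnable), the sketch below assigns each cropped patch a soft label proportional to how much of its area falls inside each class's region of the poster, assuming the classes are laid out on a grid whose ordering would come from PoCO.

```python
import numpy as np

def soft_labels_for_patch(top, left, patch, cell_h, cell_w, grid_rows, grid_cols):
    """Soft label = fraction of the patch's area inside each class's grid cell.

    Assumes class c occupies grid cell (c // grid_cols, c % grid_cols); the real
    PoDDL labels are learnable, so this area-overlap rule is only a heuristic stand-in.
    """
    label = np.zeros(grid_rows * grid_cols, dtype=np.float32)
    for r in range(grid_rows):
        for c in range(grid_cols):
            cell_top, cell_left = r * cell_h, c * cell_w
            overlap_h = max(0, min(top + patch, cell_top + cell_h) - max(top, cell_top))
            overlap_w = max(0, min(left + patch, cell_left + cell_w) - max(left, cell_left))
            label[r * grid_cols + c] = overlap_h * overlap_w
    return label / label.sum()

# Example: a 10-class poster laid out as a 2x5 grid of 32x32 class cells (illustrative).
label = soft_labels_for_patch(top=16, left=16, patch=32, cell_h=32, cell_w=32,
                              grid_rows=2, grid_cols=5)
print(label.round(2))  # probability mass split across the four cells the patch straddles
```

In the method proper, these soft labels would be optimized jointly with the poster pixels rather than fixed by an overlap rule.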

Achievements and Numerical Results

PoDD achieves comparable or superior accuracy to prior one-IPC methods while using as little as 0.3 IPC, establishing new state-of-the-art results on CIFAR-10, CIFAR-100, and CUB200. These results underscore the method's efficiency and versatility.

Theoretical Implications and Future Directions

The introduction of PoDD paves the way for a profound reevaluation of dataset distillation's possibilities. It suggests that the efficiency of distilled datasets can be significantly enhanced by sharing pixels among classes and reducing redundancy. This approach opens new avenues for research, including exploring alternative class ordering algorithms, investigating the incorporation of augmentations in distillation, and extending PoDD to support more than 1 IPC.

Concluding Remarks

PoDD's methodological innovations and demonstrated effectiveness mark a new direction in dataset distillation, one that emphasizes both efficiency and performance. By distilling datasets into a single poster with pixels shared among classes, PoDD not only achieves remarkable compression but also sets new benchmarks in classification accuracy for distilled datasets. The practical and theoretical implications of this research invite further exploration into more sustainable and efficient practices in machine learning and AI development.