Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (2404.00563v1)

Published 31 Mar 2024 in cs.CV

Abstract: Dataset distillation has emerged as a promising approach in deep learning, enabling efficient training with small synthetic datasets derived from larger real ones. In particular, distribution matching-based distillation methods attract attention thanks to their effectiveness and low computational cost. However, these methods face two primary limitations: the dispersed feature distribution within the same class in synthetic datasets, which reduces class discrimination, and an exclusive focus on mean feature consistency, which lacks precision and comprehensiveness. To address these challenges, we introduce two novel constraints: a class centralization constraint and a covariance matching constraint. The class centralization constraint aims to enhance class discrimination by clustering samples within classes more closely. The covariance matching constraint seeks to achieve more accurate feature distribution matching between real and synthetic datasets through local feature covariance matrices, which is particularly beneficial when sample sizes are much smaller than the number of features. Experiments demonstrate notable improvements with these constraints, yielding performance boosts of up to 6.6% on CIFAR10, 2.9% on SVHN, 2.5% on CIFAR100, and 2.5% on TinyImageNet, compared to relevant state-of-the-art methods. In addition, our method maintains robust performance in cross-architecture settings, with a maximum performance drop of 1.7% across four architectures. Code is available at https://github.com/VincenDen/IID.
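
The two constraints described in the abstract lend themselves to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of how a class centralization term and a covariance matching term could be added to a standard distribution-matching objective; the function names, feature shapes, and loss weights are assumptions for illustration, and the covariance term here is a simplified global variant rather than the paper's local formulation (see the linked repository for the authors' implementation).

```python
# Hypothetical sketch, not the authors' released code (see https://github.com/VincenDen/IID).
import torch
import torch.nn.functional as F

def class_centralization_loss(feat_syn: torch.Tensor) -> torch.Tensor:
    """Pull synthetic features of one class toward their mean to tighten the cluster."""
    center = feat_syn.mean(dim=0, keepdim=True)           # (1, d) class center
    return ((feat_syn - center) ** 2).sum(dim=1).mean()   # mean squared distance to center

def covariance_matching_loss(feat_real: torch.Tensor, feat_syn: torch.Tensor) -> torch.Tensor:
    """Match feature covariance matrices of real and synthetic features (simplified, global)."""
    def cov(x: torch.Tensor) -> torch.Tensor:
        x = x - x.mean(dim=0, keepdim=True)
        return x.T @ x / max(x.shape[0] - 1, 1)           # (d, d) sample covariance
    return F.mse_loss(cov(feat_syn), cov(feat_real))

# Toy usage: per-class features from some embedding network (shapes are assumptions).
feat_real = torch.randn(256, 128)                          # real-image features for one class
feat_syn = torch.randn(10, 128, requires_grad=True)        # synthetic features (10 images per class)

mean_match = F.mse_loss(feat_syn.mean(0), feat_real.mean(0))   # standard distribution-matching term
loss = (mean_match
        + 0.1 * class_centralization_loss(feat_syn)            # weights are placeholders
        + 0.1 * covariance_matching_loss(feat_real, feat_syn))
loss.backward()
```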

Authors (8)
  1. Wenxiao Deng (1 paper)
  2. Wenbin Li (117 papers)
  3. Tianyu Ding (36 papers)
  4. Lei Wang (975 papers)
  5. Hongguang Zhang (36 papers)
  6. Kuihua Huang (8 papers)
  7. Jing Huo (45 papers)
  8. Yang Gao (761 papers)
Citations (7)
