
On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training (2305.12224v2)

Published 20 May 2023 in cs.LG and stat.ML

Abstract: Pre-training datasets are critical for building state-of-the-art machine learning models, motivating rigorous study of their impact on downstream tasks. In this work, we study the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset. Empirically, we find that with the size of the pre-training dataset fixed, the best downstream performance comes from a balance between intra- and inter-class diversity. To understand the underlying mechanism, we show theoretically that the downstream performance depends monotonically on both types of diversity. Notably, our theory reveals that the optimal class-to-sample ratio (#classes / #samples per class) is invariant to the size of the pre-training dataset, which motivates an application: predicting the optimal number of pre-training classes. We demonstrate the effectiveness of this application with an improvement of around 2 points on the downstream tasks when using ImageNet as the pre-training dataset.
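The abstract's application can be sketched concretely. If the optimal class-to-sample ratio r* = C / m is invariant to the dataset size N = C × m, then a ratio estimated on a small pilot budget can be reused to predict the best split at a larger budget. The following is a minimal illustration of that arithmetic; the pilot numbers (500 classes × 200 samples per class) are made up for the example, not taken from the paper.

```python
import math

def optimal_split(total_samples: int, ratio: float) -> tuple[int, int]:
    """Given a fixed budget N and an optimal ratio r* = C / m, solve
    C * m = N and C / m = r*  =>  C = sqrt(r* * N), m = sqrt(N / r*)."""
    classes = round(math.sqrt(ratio * total_samples))
    per_class = round(total_samples / classes)
    return classes, per_class

# Hypothetical pilot study: on a budget of N = 100,000 images, the best
# downstream performance came from 500 classes x 200 samples per class,
# giving an estimated optimal ratio r* = 500 / 200 = 2.5.
r_star = 500 / 200

# The invariance result predicts the best split at an ImageNet-scale
# budget of 1.2M images without re-running the sweep at full scale.
classes, per_class = optimal_split(1_200_000, r_star)
```

At the larger budget this predicts roughly 1,732 classes with about 693 samples each, so the practitioner only has to sweep the class count at the pilot scale.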
