Benchmarking Multi-Domain Active Learning on Image Classification (2312.00364v1)

Published 1 Dec 2023 in cs.LG and cs.CV

Abstract: Active learning aims to enhance model performance by strategically labeling informative data points. While extensively studied, its effectiveness on large-scale, real-world datasets remains underexplored. Existing research primarily focuses on single-source data, ignoring the multi-domain nature of real-world data. We introduce a multi-domain active learning benchmark to bridge this gap. Our benchmark demonstrates that traditional single-domain active learning strategies are often less effective than random selection in multi-domain scenarios. We also introduce CLIP-GeoYFCC, a novel large-scale image dataset built around geographical domains, in contrast to existing genre-based domain datasets. Analysis on our benchmark shows that all multi-domain strategies exhibit significant tradeoffs, with no strategy outperforming across all datasets or all metrics, emphasizing the need for future research.

