Benchmarking Multi-Domain Active Learning on Image Classification (2312.00364v1)

Published 1 Dec 2023 in cs.LG and cs.CV

Abstract: Active learning aims to enhance model performance by strategically labeling informative data points. While extensively studied, its effectiveness on large-scale, real-world datasets remains underexplored. Existing research primarily focuses on single-source data, ignoring the multi-domain nature of real-world data. We introduce a multi-domain active learning benchmark to bridge this gap. Our benchmark demonstrates that traditional single-domain active learning strategies are often less effective than random selection in multi-domain scenarios. We also introduce CLIP-GeoYFCC, a novel large-scale image dataset built around geographical domains, in contrast to existing genre-based domain datasets. Analysis on our benchmark shows that all multi-domain strategies exhibit significant tradeoffs, with no strategy outperforming across all datasets or all metrics, emphasizing the need for future research.
