
Plug and Play Active Learning for Object Detection (2211.11612v2)

Published 21 Nov 2022 in cs.CV and cs.LG

Abstract: Annotating datasets for object detection is an expensive and time-consuming endeavor. To minimize this burden, active learning (AL) techniques are employed to select the most informative samples for annotation within a constrained "annotation budget". Traditional AL strategies typically rely on model uncertainty or sample diversity for query sampling, while more advanced methods have focused on developing AL-specific object detector architectures to enhance performance. However, these specialized approaches are not readily adaptable to different object detectors due to the significant engineering effort required for integration. To overcome this challenge, we introduce Plug and Play Active Learning (PPAL), a simple and effective AL strategy for object detection. PPAL is a two-stage method comprising uncertainty-based and diversity-based sampling phases. In the first stage, our Difficulty Calibrated Uncertainty Sampling leverages a category-wise difficulty coefficient that combines both classification and localisation difficulties to re-weight instance uncertainties, from which we sample a candidate pool for the subsequent diversity-based sampling. In the second stage, we propose Category Conditioned Matching Similarity to better compute the similarities of multi-instance images as ensembles of their instance similarities, which is used by the k-Means++ algorithm to sample the final AL queries. PPAL makes no change to model architectures or detector training pipelines; hence it can be easily generalized to different object detectors. We benchmark PPAL on the MS-COCO and Pascal VOC datasets using different detector architectures and show that our method outperforms prior work by a large margin. Code is available at https://github.com/ChenhongyiYang/PPAL

Authors (3)
  1. Chenhongyi Yang (14 papers)
  2. Lichao Huang (28 papers)
  3. Elliot J. Crowley (27 papers)
Citations (10)

Summary

Insightful Overview of "Plug and Play Active Learning for Object Detection"

The paper "Plug and Play Active Learning for Object Detection" addresses the pervasive challenge of annotating datasets for object detection, which are both costly and labor-intensive. The authors tackle this issue by introducing an innovative active learning (AL) strategy called Plug and Play Active Learning (PPAL), designed to be broadly applicable across various object detection architectures without necessitating changes to model architectures or training pipelines.

PPAL is characterized by a two-stage process that combines uncertainty-based and diversity-based sampling. The first stage introduces an enhanced form of uncertainty sampling, termed Difficulty Calibrated Uncertainty Sampling (DCUS). It incorporates a category-wise difficulty coefficient that folds both classification and localization difficulty into a single uncertainty assessment. This accommodates the complexity inherent in object detection and yields more balanced sampling across categories by shifting attention towards the more challenging ones, which in turn improves average precision (AP).
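The following is a minimal sketch of the re-weighting idea behind DCUS. The function names, the simple averaging used to combine classification and localization difficulty, and the top-k candidate-pool selection are illustrative assumptions rather than the paper's exact formulation; the released repository linked in the abstract is authoritative.

```python
import numpy as np

def category_difficulty(cls_difficulty, loc_difficulty):
    # Combine per-category classification and localization difficulty into a
    # single coefficient. A plain average is used here purely for illustration.
    return 0.5 * (cls_difficulty + loc_difficulty)

def dcus_image_scores(detections, difficulty):
    """Score each unlabeled image by summing its instance uncertainties,
    each re-weighted by the difficulty coefficient of its predicted category.

    detections: list (one entry per image) of lists of
                (category_id, uncertainty) tuples from the current detector.
    difficulty: dict mapping category_id -> difficulty coefficient.
    """
    return np.array([
        sum(difficulty[c] * u for c, u in instances)
        for instances in detections
    ])

def select_candidate_pool(scores, pool_size):
    # Keep the highest-scoring images as the candidate pool that is handed to
    # the diversity-based second stage.
    return np.argsort(-scores)[:pool_size]
```

In an AL round, image scores would be computed over the whole unlabeled set and the top-scoring images forwarded to the diversity stage described next; the pool is deliberately larger than the final annotation budget.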

In the second stage, diversity-based sampling is redefined using the proposed Category Conditioned Matching Similarity (CCMS). By computing the similarity between multi-instance images as an ensemble of their instance-to-instance similarities, this measure better captures how diverse a batch of detection images actually is. The authors then employ a k-Means++-based selection driven by CCMS to pick a representative set of images for annotation, maximizing the information gained from each annotated batch.
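Below is a hedged sketch of how CCMS and the subsequent selection might look. The best-match-per-instance rule, the mean aggregation, and the greedy farthest-first loop (used here as a deterministic stand-in for the paper's k-Means++-style sampling) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ccms(instances_a, instances_b):
    """Category Conditioned Matching Similarity between two images (sketch).

    instances_*: list of (category_id, feature_vector) pairs, with
    L2-normalized feature vectors taken from the detector. Each instance in
    image A is matched only against instances of the same category in image B;
    its best match contributes to an averaged image-level similarity.
    """
    sims = []
    for cat_a, feat_a in instances_a:
        same_cat = [feat_b for cat_b, feat_b in instances_b if cat_b == cat_a]
        sims.append(max(float(feat_a @ f) for f in same_cat) if same_cat else 0.0)
    return float(np.mean(sims)) if sims else 0.0

def farthest_first_select(pool, budget):
    """Greedy farthest-first selection over 1 - CCMS distances: a deterministic
    stand-in for the k-Means++-based query sampling described in the paper."""
    selected = [0]  # start from an arbitrary image in the candidate pool
    while len(selected) < budget:
        best_i, best_d = None, -1.0
        for i in range(len(pool)):
            if i in selected:
                continue
            # Distance to the already-selected set is the smallest distance
            # to any selected image; pick the image that is farthest overall.
            d = min(1.0 - ccms(pool[i], pool[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return selected
```

Restricting matches to instances of the same category is what makes the similarity "category conditioned": two images only look similar if they contain similar-looking objects of the same classes, which keeps the selected queries diverse at the object level rather than just the image level.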

Benchmark results on the MS-COCO and Pascal VOC datasets underline PPAL's effectiveness: the method improves substantially over both traditional and recent AL strategies for object detection. Notably, PPAL maintains robust performance across datasets and architectures, demonstrating its adaptability. The advantage over competing methods is most pronounced in the early active learning rounds, where the information gained from each additional annotation matters most.

The paper reinforces its findings across detector architectures and settings, including RetinaNet and Faster R-CNN, validating the flexibility and generalization capacity of PPAL. The authors also extend their experiments to a semi-supervised setup with a semi-supervised detector, further emphasizing PPAL's applicability across learning paradigms.

This research contributes to the active learning domain by providing a versatile and effective method for object detection that mitigates the prohibitive costs of data annotation. The introduction of DCUS and CCMS not only improves AL performance but also sharpens the understanding of how uncertainty and diversity interact during query selection. Future work could explore tuning the hyperparameters within DCUS and CCMS or integrating more nuanced difficulty assessments to further refine the selection process.

In essence, this paper provides a significant step towards efficient active learning strategies in object detection, addressing both theoretical and practical implications pertinent to AI and machine learning research communities. The broad applicability and effectiveness of PPAL mark a valuable addition to the toolkit available to researchers and practitioners facing the challenges of data annotation in object detection.
