Better (pseudo-)labels for semi-supervised instance segmentation (2403.11675v1)
Abstract: Despite the availability of large datasets for tasks like image classification and image-text alignment, labeled data for more complex recognition tasks, such as detection and segmentation, is less abundant. For instance segmentation in particular, annotations are time-consuming to produce, and the distribution of instances is often highly skewed across classes. While semi-supervised teacher-student distillation methods show promise in leveraging vast amounts of unlabeled data, they suffer from miscalibration, resulting in overconfidence in frequently represented classes and underconfidence in rarer ones. Additionally, these methods encounter difficulties in efficiently learning from a limited set of examples. First, we introduce a dual strategy to enhance the teacher model's training process, substantially improving performance on few-shot learning. Second, we propose a calibration correction mechanism that enables the student model to correct the teacher's calibration errors. Using our approach, we observe marked improvements over a state-of-the-art supervised baseline on the LVIS dataset, with an increase of 2.8% in average precision (AP) and a 10.3% gain in AP for rare classes.
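The miscalibration mentioned in the abstract (confidence that is too high for frequent classes and too low for rare ones) can be made concrete with the expected calibration error (ECE), a common binned measure of the gap between confidence and accuracy. The sketch below is illustrative only: the abstract does not state which calibration metric the authors use, and the function name `expected_calibration_error`, the equal-width binning, and the toy inputs are all assumptions.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |confidence - accuracy| over bins.

    Hypothetical helper for illustration; not taken from the paper.
    confidences: predicted scores in [0, 1], one per prediction.
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bins over [0, 1]; only interior edges are needed to assign bins.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between mean confidence and empirical accuracy in this bin,
            # weighted by the fraction of predictions that fall in the bin.
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece


# Toy example (made-up numbers): an overconfident "frequent" class and an
# underconfident "rare" class, evaluated separately.
frequent_ece = expected_calibration_error([0.95] * 4, [1, 1, 0, 0])
rare_ece = expected_calibration_error([0.25] * 4, [1, 1, 1, 0])
```

Computing the error per class, as in the toy example, is one way to expose a class-dependent skew that a single aggregate score over all classes could hide.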
- François Porcher
- Camille Couprie
- Marc Szafraniec
- Jakob Verbeek