
Human-in-the-Loop Segmentation of Multi-species Coral Imagery (2404.09406v3)

Published 15 Apr 2024 in cs.CV, cs.HC, cs.LG, and cs.RO

Abstract: Marine surveys by robotic underwater and surface vehicles produce substantial quantities of coral reef imagery; however, labeling these images is expensive and time-consuming for domain experts. Point label propagation is a technique that uses existing images labeled with sparse points to create augmented ground truth data, which can then be used to train a semantic segmentation model. In this work, we show that recent advances in large foundation models facilitate the creation of augmented ground truth masks using only features extracted by the denoised version of the DINOv2 foundation model and K-Nearest Neighbors (KNN), without any pre-training. For images with extremely sparse labels, we present a labeling method based on human-in-the-loop principles, which greatly enhances annotation efficiency. With 5 point labels per image, our human-in-the-loop method outperforms the prior state-of-the-art by 14.2% for pixel accuracy and 19.7% for mIoU; with 10 point labels, the improvements are 8.9% and 18.3%, respectively. When human-in-the-loop labeling is not available, using the denoised DINOv2 features with a KNN still improves on the prior state-of-the-art by 2.7% for pixel accuracy and 5.8% for mIoU (5 grid points). On the semantic segmentation task, we outperform the prior state-of-the-art by 8.8% for pixel accuracy and by 13.5% for mIoU when only 5 point labels are used for point label propagation. Additionally, we perform a comprehensive study of how the point label placement style and the number of points affect point label propagation quality, and make several recommendations for improving the efficiency of labeling images with points.
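The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`propagate_point_labels`, `pixel_accuracy`, `mean_iou`) are hypothetical, the choice of cosine similarity is an assumption, and the tiny hand-written 2-D vectors stand in for the per-pixel features that the paper obtains from the denoised DINOv2 model. Each unlabeled pixel takes the majority class among its k most similar labeled points, and the resulting mask is scored with the two metrics the abstract reports.

```python
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_point_labels(features, points, k=1):
    """Build a dense mask from sparse point labels with a KNN vote.

    features: H x W grid (nested lists) of per-pixel feature vectors.
    points:   {(row, col): class_id} for the sparsely labeled pixels.
    """
    labeled = [(features[r][c], cls) for (r, c), cls in points.items()]
    mask = []
    for row in features:
        mask_row = []
        for vec in row:
            # Rank labeled points by similarity to this pixel, keep the top k.
            nearest = sorted(labeled, key=lambda item: cosine(vec, item[0]),
                             reverse=True)[:k]
            votes = Counter(cls for _, cls in nearest)
            mask_row.append(votes.most_common(1)[0][0])
        mask.append(mask_row)
    return mask


def _pairs(pred, truth):
    return [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pairs = _pairs(pred, truth)
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, truth):
    """Intersection-over-union averaged over every class that appears."""
    pairs = _pairs(pred, truth)
    classes = {p for p, _ in pairs} | {t for _, t in pairs}
    ious = []
    for c in classes:
        inter = sum(p == c and t == c for p, t in pairs)
        union = sum(p == c or t == c for p, t in pairs)
        ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2 x 3 "image": made-up 2-D features, two labeled points (classes 0 and 1).
features = [
    [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9]],
    [[0.8, 0.2], [0.2, 0.8], [0.0, 1.0]],
]
points = {(0, 0): 0, (1, 2): 1}
truth = [[0, 0, 1], [0, 0, 1]]  # hypothetical ground truth for scoring

mask = propagate_point_labels(features, points, k=1)
print(mask)
print(pixel_accuracy(mask, truth), mean_iou(mask, truth))
```

In a human-in-the-loop setting, a mask like this would be shown to the annotator, who adds a point where the propagation looks wrong; the sketch above covers only the propagation and scoring steps, not that interaction.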

