Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery Videos (2401.02791v2)
Abstract: Surgical tool detection is essential for analyzing and evaluating minimally invasive surgery videos. Current approaches are mostly based on supervised methods that require large amounts of fully annotated instance-level labels (i.e., bounding boxes). However, large image datasets with instance-level labels are scarce because of the annotation burden. Detecting surgical tools from image-level labels is therefore an important problem, since image-level annotations are considerably more time-efficient to produce than instance-level annotations. In this work, we propose a weakly semi-supervised approach to tool detection that strikes a balance between the extremely costly annotation burden and detection performance. We further propose a co-occurrence loss, which leverages image-level labels by exploiting the fact that some tool pairs often co-occur in an image. Encoding this co-occurrence knowledge in the loss helps to overcome the classification difficulty that arises because some tools have similar shapes and textures. Extensive experiments conducted on the Endovis2018 dataset in various data settings show the effectiveness of our method.
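To make the co-occurrence idea concrete, the sketch below counts how often pairs of tool classes appear together in image-level labels and turns the counts into conditional frequencies. It is only an illustration of the statistic that such a loss could draw on, not the loss formulated in the paper; the function names, the class ids, and the toy labels are all hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(image_labels, num_classes):
    """Count how often each pair of tool classes appears in the same image."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for labels in image_labels:
        present = sorted(set(labels))  # image-level labels: presence only
        for c in present:
            counts[c][c] += 1  # diagonal: number of images containing class c
        for a, b in combinations(present, 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def conditional_cooccurrence(counts):
    """Row-normalise so that entry [a][b] estimates P(b present | a present)."""
    probs = []
    for a, row in enumerate(counts):
        total = row[a]  # images in which class a appears
        probs.append([(v / total if total else 0.0) for v in row])
    return probs


# Hypothetical image-level labels: each inner list holds the class ids in one frame.
image_labels = [[0, 1], [0, 1], [0, 2], [1], [0, 1, 2]]
counts = cooccurrence_matrix(image_labels, num_classes=3)
probs = conditional_cooccurrence(counts)
```

In this toy data, classes 0 and 1 appear together in three of the four frames that contain class 0, so `probs[0][1]` is 0.75. A prior of this kind could help separate visually similar tools, which is the motivation the abstract gives for the co-occurrence loss.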