Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision (2311.14758v2)
Abstract: With the rapidly increasing demand for oriented object detection (OOD), recent research on weakly-supervised detectors that learn rotated boxes (RBox) from horizontal boxes (HBox) has attracted increasing attention. In this paper, we explore a more challenging yet label-efficient setting, namely single point-supervised OOD, and present our approach, called Point2RBox. Specifically, we propose to leverage two principles: 1) Synthetic pattern knowledge combination: By sampling around each labeled point on the image, we spread the object feature to synthetic visual patterns with known boxes to provide the knowledge for box regression. 2) Transform self-supervision: With a transformed input image (e.g. scaled/rotated), the output RBoxes are trained to follow the same transformation so that the network can perceive the relative size/rotation between objects. The detector is further enhanced by a few devised techniques to cope with peripheral issues, e.g. the anchor/layer assignment, as the size of the object is not available in our point supervision setting. To the best of our knowledge, Point2RBox is the first end-to-end solution for point-supervised OOD. In particular, our method uses a lightweight paradigm, yet it achieves competitive performance among point-supervised alternatives: 41.05%/27.62%/80.01% on the DOTA/DIOR/HRSC datasets.
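The transform self-supervision principle can be illustrated with a minimal sketch. It is not the paper's loss or code: the `(cx, cy, w, h, theta)` box layout, the function names `transform_rbox`, `angle_diff` and `consistency_loss`, the log-ratio size term, and the rotation sign convention are all assumptions made for illustration. The idea shown is only that a box predicted on the transformed image should match the original prediction after the same scale/rotation is applied to it.

```python
import math


def transform_rbox(rbox, angle=0.0, scale=1.0, center=(0.0, 0.0)):
    """Rotate (radians) and isotropically scale an RBox about `center`.

    `rbox` is a hypothetical (cx, cy, w, h, theta) tuple; the sign
    convention of the rotation is an assumption of this sketch.
    """
    cx, cy, w, h, theta = rbox
    dx, dy = cx - center[0], cy - center[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    new_cx = center[0] + scale * (cos_a * dx - sin_a * dy)
    new_cy = center[1] + scale * (sin_a * dx + cos_a * dy)
    return (new_cx, new_cy, scale * w, scale * h, theta + angle)


def angle_diff(a, b):
    """Smallest gap between two box orientations (period pi)."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)


def consistency_loss(pred_orig, pred_trans, angle=0.0, scale=1.0,
                     center=(0.0, 0.0)):
    """Penalize size/rotation disagreement between two views.

    `pred_orig` is the box predicted on the original image and
    `pred_trans` the box predicted on the transformed image.
    Only relative size and rotation are compared, mirroring the
    abstract's wording; the exact terms are illustrative.
    """
    expected = transform_rbox(pred_orig, angle, scale, center)
    size_term = (abs(math.log(pred_trans[2] / expected[2]))
                 + abs(math.log(pred_trans[3] / expected[3])))
    return size_term + angle_diff(pred_trans[4], expected[4])


# Usage: a box predicted on the original view, and two candidate
# predictions on a view rotated by 90 degrees and scaled by 2.
box = (10.0, 0.0, 4.0, 2.0, 0.0)
consistent = transform_rbox(box, angle=math.pi / 2, scale=2.0)
inconsistent = (0.0, 20.0, 8.0, 4.0, 0.0)  # size follows, angle does not
loss_ok = consistency_loss(box, consistent, math.pi / 2, 2.0)
loss_bad = consistency_loss(box, inconsistent, math.pi / 2, 2.0)
```

In this sketch the consistent prediction incurs essentially no penalty, while a prediction that ignores the rotation is penalized by the angular gap, which is the kind of signal that lets a network learn relative rotation without angle labels.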