
Edge Wasserstein Distance Loss for Oriented Object Detection (2312.07048v1)

Published 12 Dec 2023 in cs.CV

Abstract: Regression loss design is an essential topic in oriented object detection. Because of the periodicity of the angle and the ambiguity in the definition of width and height, the traditional L1-distance loss and its variants suffer from metric discontinuity and the square-like problem. Distribution-based methods offer significant advantages here by representing oriented boxes as distributions. Rather than using the Gaussian distribution to obtain an analytical form of the distance measure, we propose a novel oriented regression loss, the Edge Wasserstein Distance (EWD) loss, to alleviate the square-like problem. Specifically, for the oriented box (OBox) representation, we choose a specially designed distribution whose probability density function is nonzero only over the edges. On this basis, we use the Wasserstein distance as the measure. In addition, because it builds on the edge representation of the OBox, the EWD loss can be generalized to quadrilateral and polynomial regression scenarios. Experiments on multiple popular datasets and different detectors show the effectiveness of the proposed method.
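
The square-like problem mentioned in the abstract can be illustrated with a small sketch. This is not the paper's EWD formulation, which the abstract does not spell out. It assumes the common Gaussian box model with covariance R diag(w²/4, h²/4) Rᵀ, and it swaps in a one-dimensional projected Wasserstein-1 distance over evenly spaced edge points as a stand-in for a distance between edge-supported distributions. All function names and the `per_edge` parameter are hypothetical.

```python
import math

def box_to_gaussian_cov(w, h, theta):
    # Assumed Gaussian box model: Sigma = R diag(w^2/4, h^2/4) R^T.
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    off = (a - b) * c * s
    return ((a * c * c + b * s * s, off), (off, a * s * s + b * c * c))

def box_corners(cx, cy, w, h, theta):
    # Corners of an oriented box, listed in order around the boundary.
    c, s = math.cos(theta), math.sin(theta)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def sample_edges(corners, per_edge=8):
    # Evenly spaced points on every edge: a discrete stand-in for a
    # distribution whose density is nonzero only over the edges.
    pts = []
    n = len(corners)
    for i in range(n):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % n]
        for k in range(per_edge):
            t = k / per_edge
            pts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return pts

def projected_w1(pts_a, pts_b, direction=(1.0, 0.0)):
    # Exact 1D Wasserstein-1 distance between the two point sets after
    # projecting them onto `direction` (equal-size, uniformly weighted sets).
    ux, uy = direction
    pa = sorted(x * ux + y * uy for x, y in pts_a)
    pb = sorted(x * ux + y * uy for x, y in pts_b)
    return sum(abs(p - q) for p, q in zip(pa, pb)) / len(pa)

# A square at 0 and at 30 degrees: same Gaussian, different edge points.
cov_0 = box_to_gaussian_cov(4.0, 4.0, 0.0)
cov_30 = box_to_gaussian_cov(4.0, 4.0, math.pi / 6)
edges_0 = sample_edges(box_corners(0.0, 0.0, 4.0, 4.0, 0.0))
edges_30 = sample_edges(box_corners(0.0, 0.0, 4.0, 4.0, math.pi / 6))
print("Gaussian covariances:", cov_0, cov_30)
print("Projected W1 between edge samples:", projected_w1(edges_0, edges_30))
```

For a square the two covariances coincide up to rounding, so any distance computed from the Gaussians alone cannot see the rotation, while the edge samples still differ. A single projection is used only to keep the sketch short; it is not a full two-dimensional Wasserstein distance.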

