Wasserstein Distance-based Expansion of Low-Density Latent Regions for Unknown Class Detection (2401.05594v3)

Published 10 Jan 2024 in cs.CV

Abstract: This paper addresses a significant challenge in open-set object detection (OSOD): the tendency of state-of-the-art detectors to misclassify unknown objects as known categories with high confidence. We present an approach that identifies unknown objects by distinguishing between high- and low-density regions in latent space. Our method builds upon the Open-Det (OD) framework and introduces two new elements to the loss function, which tighten the clustering of the known embedding space and expand the low-density regions of the unknown space. The first is the Class Wasserstein Anchor (CWA), a new function that refines the classification boundaries. The second is a spectral normalisation step that improves the robustness of the model. Together, these additions to the existing Contrastive Feature Learner (CFL) and Unknown Probability Learner (UPL) loss functions improve OSOD performance. Our proposed OpenDet-CWA (OD-CWA) method demonstrates: a) a reduction in open-set errors of approximately 17%-22%, b) an improvement in novelty detection capability of 1.5%-16%, and c) a decrease in the wilderness index of 2%-20% across various open-set scenarios. These results show the potential of the approach for handling the complexities of open-set object detection.
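
As background for the Wasserstein distance that the CWA term is named after, the following is a minimal sketch of the Wasserstein-1 distance between two equal-size one-dimensional empirical samples. It is not the paper's CWA loss, whose exact form the abstract does not give. The function name `wasserstein_1d` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    Illustrative only: this is the textbook 1-D case, not the CWA loss
    described in the abstract.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension, the optimal transport plan pairs the sorted samples
    # in order, so the distance is the mean absolute gap between the pairs.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


if __name__ == "__main__":
    anchor = [0.0, 1.0, 2.0]  # hypothetical samples around a class anchor
    near = [0.5, 1.5, 2.5]    # hypothetical embedding close to the anchor
    far = [5.0, 6.0, 7.0]     # hypothetical embedding far from the anchor
    print(wasserstein_1d(anchor, near))
    print(wasserstein_1d(anchor, far))
```

A sample lying close to the anchor yields a smaller distance than one lying far away, which is the kind of signal a distance-based clustering term can exploit. Higher-dimensional embeddings need a general optimal-transport solver rather than this sorting shortcut.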
