Evaluation of Environmental Conditions on Object Detection using Oriented Bounding Boxes for AR Applications (2306.16798v2)
Abstract: Augmented reality (AR) adds digital content to natural images and videos to create an interactive experience between the user and the environment. Scene analysis and object recognition play a crucial role in AR and must be performed quickly and accurately. This study proposes an approach that uses oriented bounding boxes with a deep detection and recognition network to improve performance and processing time. The approach is evaluated on two datasets: a real image dataset (DOTA) commonly used for computer vision tasks, and a synthetic dataset that simulates different environmental, lighting, and acquisition conditions. The evaluation focuses on small objects, which are difficult to detect and recognise. The results indicate that the proposed approach tends to produce better Average Precision and greater accuracy for small objects in most of the tested conditions.
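To make the idea of an oriented bounding box concrete, the following minimal sketch compares the area of a rotated box with the area of the axis-aligned box that would be needed to enclose it. The (centre, width, height, angle) parameterisation and the function names `obb_corners`, `enclosing_aabb`, and `polygon_area` are assumptions chosen for illustration; the abstract does not specify how the proposed network encodes its boxes.

```python
import math


def obb_corners(cx, cy, w, h, theta):
    """Corners of a w-by-h box centred at (cx, cy), rotated by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each half-extent offset, then translate to the centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def enclosing_aabb(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(corners):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(corners):
        x2, y2 = corners[(i + 1) % len(corners)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical elongated object rotated by 45 degrees (illustrative values only).
corners = obb_corners(50.0, 50.0, 40.0, 10.0, math.radians(45))
x0, y0, x1, y1 = enclosing_aabb(corners)
obb_area = polygon_area(corners)
aabb_area = (x1 - x0) * (y1 - y0)
print(f"oriented box area: {obb_area:.1f}, axis-aligned box area: {aabb_area:.1f}")
```

For an elongated object at an angle, the axis-aligned box covers considerably more background than the oriented one. This is one plausible reason an oriented representation could help with small objects, although the abstract itself does not state the mechanism.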