A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video (2403.03461v1)
Abstract: Dense object counting, or crowd counting, has advanced considerably thanks to recent developments in the vision community. However, indiscernible object counting, which aims to count targets that blend into their surroundings, remains a challenge. Moreover, the publicly available datasets for this task are predominantly image-based, leaving video-based counting underexplored. We therefore propose YoutubeFish-35, a large-scale dataset comprising 35 sequences of high-definition, high-frame-rate video with more than 150,000 annotated center points across a curated variety of scenes. For benchmarking, we select three mainstream dense object counting methods and carefully evaluate them on the newly collected dataset. We further propose TransVidCount, a strong new baseline that combines density and regression branches along the temporal domain in a unified framework, effectively tackling indiscernible object counting and achieving state-of-the-art performance on the YoutubeFish-35 dataset.
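To make the "density and regression branches along the temporal domain" idea concrete, below is a minimal PyTorch sketch of one plausible way such a design could be wired up: a per-frame density map gates the backbone features, a temporal attention layer shares information across frames, and a regression head predicts a per-frame count. Every module name, shape, and hyperparameter here (`DensityGuidedTemporalAttention`, `TransVidCountSketch`, `dim=256`, one token per frame) is an assumption for illustration, not the paper's published implementation.

```python
# Illustrative sketch only; the actual TransVidCount architecture is not
# specified in this abstract, so all names and shapes below are assumptions.
import torch
import torch.nn as nn

class DensityGuidedTemporalAttention(nn.Module):
    """Attend across T frames, with features weighted by a predicted density map."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.density_head = nn.Conv2d(dim, 1, kernel_size=1)  # per-frame density map
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats):
        # feats: (B, T, C, H, W) per-frame backbone features
        B, T, C, H, W = feats.shape
        density = torch.sigmoid(self.density_head(feats.flatten(0, 1)))  # (B*T, 1, H, W)
        guided = feats.flatten(0, 1) * density          # emphasize dense regions
        tokens = guided.flatten(2).mean(-1).view(B, T, C)  # one pooled token per frame
        out, _ = self.attn(tokens, tokens, tokens)      # temporal self-attention
        return self.norm(out + tokens), density.view(B, T, 1, H, W)

class TransVidCountSketch(nn.Module):
    """Toy two-branch head: density estimation plus count regression."""
    def __init__(self, dim=256):
        super().__init__()
        self.temporal = DensityGuidedTemporalAttention(dim)
        self.count_head = nn.Linear(dim, 1)  # regression branch: scalar count per frame

    def forward(self, feats):
        tokens, density = self.temporal(feats)
        counts = self.count_head(tokens).squeeze(-1)  # (B, T) predicted counts
        return counts, density

# Usage: feats would come from a shared image backbone applied to each frame.
feats = torch.randn(2, 4, 256, 32, 32)  # (batch, frames, channels, H, W)
model = TransVidCountSketch()
counts, density = model(feats)
print(counts.shape, density.shape)  # torch.Size([2, 4]) torch.Size([2, 4, 1, 32, 32])
```

The design choice worth noting in this sketch is that the density branch serves two roles: it is a supervisable output in its own right and it acts as a spatial attention prior for the temporal branch, which is one common way to couple density and regression objectives in a single framework.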