
Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM (2402.17514v2)

Published 27 Feb 2024 in cs.CV

Abstract: Existing crowd counting models require extensive training data, which is time-consuming to annotate. To tackle this issue, we propose a simple yet effective crowd counting method that uses the Segment-Everything-Everywhere Model (SEEM), an adaptation of the Segment Anything Model (SAM), to generate pseudo-labels for training crowd counting models. However, our initial investigation reveals that SEEM's performance in dense crowd scenes is limited, primarily because it misses many people in high-density areas. To overcome this limitation, we propose an adaptive resolution SEEM to handle the scale variations, occlusions, and overlapping of people within crowd scenes. Alongside this, we introduce a robust localization method, based on Gaussian Mixture Models, for predicting the head positions in the predicted people masks. Given the mask and point pseudo-labels, we propose a robust loss function that excludes uncertain regions based on SEEM's predictions, thereby improving the training of the counting networks. Finally, we propose an iterative method for generating pseudo-labels. This method aims to improve the quality of the segmentation masks by identifying more of the tiny persons in high-density regions, who are often missed in the first pseudo-labeling stage. Overall, our proposed method achieves the best unsupervised performance in crowd counting, with results comparable to those of some supervised methods. This makes it an effective and versatile tool for crowd counting, especially in situations where labeled data is not available.
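
The abstract does not give the form of the robust loss, so the following is only a minimal sketch of the general idea of excluding uncertain regions from training. It assumes a per-pixel squared error and a boolean uncertainty mask; the function name `masked_mse` and the toy arrays are hypothetical and are not taken from the paper.

```python
def masked_mse(pred, target, uncertain):
    """Mean squared error over pixels that are NOT flagged as uncertain.

    pred, target: 2D lists of floats (e.g. predicted and pseudo-label maps).
    uncertain:    2D list of bools; True marks a pixel to exclude.
    Assumption: the choice of squared error is illustrative only.
    """
    total = 0.0
    kept = 0
    for p_row, t_row, u_row in zip(pred, target, uncertain):
        for p, t, u in zip(p_row, t_row, u_row):
            if u:
                continue  # skip pixels whose pseudo-label is not trusted
            total += (p - t) ** 2
            kept += 1
    # If every pixel is uncertain there is nothing to supervise.
    return total / kept if kept else 0.0


# Toy 2x2 example: the bottom-left pixel is marked uncertain and ignored.
pred = [[0.9, 0.2], [0.4, 0.0]]
target = [[1.0, 0.0], [1.0, 0.0]]
uncertain = [[False, False], [True, False]]
loss = masked_mse(pred, target, uncertain)
```

In this toy case the large error at the uncertain pixel does not contribute, so the loss is averaged over the three trusted pixels only.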

