Overview of JHU-CROWD++: Large-Scale Crowd Counting Dataset and Benchmark Method
This paper presents JHU-CROWD++, a large-scale crowd counting dataset, together with a benchmark method, the Confidence Guided Deep Residual Crowd Counting Network (CG-DRCN). The work is motivated by the limitations of existing datasets: too few training samples, the absence of adverse conditions such as weather-related degradations, dataset bias, and limited annotations. These limitations hinder the development of robust crowd counting models for real-world applications. By addressing them, JHU-CROWD++ provides a comprehensive framework for advancing the state of crowd counting research.
Dataset Characteristics and Novel Contributions
JHU-CROWD++ contains 4,372 images with over 1.5 million annotations, spanning a diverse range of crowd scenarios and weather conditions. The annotations go beyond head locations to include attributes such as occlusion level, blur level, and head size, enriching the supervision available for model training and evaluation.
Significantly, JHU-CROWD++ introduces images collected under various weather conditions (rain, snow, fog) and distractor images to mitigate dataset bias. These features enable the development of models that can generalize better to real-world conditions.
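To make the annotation structure concrete, the sketch below shows how per-head annotations could be converted into a ground-truth density map using Gaussian kernels, a common practice in crowd counting. The field names (`x`, `y`, `occlusion`, `blur`, `size`) are hypothetical placeholders, since this summary does not specify the dataset's exact file format.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map_from_heads(heads, height, width, sigma=4.0):
    """Build a crowd density map from per-head annotations.

    `heads` is a list of dicts with hypothetical fields such as
    {"x": ..., "y": ..., "occlusion": ..., "blur": ..., "size": ...};
    only the (x, y) head locations are needed for the density map.
    The map sums (approximately) to the number of annotated heads;
    mass near image borders may be truncated.
    """
    density = np.zeros((height, width), dtype=np.float32)
    for h in heads:
        x, y = int(round(h["x"])), int(round(h["y"]))
        if 0 <= y < height and 0 <= x < width:
            density[y, x] += 1.0
    # Spread each head impulse with a Gaussian kernel.
    return gaussian_filter(density, sigma=sigma, mode="constant")

# Example: three annotated heads in a 480x640 image.
heads = [
    {"x": 100.5, "y": 200.0, "occlusion": 1, "blur": 0, "size": 12},
    {"x": 320.0, "y": 240.0, "occlusion": 0, "blur": 1, "size": 20},
    {"x": 500.2, "y": 100.7, "occlusion": 2, "blur": 0, "size": 8},
]
dmap = density_map_from_heads(heads, height=480, width=640)
print(dmap.sum())  # ~3.0, matching the annotated count
```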
Benchmark Method: Confidence Guided Deep Residual Crowd Counting Network (CG-DRCN)
The proposed CG-DRCN is a residual learning framework built on a VGG16-based backbone (a ResNet101 variant is also evaluated). It integrates an uncertainty-based confidence weighting mechanism to guide the progressive refinement of crowd density maps: residuals estimated at different layers of the network refine the density maps in a multi-scale fashion. Key elements of CG-DRCN include:
- Residual Learning: Convolutional blocks estimate residual maps that refine the density maps at different resolutions, correcting local errors.
- Uncertainty Guidance: A confidence estimation module gates the flow of residual information, so that only high-confidence residuals influence the refinement process.
- Class-Conditioning: Image-level labels such as weather conditions condition the residual estimation, improving performance particularly under adverse conditions.
Overall, this staged refinement yields more accurate density maps and improved counting accuracy on challenging datasets; a minimal sketch of the confidence-guided refinement idea follows.
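The sketch below illustrates the confidence-guided residual refinement idea in PyTorch. It is not the authors' CG-DRCN implementation: the module names, channel sizes, and layer configuration are illustrative assumptions that only show how a learned per-pixel confidence can gate a residual correction applied to an upsampled coarse density map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidenceGuidedRefinement(nn.Module):
    """Minimal sketch of confidence-guided residual refinement.

    A coarse density map is upsampled and corrected by a residual map,
    with a learned per-pixel confidence gating how much of the residual
    is applied. This is an illustration, not the paper's architecture.
    """

    def __init__(self, feat_channels=256):
        super().__init__()
        # Branch estimating a residual correction from intermediate features.
        self.residual_head = nn.Sequential(
            nn.Conv2d(feat_channels, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, 1),
        )
        # Branch estimating per-pixel confidence in [0, 1] for the residual.
        self.confidence_head = nn.Sequential(
            nn.Conv2d(feat_channels, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, 1), nn.Sigmoid(),
        )

    def forward(self, coarse_density, features):
        # Upsample the coarse prediction to the resolution of `features`.
        up = F.interpolate(coarse_density, size=features.shape[-2:],
                           mode="bilinear", align_corners=False)
        residual = self.residual_head(features)
        confidence = self.confidence_head(features)
        # Only high-confidence residuals influence the refined map.
        return up + confidence * residual

# Usage with dummy tensors (batch of 2, 256-channel finer-scale features).
refine = ConfidenceGuidedRefinement(feat_channels=256)
coarse = torch.rand(2, 1, 30, 40)    # coarse density map
feats = torch.rand(2, 256, 60, 80)   # finer-scale backbone features
refined = refine(coarse, feats)
print(refined.shape)  # torch.Size([2, 1, 60, 80])
```

In CG-DRCN this kind of refinement is applied progressively across scales, so that each stage corrects the errors left by the previous, coarser prediction.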
Experimental Results and Impact
CG-DRCN demonstrates its efficacy on the JHU-CROWD++ dataset, reducing counting error substantially compared to existing methods. Benchmark comparisons show that CG-DRCN, particularly with the ResNet101 (Res101) backbone, achieves the lowest overall mean absolute error (MAE) and mean squared error (MSE) on both the validation and test sets of JHU-CROWD++. These results establish CG-DRCN as a competitive approach for contemporary crowd counting challenges.
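For reference, the two metrics can be computed as follows. Note that, as is conventional in the crowd counting literature, the reported "MSE" is typically the root of the mean squared count error, which this sketch assumes; the example counts are hypothetical.

```python
import numpy as np

def counting_errors(pred_counts, gt_counts):
    """Standard crowd counting metrics over per-image counts.

    MAE is the mean absolute count error; "MSE" is computed here as the
    root of the mean squared error, following common crowd counting
    convention.
    """
    pred = np.asarray(pred_counts, dtype=np.float64)
    gt = np.asarray(gt_counts, dtype=np.float64)
    mae = np.mean(np.abs(pred - gt))
    mse = np.sqrt(np.mean((pred - gt) ** 2))
    return mae, mse

# Example with hypothetical predicted vs. ground-truth counts.
mae, mse = counting_errors([105, 48, 230], [100, 50, 250])
print(f"MAE={mae:.2f}, MSE={mse:.2f}")  # MAE=9.00, MSE=11.96
```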
Implications and Future Directions
JHU-CROWD++ and CG-DRCN collectively push the boundaries of crowd counting research through comprehensive dataset attributes and innovative residual counting networks. The dataset's breadth and depth, including detailed annotations and diverse conditions, provide fertile ground for developing new architectures and technologies in crowd analytics.
Future work may explore leveraging even more sophisticated backbone networks, enhancing class-conditioning strategies, and possibly integrating real-time analytics into crowd surveillance systems. Additionally, expanding the dataset to include other scenarios or environments could further facilitate the generalization and robustness of crowd counting models.
In conclusion, the paper’s contributions significantly advance current methodologies and resource availability for crowd counting, offering valuable insights and tools for future explorations in the domain.