Distribution Matching for Crowd Counting (2009.13077v2)

Published 28 Sep 2020 in cs.CV

Abstract: In crowd counting, each training image contains multiple people, where each person is annotated by a dot. Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annotated point. In this paper, we show that imposing Gaussians to annotations hurts generalization performance. Instead, we propose to use Distribution Matching for crowd COUNTing (DM-Count). In DM-Count, we use Optimal Transport (OT) to measure the similarity between the normalized predicted density map and the normalized ground truth density map. To stabilize OT computation, we include a Total Variation loss in our model. We show that the generalization error bound of DM-Count is tighter than that of the Gaussian smoothed methods. In terms of Mean Absolute Error, DM-Count outperforms the previous state-of-the-art methods by a large margin on two large-scale counting datasets, UCF-QNRF and NWPU, and achieves the state-of-the-art results on the ShanghaiTech and UCF-CC50 datasets. DM-Count reduced the error of the state-of-the-art published result by approximately 16%. Code is available at https://github.com/cvlab-stonybrook/DM-Count.

Authors (4)

Boyu Wang (72 papers)
Huidong Liu (13 papers)
Dimitris Samaras (125 papers)
Minh Hoai (48 papers)

Citations (259)


Summary

The paper introduces DM-Count, which replaces traditional Gaussian smoothing with Optimal Transport to align predicted and true density maps.
It combines a Total Variation loss with the OT loss to stabilize the OT computation, and reduces the error of the best previously published result on NWPU by approximately 16%.
Results on four datasets (UCF-QNRF, NWPU, ShanghaiTech, UCF-CC50) show large MAE gains on the two large-scale benchmarks, UCF-QNRF and NWPU, and state-of-the-art results on the other two.

Distribution Matching for Crowd Counting: A Formal Overview

The paper "Distribution Matching for Crowd Counting" addresses a limitation of current methods for estimating crowd density from images. Existing approaches smooth each dot annotation with a Gaussian kernel, which the authors show hurts generalization. They propose DM-Count, which uses Optimal Transport (OT) to measure the discrepancy between the normalized predicted density map and the normalized ground truth density map, with no smoothing of the annotated dots.
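As a rough formalization of the idea in this paragraph, the OT cost between the two normalized maps can be written in the standard Kantorovich form. The notation here is assumed for illustration and may differ from the paper's: $\hat{z}$ is the predicted density map, $z$ is the dot-annotation map, both flattened to $n$ pixels, and $C$ is a pixel-to-pixel cost matrix (for example, squared distance between pixel coordinates).

```latex
% Normalize both maps so that each sums to 1 (a probability distribution over pixels).
\[
\mu = \frac{\hat{z}}{\lVert \hat{z} \rVert_1}, \qquad
\nu = \frac{z}{\lVert z \rVert_1}
\]
% Kantorovich optimal transport cost: the cheapest plan gamma that moves mu onto nu.
\[
\mathcal{W}(\mu, \nu) = \min_{\gamma \in \Gamma} \langle C, \gamma \rangle, \qquad
\Gamma = \left\{ \gamma \in \mathbb{R}_{+}^{n \times n} :
\gamma \mathbf{1} = \mu,\ \gamma^{\top} \mathbf{1} = \nu \right\}
\]
```

Because $\nu$ is built directly from the dots, no kernel width has to be chosen, which is the point of dropping Gaussian smoothing.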

Key Contributions

The paper makes five contributions:

Analysis of Gaussian Smoothing: The authors provide both theoretical and empirical evidence indicating the adverse effects of Gaussian smoothing on the model's generalization performance. They argue that the use of Gaussian kernels introduces unnecessary bias and complicates the estimation process.
Distribution Matching using Optimal Transport: DM-Count employs OT to evaluate the similarity between the normalized density maps produced by the model and the ground truth maps. This method circumvents many of the limitations associated with Gaussian kernel-based approaches.
Incorporation of Total Variation Loss: To stabilize the OT computation, the approach adds a Total Variation (TV) loss alongside the OT loss. As the paper explains it, the OT term is only computed approximately, and the approximation is weakest in low-density regions of the crowd; the TV term compensates there.
Stronger Generalization Bounds: The authors show that the generalization error bound of DM-Count is tighter than that of Gaussian-smoothed methods.
Empirical Validation: DM-Count is evaluated on four widely used crowd counting datasets (UCF-QNRF, NWPU, ShanghaiTech, UCF-CC50). In terms of Mean Absolute Error (MAE), it outperforms previous state-of-the-art methods by a large margin on the two large-scale datasets, UCF-QNRF and NWPU, and achieves state-of-the-art results on ShanghaiTech and UCF-CC50. On NWPU it reduces the error of the best published result by approximately 16%.
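The loss terms described in the list above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 1D pixel grid, the Sinkhorn regularization strength, and the weights `lam_ot` and `lam_tv` are all hypothetical choices. The absolute count-difference term is not mentioned in the summary above; it is included as an assumption because normalizing the maps discards the total count.

```python
import math


def sinkhorn_cost(mu, nu, cost, reg=0.1, n_iters=200):
    """Approximate OT cost between two histograms (each summing to 1)
    using entropy-regularized Sinkhorn iterations."""
    n, m = len(mu), len(nu)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    tiny = 1e-300  # guards against division by zero
    for _ in range(n_iters):
        u = [mu[i] / max(sum(kernel[i][j] * v[j] for j in range(m)), tiny)
             for i in range(n)]
        v = [nu[j] / max(sum(kernel[i][j] * u[i] for i in range(n)), tiny)
             for j in range(m)]
    # Transport plan is diag(u) * kernel * diag(v); return its total cost.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


def squared_distance_cost(n):
    """Cost matrix for a 1D grid of n pixels (assumed for illustration)."""
    return [[float((i - j) ** 2) for j in range(n)] for i in range(n)]


def dm_count_style_loss(pred, gt, cost, lam_ot=0.1, lam_tv=0.01):
    """Count term + OT term + TV term on flattened density maps.
    The weights are illustrative placeholders."""
    pred_count, gt_count = sum(pred), sum(gt)
    count_loss = abs(pred_count - gt_count)
    # Normalize both maps so they can be compared as distributions.
    p = [x / max(pred_count, 1e-12) for x in pred]
    g = [x / max(gt_count, 1e-12) for x in gt]
    ot_loss = sinkhorn_cost(p, g, cost)
    # Total variation distance between the two normalized maps.
    tv_loss = 0.5 * sum(abs(a - b) for a, b in zip(p, g))
    return count_loss + lam_ot * ot_loss + lam_tv * gt_count * tv_loss


# Toy example: two annotated dots on a 6-pixel strip, no Gaussian smoothing.
gt = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
cost = squared_distance_cost(len(gt))
near = [0.1, 0.8, 0.1, 0.1, 0.8, 0.1]  # mass close to the dots
far = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]   # same count, wrong places
print(dm_count_style_loss(near, gt, cost), dm_count_style_loss(far, gt, cost))
```

Both toy predictions have the correct total count, so the count term alone cannot tell them apart; the OT and TV terms penalize the prediction whose mass sits farther from the annotated dots.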

Numerical Results and Implications

The reported numbers support the method. DM-Count lowers MAE by a large margin on the two large-scale datasets, UCF-QNRF and NWPU, and achieves state-of-the-art results on ShanghaiTech and UCF-CC50. Because MAE measures only the error in the predicted count per image, these results speak to counting accuracy rather than to localization.
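For readers unfamiliar with the metric, the sketch below shows how MAE over per-image counts and a relative error reduction are computed. The function names and all numbers are hypothetical toy values chosen for illustration; they are not results from the paper.

```python
def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(true_counts)


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical counts for three test images.
predicted = [110, 240, 45]
actual = [100, 250, 40]
print(mean_absolute_error(predicted, actual))  # (10 + 10 + 5) / 3, about 8.33

# A hypothetical drop in MAE from 10.0 to 8.4 is a 16% relative reduction.
print(relative_reduction(10.0, 8.4))
```

A predicted density map yields its count by summing all pixel values, so MAE says nothing about where in the image the predicted mass lies.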

These improvements have practical implications: more reliable crowd estimates can support public safety planning and resource allocation during mass events. On the theoretical side, the tighter generalization bound gives a basis for further work on OT-based losses that avoid Gaussian smoothing.

Future Directions

Building upon the promising results of this paper, future research could explore adaptation of DM-Count for real-time applications, potentially enhancing video surveillance systems with instantaneous crowd density predictions. Moreover, investigating the applicability of this model to related domains such as cell or livestock counting presents an intriguing avenue for multidisciplinary impact.

Conclusion

This paper replaces Gaussian smoothing of dot annotations with an Optimal Transport loss between normalized density maps, stabilized by a Total Variation term. The approach is backed by a tighter generalization bound and by state-of-the-art results on four benchmarks, and it offers a more principled way to train density-based crowd counters directly from dot annotations.


GitHub

GitHub - cvlab-stonybrook/DM-Count: Code for NeurIPS 2020 paper: Distribution Matching for Crowd Counting. (217 stars)