- The paper introduces the 'focus for free' method that leverages point annotations as dual supervisory signals to enhance object counting performance.
- It converts point annotations into binary segmentation maps and uses pixel ratio cues for global density regularization, refining focus within images.
- Experimental results on datasets like ShanghaiTech and UCF-QNRF demonstrate reduced counting errors and state-of-the-art performance.
An Expert Analysis of "Counting with Focus for Free"
The paper "Counting with Focus for Free" by Zenglin Shi, Pascal Mettes, and Cees G. M. Snoek presents a sophisticated approach to object counting in images with convolutional neural networks (CNNs). The authors argue that point annotations, widely used in state-of-the-art density-based counting methods, can be exploited further to provide additional supervisory signals at no extra labelling cost. The paper's central contribution is a set of supervised attention mechanisms derived from these existing point annotations, which measurably improve counting accuracy.
Summary and Contributions
The core advance of the paper is the "focus for free" methodology, in which the point annotations traditionally used only to create density maps are extended beyond that primary function. This dual use is realized in two distinct ways:
- Focus from Segmentation: The authors convert point annotations into binary segmentation maps, which are used to train an auxiliary branch that learns to emphasize regions of interest within an image. The segmentation maps steer the network's focus towards annotated areas, yielding better count estimates.
- Focus from Global Density: A second supervision signal is the ratio of point annotations to image pixels. Feeding this ratio to another network branch provides global density regularization, improving the overall density map estimate.
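Both supervisory signals can be derived mechanically from the point annotations themselves. The sketch below illustrates the idea in NumPy; the function names, disk radius, and the L2 form of the density regularizer are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def points_to_segmentation(points, height, width, radius=4):
    """Turn (x, y) point annotations into a binary segmentation map by
    marking a small disk around every annotated point (the radius is an
    illustrative choice)."""
    seg = np.zeros((height, width), dtype=np.uint8)
    ys, xs = np.ogrid[:height, :width]
    for x, y in points:
        seg[(xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2] = 1
    return seg

def global_density_loss(pred_density, points):
    """Hypothetical L2 regularizer pulling the mean of the predicted
    density map toward the annotation-to-pixel ratio."""
    h, w = pred_density.shape
    target = len(points) / float(h * w)
    return (pred_density.mean() - target) ** 2
```

Note that both functions consume only the point annotations already required for density-map supervision, which is what makes the extra focus signals "free".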
Additionally, the authors introduce an improved non-uniform kernel size estimator for point annotations, which accommodates the wide variation in object density typical of real-world images.
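The general idea of density-adaptive kernels, choosing each point's Gaussian bandwidth from the distance to its nearest annotated neighbours, can be sketched as follows. This follows the geometry-adaptive kernels popularized in earlier crowd-counting work rather than the paper's improved estimator; `k` and `beta` are illustrative parameters:

```python
import numpy as np

def adaptive_sigmas(points, k=3, beta=0.3):
    """Per-point Gaussian bandwidth proportional to the mean distance to
    the k nearest neighbouring annotations: dense regions get narrow
    kernels, sparse regions wide ones."""
    pts = np.asarray(points, dtype=np.float64)
    # Pairwise Euclidean distances between all annotated points.
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)        # exclude each point's self-distance
    knn = np.sort(d, axis=1)[:, :k]    # k smallest distances per point
    return beta * knn.mean(axis=1)
```

Each returned sigma would then parameterize the Gaussian blob placed at the corresponding point when rendering the ground-truth density map.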
Experimental Results
Empirical evaluations span six datasets, including WIDER FACE, newly repurposed here for counting, where large variations in scale and crowding make the task especially challenging. In all evaluated scenarios, the proposed methods consistently reduce the counting error, surpassing the prior state of the art.
Of particular note is that state-of-the-art performance is achieved with a single network, with significant improvements on datasets such as ShanghaiTech and UCF-QNRF. This suggests a robust and scalable counting approach that generalizes across base architectures.
Implications and Future Research
From a theoretical perspective, this work shows how supplementary supervisory signals can be extracted from existing annotations, pushing the boundaries of conventional network training. Practically, it implies substantial savings in annotation time and expense while improving model performance, a concrete advantage for deploying counting systems in resource-constrained environments.
Moving forward, investigating the extension of similar methodologies to other domains, such as semi-supervised or unsupervised settings, presents an intriguing avenue for forthcoming research. The success of leveraging task-specific annotations for auxiliary supervision could inspire analogous advancements across various computer vision tasks, such as segmentation, detection, and beyond.
In summary, the paper "Counting with Focus for Free" introduces a practical, cost-effective strategy for object counting that could serve as a foundation for future innovations in the field, given its gains in accuracy and annotation efficiency.