Few-shot Object Counting with Similarity-Aware Feature Enhancement (2201.08959v5)

Published 22 Jan 2022 in cs.CV

Abstract: This work studies the problem of few-shot object counting, which counts the number of exemplar objects (i.e., described by one or several support images) occurring in the query image. The major challenge lies in that the target objects can be densely packed in the query image, making it hard to recognize every single one. To tackle the obstacle, we propose a novel learning block, equipped with a similarity comparison module and a feature enhancement module. Concretely, given a support image and a query image, we first derive a score map by comparing their projected features at every spatial position. The score maps regarding all support images are collected together and normalized across both the exemplar dimension and the spatial dimensions, producing a reliable similarity map. We then enhance the query feature with the support features by employing the developed point-wise similarities as the weighting coefficients. Such a design encourages the model to inspect the query image by focusing more on the regions akin to the support images, leading to much clearer boundaries between different objects. Extensive experiments on various benchmarks and training setups suggest that we surpass the state-of-the-art methods by a sufficiently large margin. For instance, on a recent large-scale FSC-147 dataset, we surpass the state-of-the-art method by improving the mean absolute error from 22.08 to 14.32 (35%$\uparrow$). Code has been released in https://github.com/zhiyuanyou/SAFECount.

Citations (49)

Summary

The paper introduces SAFECount, leveraging a Similarity Comparison Module and Feature Enhancement Module for effective few-shot object counting.
The method enhances the query image features with a similarity map computed against the support images, which helps separate densely packed objects for more precise counting.
On FSC-147, SAFECount lowers the mean absolute error from 22.08 to 14.32 (a 35% reduction), and tests on other datasets such as CARPK indicate cross-dataset generalization.

Few-shot Object Counting with Similarity-Aware Feature Enhancement

This paper addresses few-shot object counting: counting the instances of a target object in a query image, given only a few support images that depict the object. Unlike conventional object counting, which requires extensive training data for the specific target class, few-shot object counting handles novel classes at test time without retraining, which improves how well a model generalizes.

The authors propose an architecture termed SAFECount, built around a core component, the Similarity-Aware Feature Enhancement block. This block consists of two modules: the Similarity Comparison Module (SCM) and the Feature Enhancement Module (FEM). The SCM generates a similarity map that compares the features of the support and query images. It does so in three steps: a learnable feature projection, a feature comparison implemented as a convolution, and a normalization across both the exemplar dimension and the spatial dimensions so that the scores are comparable. The resulting similarity map highlights the regions of the query image that resemble the provided exemplar objects.
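The three SCM steps can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single shared 1x1 projection matrix `proj`, odd-sized support features, a softmax across exemplars, and an exponentiated max-normalization across spatial positions. The source only states that normalization spans the exemplar and spatial dimensions, so these specific choices and all function names are illustrative.

```python
import numpy as np

def score_map(query, support, proj):
    """Cross-correlate projected query features with one projected support.

    query: (C, H, W), support: (C, h, w) with odd h and w, proj: (C, C).
    Returns an (H, W) map of scaled inner products.
    """
    # Step 1: shared projection, equivalent to a 1x1 convolution (assumed form).
    q = np.einsum("dc,chw->dhw", proj, query)
    s = np.einsum("dc,chw->dhw", proj, support)
    C, h, w = s.shape
    _, H, W = q.shape
    # Zero-pad so the output keeps the query's spatial size.
    q_pad = np.pad(q, ((0, 0), (h // 2, h // 2), (w // 2, w // 2)))
    out = np.empty((H, W))
    # Step 2: slide the support feature over the query as a convolution kernel.
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(q_pad[:, i:i + h, j:j + w] * s)
    # Scaling by sqrt(C*h*w) is an assumption that keeps the exponentials tame.
    return out / np.sqrt(C * h * w)

def similarity_map(query, supports, proj):
    """Stack K score maps and normalize them. Returns a (K, H, W) array."""
    scores = np.stack([score_map(query, s, proj) for s in supports])
    # Step 3a: softmax across the exemplar dimension (assumed normalization).
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    exemplar_norm = e / e.sum(axis=0, keepdims=True)
    # Step 3b: per-exemplar spatial normalization so each map peaks at 1.
    spatial_norm = np.exp(scores - scores.max(axis=(1, 2), keepdims=True))
    return exemplar_norm * spatial_norm

# Toy usage with random features (hypothetical sizes).
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 7, 7))
supports = [rng.normal(size=(4, 3, 3)) for _ in range(2)]
sim = similarity_map(query, supports, np.eye(4))
print(sim.shape)  # (2, 7, 7)
```

The explicit double loop keeps the comparison step readable; a real model would use a batched convolution from a deep learning framework instead.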

The FEM uses the similarity map as a set of weighting coefficients: the support features are weighted by the point-wise similarities and used to enhance the query features, which emphasizes the regions of the query image that match the support images. As a result, the model can draw clearer boundaries between densely packed objects, a central difficulty in object counting. A density map is then regressed from the enhanced feature map, and its sum gives the predicted object count.
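A hedged sketch of the weighting idea follows. It is a simplification under stated assumptions, not the FEM as published: each support feature is reduced to a single prototype vector by spatial averaging, the weighted prototypes are added to the query feature as a residual, and the names `enhance_query`, `count_from_density`, and `alpha` are hypothetical. The similarity map is taken as an input of shape (K, H, W).

```python
import numpy as np

def enhance_query(query, supports, sim, alpha=1.0):
    """Add similarity-weighted support information to the query feature.

    query: (C, H, W), supports: list of K arrays shaped (C, h, w),
    sim: (K, H, W) point-wise similarities, alpha: residual weight (assumed).
    """
    # Assumption: summarize each support by its spatial mean, giving (K, C).
    prototypes = np.stack([s.mean(axis=(1, 2)) for s in supports])
    # Weight every prototype by the similarity at each position, sum over K.
    weighted = np.einsum("khw,kc->chw", sim, prototypes)
    # Residual fusion: positions with high similarity receive a larger update.
    return query + alpha * weighted

def count_from_density(density):
    """The predicted count is the sum of the regressed density map."""
    return float(density.sum())

# Toy usage: one support, similarity concentrated at position (1, 1).
query = np.zeros((2, 3, 3))
supports = [np.ones((2, 1, 1))]
sim = np.zeros((1, 3, 3))
sim[0, 1, 1] = 0.5
enhanced = enhance_query(query, supports, sim)
print(enhanced[:, 1, 1])  # [0.5 0.5]
```

Only the position with non-zero similarity is changed, which is the behaviour the text describes: regions akin to the support images are emphasized and the rest of the query feature is left as it was.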

The experiments show that SAFECount outperforms the previous state-of-the-art methods. For instance, on the large-scale FSC-147 dataset, the method reduces the mean absolute error from the previous best of 22.08 to 14.32, a relative improvement of about 35%. The summary attributes this result to the model's ability to handle densely packed objects in the few-shot setting, and the method also generalizes across datasets, as evidenced by tests on datasets such as CARPK.
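The 35% figure can be checked directly from the two reported errors. The short snippet below also spells out the standard definition of mean absolute error over per-image counts; the example counts are made up for illustration and the function names are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def relative_reduction(baseline, improved):
    """Fraction by which the error dropped relative to the baseline."""
    return (baseline - improved) / baseline

# Reported FSC-147 errors: 22.08 (previous best) and 14.32 (SAFECount).
print(f"{relative_reduction(22.08, 14.32):.1%}")  # 35.1%

# Made-up counts for three images, only to show how MAE is computed.
print(mean_absolute_error([10, 20, 33], [12, 20, 30]))  # (2 + 0 + 3) / 3
```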

With this work, the authors provide a flexible framework that broadens the applicability of object counting systems. Specifically, it reduces the need for extensive annotated datasets for each new object class. This could benefit fields that require adaptive computer vision without large labeled datasets, such as wildlife monitoring, urban planning, and inventory management.

Looking ahead, the work suggests that combining explicit similarity maps with learned feature representations is a promising direction for further research on object recognition and counting, and could extend the practical reach of such systems to diverse and dynamic environments.

GitHub

GitHub - zhiyuanyou/SAFECount: [WACV 2023] Few-shot Object Counting with Similarity-Aware Feature Enhancement (124 stars)