Few-shot Fine-grained Image Classification via Multi-Frequency Neighborhood and Double-cross Modulation (2207.08547v2)

Published 18 Jul 2022 in cs.CV

Abstract: Traditional fine-grained image classification typically relies on large-scale training samples with annotated ground-truth. However, some sub-categories have few available samples in real-world applications, and current few-shot models still have difficulty distinguishing subtle differences among fine-grained categories. To address this challenge, we propose a novel few-shot fine-grained image classification network (FicNet) using multi-frequency neighborhood (MFN) and double-cross modulation (DCM). MFN operates in both the spatial and frequency domains to capture multi-frequency structural representations, which reduces the influence of appearance and background changes on the intra-class distance. DCM consists of a bi-crisscross component and a double 3D cross-attention component. It modulates the representations by considering global context information and inter-class relationships, respectively, which enables the support and query samples to respond to the same parts and accurately identify subtle inter-class differences. Comprehensive experiments on three fine-grained benchmark datasets for two few-shot tasks verify that FicNet performs strongly compared with state-of-the-art methods. In particular, on two datasets, "Caltech-UCSD Birds" and "Stanford Cars", FicNet obtains classification accuracies of 93.17% and 95.36%, respectively, which are even higher than those that general fine-grained image classification methods can achieve.
