
Knowledge Mining with Scene Text for Fine-Grained Recognition (2203.14215v1)

Published 27 Mar 2022 in cs.CV

Abstract: Recently, the semantics of scene text has been proven to be essential in fine-grained image classification. However, existing methods mainly exploit the literal meaning of scene text for fine-grained recognition, which may be irrelevant when the text is not closely related to the objects or scenes in the image. We propose an end-to-end trainable network that mines the implicit contextual knowledge behind scene-text images and enhances the semantics and correlation to fine-tune the image representation. Unlike existing methods, our model integrates three modalities for fine-grained image classification: visual feature extraction, text semantics extraction, and correlated background knowledge. Specifically, we employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification. Experiments on two benchmark datasets, Con-Text and Drink Bottle, show that our method outperforms the state of the art by 3.72% mAP and 5.39% mAP, respectively. To further validate the effectiveness of the proposed method, we create a new dataset on crowd activity recognition for evaluation. The source code and new dataset of this work are available at https://github.com/lanfeng4659/KnowledgeMiningWithSceneText.
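
The abstract describes combining three kinds of features (visual, scene-text semantics, and retrieved background knowledge) before classification. The sketch below is only a rough illustration of that idea using plain late fusion by concatenation followed by one linear layer and a softmax. It is not the authors' architecture: the function names (`softmax`, `fuse_and_classify`), the feature dimensions, and the random toy weights are all hypothetical, and no real KnowBert or image backbone is involved.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(visual, text, knowledge, weights, bias):
    """Concatenate three feature vectors, then apply a linear layer and softmax.

    Hypothetical helper: stands in for whatever fusion the real model uses.
    """
    fused = list(visual) + list(text) + list(knowledge)
    if any(len(row) != len(fused) for row in weights) or len(weights) != len(bias):
        raise ValueError("weight/bias shapes do not match the fused feature size")
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy inputs: random vectors standing in for real extracted features (assumed sizes).
rng = random.Random(0)
visual = [rng.gauss(0, 1) for _ in range(4)]     # placeholder image feature
text = [rng.gauss(0, 1) for _ in range(3)]       # placeholder scene-text embedding
knowledge = [rng.gauss(0, 1) for _ in range(3)]  # placeholder knowledge embedding

num_classes = 5
fused_dim = len(visual) + len(text) + len(knowledge)
weights = [[rng.gauss(0, 0.1) for _ in range(fused_dim)] for _ in range(num_classes)]
bias = [0.0] * num_classes

probs = fuse_and_classify(visual, text, knowledge, weights, bias)
predicted = max(range(num_classes), key=probs.__getitem__)
print("class probabilities:", [round(p, 3) for p in probs])
print("predicted class index:", predicted)
```

Concatenation is the simplest possible fusion choice; a real system might instead weight the text and knowledge features by their relevance to the image, which this sketch does not attempt.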

Authors (8)
  1. Hao Wang (1120 papers)
  2. Junchao Liao (3 papers)
  3. Tianheng Cheng (31 papers)
  4. Zewen Gao (1 paper)
  5. Hao Liu (497 papers)
  6. Bo Ren (60 papers)
  7. Xiang Bai (222 papers)
  8. Wenyu Liu (146 papers)
Citations (12)