Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge (2203.14499v1)

Published 28 Mar 2022 in cs.CV

Abstract: Novel object captioning aims at describing objects absent from training data, with the key ingredient being the provision of object vocabulary to the model. Although existing methods heavily rely on an object detection model, we view the detection step as vocabulary retrieval from an external knowledge in the form of embeddings for any object's definition from Wiktionary, where we use in the retrieval image region features learned from a transformers model. We propose an end-to-end Novel Object Captioning with Retrieved vocabulary from External Knowledge method (NOC-REK), which simultaneously learns vocabulary retrieval and caption generation, successfully describing novel objects outside of the training dataset. Furthermore, our model eliminates the requirement for model retraining by simply updating the external knowledge whenever a novel object appears. Our comprehensive experiments on held-out COCO and Nocaps datasets show that our NOC-REK is considerably effective against SOTAs.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Duc Minh Vo (16 papers)
  2. Hong Chen (230 papers)
  3. Akihiro Sugimoto (21 papers)
  4. Hideki Nakayama (59 papers)
Citations (11)

Summary

We haven't generated a summary for this paper yet.