
Visual Question Answering with Prior Class Semantics (2005.01239v1)

Published 4 May 2020 in cs.CV and cs.LG

Abstract: We present a novel mechanism to embed prior knowledge in a model for visual question answering. The open-set nature of the task is at odds with the ubiquitous approach of training a fixed classifier. We show how to exploit additional information pertaining to the semantics of candidate answers. We extend the answer prediction process with a regression objective in a semantic space, in which we project candidate answers using prior knowledge derived from word embeddings. We perform an extensive study of learned representations with the GQA dataset, revealing that important semantic information is captured in the relations between embeddings in the answer space. Our method brings improvements in consistency and accuracy over a range of question types. Experiments with novel answers, unseen during training, indicate the method's potential for open-set prediction.
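
The abstract's central idea, predicting a point in a semantic space and matching it against embedded candidate answers, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `predict_answer`), the 3-dimensional toy vectors, and the use of cosine similarity for matching are all assumptions made for illustration. In the paper's setting the predicted vector would come from a trained model and the answer vectors from word embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_answer(predicted_vec, answer_embeddings):
    """Return the candidate answer whose embedding is closest to the prediction.

    Hypothetical helper: nearest-neighbour lookup in the answer space.
    """
    return max(
        answer_embeddings,
        key=lambda ans: cosine(predicted_vec, answer_embeddings[ans]),
    )

# Toy embeddings (made-up values standing in for word embeddings).
answer_embeddings = {
    "red": [1.0, 0.1, 0.0],
    "blue": [0.9, -0.2, 0.1],
    "dog": [0.0, 1.0, 0.2],
}

# Stand-in for the vector a model would regress for a given image and question.
predicted = [0.1, 0.9, 0.3]
best = predict_answer(predicted, answer_embeddings)

# Open-set case: a candidate answer that was not in the original set can be
# added just by supplying its embedding; no classifier output needs retraining.
answer_embeddings["cat"] = [0.1, 0.9, 0.35]
best_open = predict_answer(predicted, answer_embeddings)
```

Because answers are matched by proximity rather than by a fixed output index, the candidate set can grow after training, which is the property the abstract refers to as open-set prediction.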

Authors (4)
  1. Violetta Shevchenko (6 papers)
  2. Damien Teney (43 papers)
  3. Anthony Dick (24 papers)
  4. Anton van den Hengel (188 papers)
Citations (7)
