Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Proposing Plausible Answers for Open-ended Visual Question Answering (1610.06620v2)

Published 20 Oct 2016 in cs.CL, cs.AI, and cs.CV

Abstract: Answering open-ended questions is an essential capability for any intelligent agent. One of the most interesting recent open-ended question answering challenges is Visual Question Answering (VQA) which attempts to evaluate a system's visual understanding through its answers to natural language questions about images. There exist many approaches to VQA, the majority of which do not exhibit deeper semantic understanding of the candidate answers they produce. We study the importance of generating plausible answers to a given question by introducing the novel task of `Answer Proposal': for a given open-ended question, a system should generate a ranked list of candidate answers informed by the semantics of the question. We experiment with various models including a neural generative model as well as a semantic graph matching one. We provide both intrinsic and extrinsic evaluations for the task of Answer Proposal, showing that our best model learns to propose plausible answers with a high recall and performs competitively with some other solutions to VQA.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Omid Bakhshandeh (1 paper)
  2. Trung Bui (79 papers)
  3. Zhe Lin (163 papers)
  4. Walter Chang (21 papers)
Citations (1)