Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering (2009.11118v1)

Published 23 Sep 2020 in cs.CV

Abstract: Different approaches have been proposed for Visual Question Answering (VQA). However, few works examine how different joint modality methods behave with respect to question-type prior knowledge extracted from data, which constrains the answer search space and gives a reliable cue for reasoning about answers to questions asked about input images. In this paper, we propose a novel VQA model that uses question-type prior information to improve VQA by leveraging multiple interactions between different joint modality methods, based on their behavior in answering questions of different types. Experiments on two benchmark datasets, VQA 2.0 and TDIUC, indicate that the proposed method yields the best performance compared with the most competitive approaches.

Authors (6)
  1. Tuong Do (20 papers)
  2. Binh X. Nguyen (9 papers)
  3. Huy Tran (30 papers)
  4. Erman Tjiputra (21 papers)
  5. Quang D. Tran (20 papers)
  6. Thanh-Toan Do (92 papers)
Citations (2)