
Generative Visual Dialogue System via Adaptive Reasoning and Weighted Likelihood Estimation (1902.09818v2)

Published 26 Feb 2019 in cs.CV

Abstract: The key challenge of generative Visual Dialogue (VD) systems is to respond to human queries with informative answers in a natural and contiguous conversation flow. Traditional Maximum Likelihood Estimation (MLE)-based methods learn only from positive responses and ignore negative responses, and consequently tend to yield safe or generic responses. To address this issue, we propose a novel training scheme in conjunction with a weighted likelihood estimation (WLE) method. Furthermore, an adaptive multi-modal reasoning module is designed to accommodate various dialogue scenarios automatically and select relevant information accordingly. The experimental results on the VisDial benchmark demonstrate the superiority of our proposed algorithm over other state-of-the-art approaches, with an improvement of 5.81% on recall@10.
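The recall@10 figure quoted in the abstract is a retrieval-style metric: for each dialogue round the model ranks a list of candidate answers, and recall@k is the fraction of rounds in which the ground-truth answer lands in the top k. The snippet below is a minimal sketch of that computation, not the authors' evaluation code. The function names `rank_of_ground_truth` and `recall_at_k`, and the use of per-candidate scores (for example, log-likelihoods from a generative decoder), are assumptions made for illustration.

```python
from typing import Sequence


def rank_of_ground_truth(scores: Sequence[float], gt_index: int) -> int:
    """Return the 1-indexed rank of the ground-truth candidate.

    `scores` holds one score per candidate answer (higher is better);
    this is a hypothetical input format, not taken from the paper.
    """
    gt_score = scores[gt_index]
    # Rank = 1 + number of candidates scored strictly higher than the ground truth.
    return 1 + sum(1 for s in scores if s > gt_score)


def recall_at_k(gt_ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of rounds whose ground-truth answer is ranked within the top k."""
    if not gt_ranks:
        raise ValueError("gt_ranks must be non-empty")
    hits = sum(1 for r in gt_ranks if r <= k)
    return hits / len(gt_ranks)


# Toy example: ground-truth ranks for five dialogue rounds.
ranks = [1, 4, 12, 10, 57]
print(recall_at_k(ranks, k=10))  # 3 of 5 rounds are in the top 10 -> 0.6
```

Under this definition, a higher recall@10 means the ground-truth answer is placed among the ten best-scoring candidates more often.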

Authors (7)
  1. Heming Zhang (13 papers)
  2. Shalini Ghosh (34 papers)
  3. Larry Heck (41 papers)
  4. Stephen Walsh (3 papers)
  5. Junting Zhang (11 papers)
  6. Jie Zhang (846 papers)
  7. C.-C. Jay Kuo (176 papers)
Citations (7)