Are Current Decoding Strategies Capable of Facing the Challenges of Visual Dialogue? (2210.12997v1)

Published 24 Oct 2022 in cs.CL and cs.CV

Abstract: Decoding strategies play a crucial role in natural language generation systems. They are usually designed and evaluated in open-ended text-only tasks, and it is not clear how different strategies handle the numerous challenges that goal-oriented multimodal systems face (such as grounding and informativeness). To answer this question, we compare a wide variety of different decoding strategies and hyper-parameter configurations in a Visual Dialogue referential game. Although none of them successfully balance lexical richness, accuracy in the task, and visual grounding, our in-depth analysis allows us to highlight the strengths and weaknesses of each decoding strategy. We believe our findings and suggestions may serve as a starting point for designing more effective decoding algorithms that handle the challenges of Visual Dialogue tasks.

Authors (4)
  1. Amit Kumar Chaudhary (1 paper)
  2. Alex J. Lucassen (1 paper)
  3. Ioanna Tsani (1 paper)
  4. Alberto Testoni (13 papers)
Citations (1)
