
Multimodal Multi-Hop Question Answering Through a Conversation Between Tools and Efficiently Finetuned Large Language Models (2309.08922v1)

Published 16 Sep 2023 in cs.CL

Abstract: We employ a tool-interacting divide-and-conquer strategy enabling LLMs to answer complex multimodal multi-hop questions. In particular, we harness the power of LLMs to divide a given multimodal multi-hop question into unimodal single-hop sub-questions, each answered by the appropriate tool from a predefined set of tools. After the corresponding tools provide the LLM with their answers, the LLM generates the next relevant unimodal single-hop sub-question. To increase the reasoning ability of LLMs, we prompt ChatGPT to generate a tool-interacting divide-and-conquer dataset, which is then used to efficiently finetune the corresponding LLM. To assess the effectiveness of this approach, we conduct an evaluation on two recently introduced complex question-answering datasets. The experimental analysis demonstrates substantial improvements over existing state-of-the-art solutions, indicating the efficacy and generality of our strategy.
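The loop described in the abstract — the LLM emits a unimodal single-hop sub-question, a tool answers it, and the answer is fed back until a final answer emerges — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`llm_next_step`, `answer`), the toy tools, and the hard-coded two-hop example are all hypothetical stand-ins for the finetuned LLM and its real tool set.

```python
def llm_next_step(question, history):
    """Hypothetical stand-in for the finetuned LLM: given the original
    multimodal question and the (sub-question, tool answer) history so far,
    return either the next (tool_name, sub_question) pair, or a final answer."""
    if not history:
        return ("image_qa", "Who directed the film shown in the image?"), None
    if len(history) == 1:
        return ("text_qa", "In which year was that director born?"), None
    return None, history[-1][1]  # enough hops resolved; emit final answer

# Toy unimodal tools; the paper routes each single-hop sub-question
# to the appropriate tool from a predefined set.
TOOLS = {
    "image_qa": lambda q: "Christopher Nolan",
    "text_qa":  lambda q: "1970",
}

def answer(question, max_hops=5):
    """Divide-and-conquer loop: alternate between the LLM proposing a
    sub-question and a tool answering it, until a final answer is produced."""
    history = []
    for _ in range(max_hops):
        step, final = llm_next_step(question, history)
        if final is not None:
            return final
        tool_name, sub_q = step
        history.append((sub_q, TOOLS[tool_name](sub_q)))
    return None

print(answer("When was the director of the pictured film born?"))
```

In the paper this conversation is driven by an efficiently finetuned LLM rather than the hard-coded branches above, but the control flow — decompose, route to a tool, fold the answer back in, repeat — is the same.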

Authors (4)
  1. Hossein Rajabzadeh (8 papers)
  2. Suyuchen Wang (16 papers)
  3. Hyock Ju Kwon (5 papers)
  4. Bang Liu (93 papers)
Citations (2)