MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI (2404.16006v1)

Published 24 Apr 2024 in cs.CV

Abstract: Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to assess LVLMs across massive multimodal tasks requiring expert knowledge and deliberate visual recognition, localization, reasoning, and planning. MMT-Bench comprises 31,325 meticulously curated multi-choice visual questions from various multimodal scenarios such as vehicle driving and embodied navigation, covering 32 core meta-tasks and 162 subtasks in multimodal understanding. Due to its extensive task coverage, MMT-Bench enables the evaluation of LVLMs using a task map, facilitating the discovery of in- and out-of-domain tasks. Evaluation results involving 30 LVLMs, such as the proprietary GPT-4V and GeminiProVision and the open-sourced InternVL-Chat, underscore the significant challenges posed by MMT-Bench. We anticipate that MMT-Bench will inspire the community to develop next-generation multimodal foundation models aimed at achieving general-purpose multimodal intelligence.

Authors (22)
  1. Kaining Ying (5 papers)
  2. Fanqing Meng (14 papers)
  3. Jin Wang (356 papers)
  4. Zhiqian Li (2 papers)
  5. Han Lin (53 papers)
  6. Yue Yang (146 papers)
  7. Hao Zhang (947 papers)
  8. Wenbo Zhang (49 papers)
  9. Yuqi Lin (10 papers)
  10. Shuo Liu (123 papers)
  11. Jiayi Lei (7 papers)
  12. Quanfeng Lu (10 papers)
  13. Runjian Chen (20 papers)
  14. Peng Xu (357 papers)
  15. Renrui Zhang (100 papers)
  16. Haozhe Zhang (17 papers)
  17. Peng Gao (401 papers)
  18. Yali Wang (78 papers)
  19. Yu Qiao (563 papers)
  20. Ping Luo (340 papers)
Citations (51)