NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations (2312.06352v1)
Abstract: Visual Question Answering (VQA) is one of the most important tasks in autonomous driving, requiring both accurate recognition and complex situational assessment. However, datasets annotated in a QA format that guarantee precise language generation and scene recognition from driving scenes have not yet been established. In this work, we introduce Markup-QA, a novel dataset annotation technique in which QAs are enclosed within markups. This approach facilitates the simultaneous evaluation of a model's capabilities in sentence generation and VQA. Moreover, using this annotation methodology, we designed the NuScenes-MQA dataset. This dataset empowers the development of vision language models, especially for autonomous driving tasks, by focusing on both descriptive capabilities and precise QA. The dataset is available at https://github.com/turingmotors/NuScenes-MQA.
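To make the annotation idea concrete, below is a minimal sketch of how QAs enclosed within markups might be extracted from generated text and scored alongside the surrounding description. The tag names (`<qa>`, `<q>`, `<a>`) and the exact-match scoring are illustrative assumptions; the paper's actual markup scheme and evaluation protocol may differ.

```python
import re

# Hypothetical Markup-QA-style output: free-form description with QA pairs
# embedded in markup tags. Tag names are assumptions for illustration only.
generated = (
    "The ego vehicle is approaching an intersection. "
    "<qa><q>How many pedestrians are crossing?</q><a>two</a></qa> "
    "<qa><q>Is the traffic light green?</q><a>yes</a></qa>"
)

# Reference answers for the embedded questions (illustrative values).
reference = {
    "How many pedestrians are crossing?": "two",
    "Is the traffic light green?": "no",
}

# Pull out (question, answer) pairs from the markup.
pairs = re.findall(r"<qa><q>(.*?)</q><a>(.*?)</a></qa>", generated)

# Score QA accuracy by exact match against the reference answers; the prose
# with tags stripped could separately be scored with standard captioning
# metrics such as BLEU, ROUGE, or METEOR.
correct = sum(1 for q, a in pairs if reference.get(q, "").lower() == a.lower())
accuracy = correct / max(len(reference), 1)

plain_text = re.sub(r"</?(qa|q|a)>", "", generated)

print(f"QA accuracy: {accuracy:.2f}")  # QA accuracy: 0.50
print(plain_text)                      # description with markup removed
```

Enclosing the QAs in the text itself is what allows a single model output to be evaluated on both axes at once: the tags isolate the answers for QA accuracy, while the same string, with tags stripped, is evaluated as a caption.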
- Yuichi Inoue
- Yuki Yada
- Kotaro Tanahashi
- Yu Yamaguchi