
AD-H: Autonomous Driving with Hierarchical Agents (2406.03474v1)

Published 5 Jun 2024 in cs.CV

Abstract: Due to the impressive capabilities of multimodal LLMs (MLLMs), recent works have focused on employing MLLM-based agents for autonomous driving in large-scale and dynamic environments. However, prevalent approaches often directly translate high-level instructions into low-level vehicle control signals, which deviates from the inherent language generation paradigm of MLLMs and fails to fully harness their emergent capabilities. As a result, the generalizability of these methods is highly restricted by the autonomous driving datasets used during fine-tuning. To tackle this challenge, we propose to connect high-level instructions and low-level control signals with mid-level language-driven commands, which are more fine-grained than high-level instructions but more universal and explainable than control signals, and thus can effectively bridge the gap between the two. We implement this idea through a hierarchical multi-agent driving system named AD-H, comprising an MLLM planner for high-level reasoning and a lightweight controller for low-level execution. The hierarchical design frees the MLLM from low-level control signal decoding and therefore fully releases its emergent capability in high-level perception, reasoning, and planning. We build a new dataset with action hierarchy annotations. Comprehensive closed-loop evaluations demonstrate several key advantages of our proposed AD-H system. First, AD-H notably outperforms state-of-the-art methods in driving performance, even exhibiting self-correction capabilities during vehicle operation, a scenario not encountered in the training dataset. Second, AD-H demonstrates superior generalization under long-horizon instructions and novel environmental conditions, significantly surpassing current state-of-the-art methods. We will make our data and code publicly accessible at https://github.com/zhangzaibin/AD-H
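
The hierarchy described in the abstract (high-level instruction, then mid-level language command, then low-level control signal) can be illustrated with a toy sketch. This is not the authors' implementation: the `Control` fields, the `toy_planner` and `toy_controller` functions, the command strings, and the observation keys are all hypothetical stand-ins, with simple rules in place of the MLLM planner and the learned controller.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Control:
    """Low-level control signal (hypothetical fields, not from the paper)."""
    steer: float     # -1.0 (full left) to 1.0 (full right)
    throttle: float  # 0.0 to 1.0
    brake: float     # 0.0 to 1.0


# Assumed interfaces: a planner maps (instruction, observation) to a mid-level
# command string; a controller maps (command, observation) to a Control.
Planner = Callable[[str, dict], str]
Controller = Callable[[str, dict], Control]


def toy_planner(instruction: str, obs: dict) -> str:
    """Rule-based stand-in for the MLLM planner: emits a mid-level command."""
    if obs.get("obstacle_ahead"):
        return "slow down and stop"
    if "left" in instruction:
        return "turn left gently"
    return "keep lane and go straight"


def toy_controller(command: str, obs: dict) -> Control:
    """Rule-based stand-in for the lightweight controller."""
    if "stop" in command:
        return Control(steer=0.0, throttle=0.0, brake=1.0)
    if "left" in command:
        return Control(steer=-0.3, throttle=0.4, brake=0.0)
    return Control(steer=0.0, throttle=0.5, brake=0.0)


def drive(instruction: str, observations: List[dict],
          planner: Planner, controller: Controller) -> List[Control]:
    """Closed-loop rollout: at each step, plan a command, then decode it."""
    controls = []
    for obs in observations:
        command = planner(instruction, obs)        # high level -> mid level
        controls.append(controller(command, obs))  # mid level -> low level
    return controls
```

In this sketch the planner never emits numeric controls; it only produces text, which mirrors the abstract's point that the MLLM is relieved of control signal decoding. Whether the real system calls the planner at every step is not stated in the abstract and is an assumption here.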

Authors (10)
  1. Zaibin Zhang (6 papers)
  2. Shiyu Tang (15 papers)
  3. Yuanhang Zhang (35 papers)
  4. Talas Fu (1 paper)
  5. Yifan Wang (319 papers)
  6. Yang Liu (2253 papers)
  7. Dong Wang (628 papers)
  8. Jing Shao (109 papers)
  9. Lijun Wang (51 papers)
  10. Huchuan Lu (199 papers)
Citations (1)