
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models (2307.07162v1)

Published 14 Jul 2023 in cs.RO and cs.CL

Abstract: In this paper, we explore the potential of using an LLM to understand the driving environment in a human-like manner and analyze its ability to reason, interpret, and memorize when facing complex scenarios. We argue that traditional optimization-based and modular autonomous driving (AD) systems face inherent performance limitations when dealing with long-tail corner cases. To address this problem, we propose that an ideal AD system should drive like a human, accumulating experience through continuous driving and using common sense to solve problems. To achieve this goal, we identify three key abilities necessary for an AD system: reasoning, interpretation, and memorization. We demonstrate the feasibility of employing an LLM in driving scenarios by building a closed-loop system to showcase its comprehension and environment-interaction abilities. Our extensive experiments show that the LLM exhibits the impressive ability to reason and solve long-tailed cases, providing valuable insights for the development of human-like autonomous driving. The related code is available at https://github.com/PJLab-ADG/DriveLikeAHuman .

Rethinking Autonomous Driving with LLMs

The paper "Drive Like a Human: Rethinking Autonomous Driving with LLMs" explores a novel approach to autonomous driving systems by leveraging the capabilities of LLMs such as GPT-3.5. The authors propose a departure from traditional optimization-based and modular autonomous driving (AD) systems by suggesting that an ideal AD system should mimic human driving behaviors through reasoning, interpretation, and memorization.

Key Contributions

The paper identifies three critical abilities necessary for an AD system:

  1. Reasoning: The ability to employ common sense and experience to make informed decisions in various scenarios.
  2. Interpretation: The ability to introspect and interpret decisions, demonstrating an understanding of declarative memory.
  3. Memorization: The ability to retain experiences and apply them to future similar situations.

To demonstrate the feasibility of LLMs in driving scenarios, the researchers construct a closed-loop system that exercises the model's comprehension of, and interaction with, the driving environment, showcasing its reasoning and problem-solving abilities on complex, long-tail scenarios.
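A minimal sketch of such a closed loop is shown below, assuming a hypothetical `query_llm` helper in place of the actual GPT-3.5 call and using highway-env's discrete meta-actions; the observation formatting here is simplified relative to the paper's actual prompts.

```python
import gymnasium as gym
import highway_env  # registers the highway-env environments on import (version-dependent)

# Discrete meta-actions used by highway-env's default action space.
ACTIONS = {0: "LANE_LEFT", 1: "IDLE", 2: "LANE_RIGHT", 3: "FASTER", 4: "SLOWER"}


def query_llm(prompt: str) -> int:
    """Hypothetical placeholder: send the prompt to an LLM and parse a meta-action index."""
    raise NotImplementedError


def describe(obs) -> str:
    """Turn highway-env's kinematic observation matrix into a short textual scene description."""
    ego, *others = obs  # row 0 is the ego vehicle; remaining rows are nearby vehicles
    lines = [f"Ego vehicle: x={ego[1]:.1f}, y={ego[2]:.1f}, vx={ego[3]:.1f}"]
    lines += [
        f"Vehicle {i}: dx={v[1]:.1f}, dy={v[2]:.1f}, vx={v[3]:.1f}"
        for i, v in enumerate(others)
        if v[0] > 0  # presence flag
    ]
    return "\n".join(lines)


env = gym.make("highway-v0")
obs, info = env.reset()
done = truncated = False
while not (done or truncated):
    prompt = describe(obs) + "\nChoose one action: " + ", ".join(ACTIONS.values())
    action = query_llm(prompt)  # the LLM reasons over the scene and picks a meta-action
    obs, reward, done, truncated, info = env.step(action)
```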

Experiments and Findings

The experiments conducted in this paper reveal that LLMs can exhibit human-like reasoning by making decisions based on common sense. A notable demonstration involves LLMs making nuanced distinctions in driving scenarios, such as discerning whether traffic cones on a truck bed indicate a hazard. This showcases not only understanding but also a capacity for practical decision-making, which traditional systems struggle with due to their lack of common sense.

Furthermore, closed-loop driving in HighwayEnv shows that the LLM, without any task-specific training, outperforms reinforcement-learning-based and search-based methods, achieving a zero-shot pass rate of over 60%. Unlike these conventional approaches, the LLM weighs the potential consequences of its actions before acting, exhibiting a consistent, deliberate decision-making process.
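Under the same assumptions as the earlier sketch (and reusing its `env`, `describe`, `query_llm`, and `ACTIONS` helpers), such a pass rate could be estimated roughly as follows; the paper's exact success criterion and episode configuration may differ.

```python
def evaluate(n_episodes: int = 20) -> float:
    """Rough pass-rate estimate: an episode 'passes' if it ends without a crash."""
    passes = 0
    for _ in range(n_episodes):
        obs, info = env.reset()
        done = truncated = False
        while not (done or truncated):
            prompt = describe(obs) + "\nChoose one action: " + ", ".join(ACTIONS.values())
            obs, reward, done, truncated, info = env.step(query_llm(prompt))
        # highway-env reports collisions via the info dict; the key name may vary by version
        passes += 0 if info.get("crashed", False) else 1
    return passes / n_episodes
```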

The paper also emphasizes the LLM's memorization ability, which is crucial for continual learning. By retaining scenarios in which its decisions deviated from expert feedback, the model can improve its decision-making in future analogous situations.
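A minimal sketch of such a memory, assuming a hypothetical `embed` text-embedding function and simple cosine-similarity retrieval (the paper's actual memory format and retrieval strategy may differ):

```python
from dataclasses import dataclass
import numpy as np


def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder: map a scenario description to an embedding vector."""
    raise NotImplementedError


@dataclass
class Experience:
    scenario: str        # textual description of the situation
    correction: str      # expert feedback / corrected decision
    vector: np.ndarray   # embedding of the scenario description


class Memory:
    def __init__(self) -> None:
        self.items: list[Experience] = []

    def add(self, scenario: str, correction: str) -> None:
        """Store a scenario where the LLM's decision deviated from expert feedback."""
        self.items.append(Experience(scenario, correction, embed(scenario)))

    def recall(self, scenario: str, k: int = 3) -> list[Experience]:
        """Return the k most similar past experiences, for use as few-shot context."""
        if not self.items:
            return []
        q = embed(scenario)
        sims = [
            float(q @ e.vector / (np.linalg.norm(q) * np.linalg.norm(e.vector)))
            for e in self.items
        ]
        order = np.argsort(sims)[::-1][:k]
        return [self.items[i] for i in order]
```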

Implications and Future Directions

From a theoretical standpoint, adopting LLMs in AD systems could significantly shift the development paradigm, moving towards more human-like driving behaviors with improved handling of long-tail corner cases. This shift could alleviate the persistent issue of catastrophic forgetting in optimization-based methods.

Practically, the implications extend to improved safety and efficiency in autonomous driving systems as LLMs mature in their understanding of complex and unpredictable driving environments. From a computational perspective, leveraging LLMs could reduce the reliance on large volumes of specific driving data, as LLMs can generalize from broader experiential memories.

Looking forward, exploring the integration of multi-modal capabilities in LLMs could enhance their environmental interaction skills, further approximating the nuanced decision-making processes of human drivers. As LLM-based systems evolve, they may lay the groundwork for the next generation of AGI-driven autonomous vehicles.

In conclusion, this research presents a compelling case for the incorporation of LLMs in autonomous driving systems, paving the way for more resilient and human-like approaches to navigating the complexities of real-world driving environments.

Authors (7)
  1. Daocheng Fu (22 papers)
  2. Xin Li (980 papers)
  3. Licheng Wen (31 papers)
  4. Min Dou (22 papers)
  5. Pinlong Cai (28 papers)
  6. Botian Shi (56 papers)
  7. Yu Qiao (563 papers)
Citations (119)