
SurrealDriver: Designing LLM-powered Generative Driver Agent Framework based on Human Drivers' Driving-thinking Data (2309.13193v2)

Published 22 Sep 2023 in cs.HC

Abstract: Leveraging the advanced reasoning capabilities and extensive world knowledge of LLMs to construct generative agents for solving complex real-world problems is a major trend. However, LLMs inherently lack the embodiment that humans have, resulting in suboptimal performance on many embodied decision-making tasks. In this paper, we introduce a framework for building human-like generative driving agents using post-driving self-report driving-thinking data from human drivers as both demonstration and feedback. To capture high-quality, natural-language data from drivers, we conducted urban driving experiments, recording drivers' verbalized thoughts under various conditions to serve as chain-of-thought prompts and demonstration examples for the LLM agent. The framework's effectiveness was evaluated through simulations and human assessments. Results indicate that incorporating expert demonstration data significantly reduced collision rates, by 81.04%, and increased human likeness by 50% compared to a baseline LLM-based agent. Our study provides insights into using natural-language human demonstration data for embodied tasks. The driving-thinking dataset is available at https://github.com/AIR-DISCOVER/Driving-Thinking-Dataset.
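The abstract describes using drivers' verbalized thoughts as chain-of-thought prompts and demonstration examples for an LLM agent. A minimal sketch of what such prompt assembly might look like is below; the record fields, scene descriptions, and prompt wording are illustrative assumptions, not taken from the paper or its dataset.

```python
# Hypothetical sketch: formatting driver "driving-thinking" verbalizations as
# few-shot chain-of-thought demonstrations for an LLM driving agent.
# Field names ("scene", "thinking", "action") and all text are illustrative.

def build_prompt(demonstrations, current_scene):
    """Assemble a few-shot prompt in which each demonstration pairs an
    observed traffic scene with the driver's verbalized reasoning and the
    action taken, then poses the current scene for the model to reason about."""
    parts = ["You are a careful, human-like driver. Think step by step."]
    for demo in demonstrations:
        parts.append(
            f"Scene: {demo['scene']}\n"
            f"Driver's thinking: {demo['thinking']}\n"
            f"Action: {demo['action']}"
        )
    parts.append(f"Scene: {current_scene}\nDriver's thinking:")
    return "\n\n".join(parts)

demos = [
    {
        "scene": "Pedestrian waiting at a crosswalk 20 m ahead, speed 40 km/h.",
        "thinking": "The pedestrian may step out, so I should slow down early "
                    "to be able to stop smoothly if needed.",
        "action": "decelerate to 20 km/h",
    },
]

prompt = build_prompt(demos, "Vehicle merging from the right lane, speed 50 km/h.")
print(prompt)
```

In the paper's framework, a prompt of this shape would be sent to the LLM, whose completion (reasoning plus an action) would then be parsed and executed in the simulator; the completion call itself is omitted here.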

