Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles (2407.17211v1)

Published 24 Jul 2024 in cs.AI, cs.NI, and cs.RO

Abstract: Handling long-tail corner cases is a major challenge for autonomous vehicles (AVs). Large language models (LLMs) hold great potential for handling corner cases, thanks to their strong generalization and explanation capabilities, and have received increasing research interest for autonomous driving applications. However, technical barriers remain, such as the strict performance requirements on the models and the large computing resources LLMs need. In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. A key issue for such an LLM-assisted driving system is assessing how well LLMs understand driving theory and skills, to ensure they are qualified to undertake safety-critical driving assistance tasks for connected autonomous vehicles (CAVs). We design and run driving theory tests for several proprietary LLMs (OpenAI GPT models, Baidu Ernie and Ali QWen) and open-source LLMs (Tsinghua MiniCPM-2B and MiniCPM-Llama3-V2.5) with more than 500 multiple-choice theory test questions. Model accuracy, cost and processing latency are measured in the experiments. The results show that GPT-4 passes the test with improved domain knowledge and Ernie reaches an accuracy of 85% (just below the 86% passing threshold), while the other LLMs, including GPT-3.5, fail the test. For test questions with images, the multimodal model GPT-4o achieves an excellent accuracy of 96%, and MiniCPM-Llama3-V2.5 achieves 76%. While GPT-4 holds stronger potential for CAV driving assistance applications, it is much more expensive to use, costing almost 50 times as much as GPT-3.5. The results can inform decisions on the use of existing LLMs for CAV applications and on balancing model performance against cost.
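The pass/fail decision described in the abstract can be illustrated with a minimal sketch. It assumes each question has a single correct option letter and that the model's reply starts with its chosen letter. The `Question` type, the `grade` function and the answer-parsing rule are hypothetical and not taken from the paper; only the 86% passing threshold comes from the abstract.

```python
from dataclasses import dataclass

# Passing threshold quoted in the abstract (86%).
PASS_PERCENT = 86


@dataclass
class Question:
    """One multiple-choice theory question (hypothetical structure)."""
    prompt: str
    options: dict[str, str]  # option letter -> option text
    answer: str              # correct option letter, e.g. "A"


def grade(questions: list[Question], model_answers: list[str]) -> tuple[float, bool]:
    """Return (accuracy, passed) for one model's answers.

    Assumption: the first non-space character of each reply is the
    chosen option letter; anything else counts as wrong.
    """
    if not questions or len(questions) != len(model_answers):
        raise ValueError("need one answer per question, and at least one question")
    correct = sum(
        reply.strip().upper()[:1] == q.answer.upper()
        for q, reply in zip(questions, model_answers)
    )
    total = len(questions)
    # Integer comparison avoids floating-point edge cases at the threshold.
    passed = correct * 100 >= PASS_PERCENT * total
    return correct / total, passed
```

Under these assumptions, a model with 85 correct answers out of 100 fails and one with 86 passes, which mirrors the gap the abstract reports between Ernie's 85% and the passing threshold.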
