AD4RL: Autonomous Driving Benchmarks for Offline Reinforcement Learning with Value-based Dataset (2404.02429v1)

Published 3 Apr 2024 in cs.LG and cs.AI

Abstract: Offline reinforcement learning has emerged as a promising technology by enhancing its practicality through the use of pre-collected large datasets. Despite its practical benefits, most algorithm development research in offline reinforcement learning still relies on game tasks with synthetic datasets. To address such limitations, this paper provides autonomous driving datasets and benchmarks for offline reinforcement learning research. We provide 19 datasets, including real-world human drivers' datasets, and seven popular offline reinforcement learning algorithms in three realistic driving scenarios. We also provide a unified decision-making process model that can operate effectively across different scenarios, serving as a reference framework in algorithm design. Our research lays the groundwork for further collaborations in the community to explore practical aspects of existing reinforcement learning methods. The dataset and code can be found at https://sites.google.com/view/ad4rl.
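
Since the benchmark evaluates value-based offline reinforcement learning algorithms on logged driving data, the sketch below illustrates the kind of update such methods perform on a fixed batch of transitions. It is a minimal, illustrative TD3+BC-style step in PyTorch, assuming a generic (state, action, reward, next state, done) batch; the network sizes, hyperparameters, and variable names are hypothetical and do not reflect AD4RL's actual API or the paper's exact implementations.

# Minimal sketch of a TD3+BC-style offline update on a generic transition batch.
# All names, shapes, and hyperparameters are illustrative, not the paper's API.
import torch
import torch.nn as nn

state_dim, action_dim = 16, 2          # hypothetical driving-state / control sizes
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
opt_actor = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt_critic = torch.optim.Adam(critic.parameters(), lr=3e-4)
gamma, alpha = 0.99, 2.5               # discount factor and BC-regularization weight

def offline_update(batch):
    s, a, r, s2, done = batch          # tensors sampled from a fixed, pre-collected dataset

    # Critic: one-step TD target computed entirely from logged data (no environment interaction)
    with torch.no_grad():
        a2 = actor(s2)
        target_q = r + gamma * (1.0 - done) * critic(torch.cat([s2, a2], dim=-1))
    q = critic(torch.cat([s, a], dim=-1))
    critic_loss = nn.functional.mse_loss(q, target_q)
    opt_critic.zero_grad()
    critic_loss.backward()
    opt_critic.step()

    # Actor: maximize Q while staying close to the dataset actions via a behavior-cloning term
    pi = actor(s)
    q_pi = critic(torch.cat([s, pi], dim=-1))
    lmbda = alpha / q_pi.abs().mean().detach()
    actor_loss = -lmbda * q_pi.mean() + nn.functional.mse_loss(pi, a)
    opt_actor.zero_grad()
    actor_loss.backward()
    opt_actor.step()
    return critic_loss.item(), actor_loss.item()

# Example call with a random batch standing in for logged driving transitions
B = 32
batch = (torch.randn(B, state_dim), torch.rand(B, action_dim) * 2 - 1,
         torch.randn(B, 1), torch.randn(B, state_dim), torch.zeros(B, 1))
print(offline_update(batch))

In practice, value-based offline RL methods differ mainly in how they keep the learned policy close to the data distribution; in this sketch that role is played by the behavior-cloning term added to the actor loss.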
