Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation (2311.16091v1)
Abstract: Deep reinforcement learning (DRL) provides a promising way for intelligent agents (e.g., autonomous vehicles) to learn to navigate complex scenarios. However, DRL with neural networks as function approximators is typically considered a black box with little explainability and often suffers from suboptimal performance, especially for autonomous navigation in highly interactive multi-agent environments. To address these issues, we propose three auxiliary tasks with spatio-temporal relational reasoning and integrate them into the standard DRL framework, which improves decision-making performance and provides explainable intermediate indicators. We propose to explicitly infer the internal states (i.e., traits and intentions) of surrounding agents (e.g., human drivers) and to predict their future trajectories in situations with and without the ego agent through counterfactual reasoning. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Multiple variants of framework integration strategies are compared. We also employ a spatio-temporal graph neural network to encode relations between dynamic entities, which enhances both internal state inference and decision making of the ego agent. Moreover, we propose an interactivity estimation mechanism based on the difference between the trajectories predicted in these two situations, which indicates the degree of influence of the ego agent on other agents. To validate the proposed method, we design an intersection driving simulator based on the Intelligent Intersection Driver Model (IIDM) that simulates vehicles and pedestrians. Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics and provides explainable intermediate indicators (i.e., internal states and interactivity scores) for decision making.
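To make the interactivity estimation idea concrete, the minimal sketch below scores each surrounding agent by the average displacement between its trajectory predicted with the ego agent present (conditional) and with the ego agent removed (counterfactual). The function name, array shapes, and the choice of mean Euclidean displacement as the distance measure are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def interactivity_scores(pred_with_ego: np.ndarray,
                         pred_without_ego: np.ndarray) -> np.ndarray:
    """Estimate how strongly the ego agent influences each surrounding agent.

    Both inputs are predicted trajectories of shape (num_agents, horizon, 2),
    holding (x, y) positions over the prediction horizon. `pred_with_ego` is
    the conditional prediction (ego present); `pred_without_ego` is the
    counterfactual prediction (ego removed). The score is the mean Euclidean
    displacement between the two predictions, one value per agent.
    (Hypothetical shapes and distance measure, assumed for illustration.)
    """
    # Per-step Euclidean distance between the two predictions: (num_agents, horizon)
    displacement = np.linalg.norm(pred_with_ego - pred_without_ego, axis=-1)
    # Average over the horizon: one interactivity score per surrounding agent
    return displacement.mean(axis=-1)

# Toy usage: two surrounding agents, a 5-step horizon.
rng = np.random.default_rng(0)
with_ego = rng.normal(size=(2, 5, 2))
# Agent 0 reacts to the ego (shifted trajectory); agent 1 is unaffected.
without_ego = with_ego + np.array([[[0.5, 0.0]], [[0.0, 0.0]]])
print(interactivity_scores(with_ego, without_ego))  # larger score -> more influenced by the ego
```

A larger score marks an agent whose predicted motion changes substantially when the ego agent is removed, which is what the paper uses as an explainable indicator of interaction strength.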
Authors: Jiachen Li, David Isele, Kanghoon Lee, Jinkyoo Park, Kikuo Fujimura, Mykel J. Kochenderfer