
Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation (2311.16091v1)

Published 27 Nov 2023 in cs.RO, cs.AI, cs.CV, cs.LG, and cs.MA

Abstract: Deep reinforcement learning (DRL) provides a promising way for intelligent agents (e.g., autonomous vehicles) to learn to navigate complex scenarios. However, DRL with neural networks as function approximators is typically considered a black box with little explainability and often suffers from suboptimal performance, especially for autonomous navigation in highly interactive multi-agent environments. To address these issues, we propose three auxiliary tasks with spatio-temporal relational reasoning and integrate them into the standard DRL framework, which improves the decision making performance and provides explainable intermediate indicators. We propose to explicitly infer the internal states (i.e., traits and intentions) of surrounding agents (e.g., human drivers) as well as to predict their future trajectories in the situations with and without the ego agent through counterfactual reasoning. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Multiple variants of framework integration strategies are compared. We also employ a spatio-temporal graph neural network to encode relations between dynamic entities, which enhances both internal state inference and decision making of the ego agent. Moreover, we propose an interactivity estimation mechanism based on the difference between predicted trajectories in these two situations, which indicates the degree of influence of the ego agent on other agents. To validate the proposed method, we design an intersection driving simulator based on the Intelligent Intersection Driver Model (IIDM) that simulates vehicles and pedestrians. Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics and provides explainable intermediate indicators (i.e., internal states, and interactivity scores) for decision making.

Authors (6)
  1. Jiachen Li (144 papers)
  2. David Isele (38 papers)
  3. Kanghoon Lee (36 papers)
  4. Jinkyoo Park (75 papers)
  5. Kikuo Fujimura (22 papers)
  6. Mykel J. Kochenderfer (215 papers)
Citations (4)

Summary

  • The paper introduces a DRL method enhanced with internal state inference to predict and adapt to the intentions and actions of surrounding agents.
  • It employs counterfactual reasoning and graph neural networks to model dynamic interactions in complex, multi-agent environments.
  • Experimental results in a simulated intersection demonstrate superior navigation success, collision avoidance, and driving efficiency compared to baselines.

Enhancing Autonomous Navigation via Intrinsic Behavior Modeling

Introduction to Interactive Autonomous Navigation

Autonomous vehicles (AVs) navigating urban environments must interact with other road users, such as vehicles and pedestrians, which makes decision making complex. Traditional deep reinforcement learning (DRL) approaches struggle in these multi-agent settings, both because of their black-box nature and because performance degrades when interactions are dense and highly dynamic. To address these limitations, the authors propose a DRL framework that adds auxiliary mechanisms for reasoning about both the transient intentions and the persistent traits of surrounding agents.
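At a high level, the auxiliary tasks enter the framework as extra loss terms on top of the base DRL objective. The sketch below illustrates this weighted combination; the function name, weights, and loss decomposition are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch: fold auxiliary supervision into the base DRL objective.
# Names and weights are illustrative, not the paper's exact formulation.
def total_loss(rl_loss, internal_state_loss, traj_pred_loss,
               w_state=0.5, w_traj=0.5):
    """Combine losses:
    rl_loss             -- policy/value loss of the base algorithm (e.g., PPO)
    internal_state_loss -- supervised trait/intention inference loss
    traj_pred_loss      -- trajectory prediction loss (with/without ego)
    """
    return rl_loss + w_state * internal_state_loss + w_traj * traj_pred_loss
```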

Understanding Internal States and Agent Interactions

AVs must understand other road users' behaviors, which are shaped by latent characteristics: aggressive drivers take fewer precautions than conservative ones, and the AV should respond to each differently. The proposed method explicitly infers these internal states and integrates the inference with DRL. The inferred states serve as explainable intermediate indicators, and the AV learns to anticipate actions such as yielding or accelerating by observing others' behavior.
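A minimal sketch of what such an inference head could look like, assuming a recurrent encoder over each agent's observed history and discrete trait/intention labels supplied by the simulator (all names and layer sizes are illustrative):

```python
import torch.nn as nn

class InternalStateHead(nn.Module):
    """Illustrative internal-state inference head (names/sizes assumed).

    Encodes an agent's observed state history and predicts a discrete
    trait (e.g., conservative vs. aggressive) and a discrete intention
    (e.g., yield vs. not yield), trainable with cross-entropy losses.
    """
    def __init__(self, obs_dim=4, hidden=64, n_traits=2, n_intentions=2):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden, batch_first=True)
        self.trait_head = nn.Linear(hidden, n_traits)
        self.intention_head = nn.Linear(hidden, n_intentions)

    def forward(self, history):          # history: (B, T, obs_dim)
        _, h = self.encoder(history)     # h: (1, B, hidden)
        h = h.squeeze(0)
        return self.trait_head(h), self.intention_head(h)
```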

To achieve this, the framework employs counterfactual reasoning: it predicts the future trajectories of surrounding agents both with and without the ego vehicle present, thereby capturing how the AV's presence changes others' actions. In addition, a spatio-temporal graph neural network encodes the relations between dynamic entities, enhancing both the predictive accuracy and the decision making of the AV.
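Conceptually, the counterfactual branch reuses the same trajectory predictor on a graph from which the ego node has been removed. A sketch under an assumed interface (the `predictor` callable and its signature are hypothetical):

```python
def counterfactual_predictions(predictor, node_feats, adj, ego_idx):
    """Predict others' trajectories with and without the ego agent.

    predictor(node_feats, adj) -> (N, T, 2) future positions (assumed API).
    Removing the ego amounts to deleting its node and incident edges.
    Note: traj_with must be re-indexed to drop the ego row before any
    per-agent comparison with traj_without.
    """
    traj_with = predictor(node_feats, adj)
    keep = [i for i in range(node_feats.size(0)) if i != ego_idx]
    traj_without = predictor(node_feats[keep], adj[keep][:, keep])
    return traj_with, traj_without
```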

Estimating Interactivity Between Agents

Not all agents in the environment affect the AV's decision making equally. Recognizing this, the framework introduces interactivity estimation, which measures how the predicted trajectories of other agents shift when the AV is present. This metric quantifies the AV's degree of influence over each agent, helping it determine when it must negotiate and when it can safely proceed with less caution.
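One plausible instantiation of such a score, assuming the metric is the average displacement between the two predicted futures (the paper's exact distance measure may differ):

```python
import torch

def interactivity_score(traj_with, traj_without):
    """Average displacement between the two predicted futures for one agent.

    traj_with, traj_without: (T, 2) predicted positions with/without the ego.
    A larger score means the ego's presence changes this agent's motion more.
    """
    return torch.linalg.norm(traj_with - traj_without, dim=-1).mean()
```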

Experimental Validation and Practical Contributions

To validate the approach, the authors built an intersection driving simulator based on the Intelligent Intersection Driver Model (IIDM), simulating a mix of aggressive and conservative drivers as well as pedestrians to reflect real-world complexity. The approach outperforms baselines on standard metrics, including navigation success rate, collision avoidance, and driving efficiency.
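The underlying car-following rule is the well-established Intelligent Driver Model (IDM); the paper's IIDM adapts it to intersection scenarios. A minimal implementation of the standard IDM acceleration, with generic parameter values:

```python
import math

def idm_acceleration(v, v_lead, gap, v0=12.0, T=1.5, a_max=1.5,
                     b=2.0, s0=2.0, delta=4.0):
    """Standard IDM acceleration (Treiber et al., 2000); generic defaults.

    v, v_lead: ego and leader speeds (m/s); gap: bumper-to-bumper gap (m).
    """
    dv = v - v_lead                                       # closing speed
    s_star = s0 + v * T + v * dv / (2 * math.sqrt(a_max * b))
    return a_max * (1 - (v / v0) ** delta - (s_star / max(gap, 1e-6)) ** 2)
```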

The researchers present three primary contributions:

  1. A DRL method enhanced with internal state inference of surrounding drivers, trajectory predictions, and interactivity estimation for more refined autonomous navigation and richer interpretability of the AV's decision-making process.
  2. A sophisticated simulation environment for testing and benchmarking across different traffic scenarios, focusing particularly on a challenging partially controlled intersection.
  3. Demonstrated robustness and state-of-the-art performance: the method outperforms current models in decision making, providing a strong foundation for future advances in AV systems.

These advancements signify a step forward in creating AVs that can seamlessly blend into traffic by understanding the nuanced behaviors of human drivers and dynamically adapting to the environment. As AVs become more widespread, the need for such sophisticated, understandable, and reliable decision-making frameworks will become increasingly critical.
