
Pixel State Value Network for Combined Prediction and Planning in Interactive Environments (2310.07706v1)

Published 11 Oct 2023 in cs.RO and cs.AI

Abstract: Automated vehicles operating in urban environments have to interact reliably with other traffic participants. Planning algorithms often rely on separate prediction modules that forecast the probabilistic, multi-modal, and interactive behavior of surrounding objects. Designing prediction and planning as two separate modules introduces significant challenges, particularly due to their interdependence. This work proposes a deep learning methodology that combines prediction and planning. A conditional GAN with a U-Net architecture is trained to predict two high-resolution image sequences: explicit motion predictions, used mainly to train context understanding, and pixel state values suitable for planning, which encode kinematic reachability, object dynamics, safety, and driving comfort. The model can be trained offline on target images rendered by a sampling-based model-predictive planner, leveraging real-world driving data. Our results demonstrate intuitive behavior in complex situations, such as lane changes amidst conflicting objectives.
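The pixel state-value output is what makes this representation directly usable for planning: a candidate trajectory can be scored by accumulating the predicted values along its rasterized path in the image. Below is a minimal sketch of that idea in Python/NumPy. It is not the authors' implementation; the function names, the (row, col) pixel convention, and the out-of-bounds handling are illustrative assumptions, and the trajectories are assumed to be already rasterized to pixel coordinates (e.g. with Bresenham's line algorithm).

```python
# Minimal sketch (not the authors' code): ranking sampled trajectories
# against a predicted pixel state-value image, as a sampling-based
# planner might. `value_map` is assumed to be an HxW array produced by
# the network, with higher values meaning more desirable states.
import numpy as np

def score_trajectory(value_map: np.ndarray, pixels: np.ndarray) -> float:
    """Sum the predicted state values along a rasterized trajectory.

    value_map: (H, W) pixel state values encoding reachability,
               object dynamics, safety, and comfort.
    pixels:    (N, 2) integer (row, col) coordinates of the trajectory.
    """
    rows, cols = pixels[:, 0], pixels[:, 1]
    h, w = value_map.shape
    # Trajectories leaving the image are treated as invalid (assumption).
    if (rows < 0).any() or (rows >= h).any() or (cols < 0).any() or (cols >= w).any():
        return float("-inf")
    return float(value_map[rows, cols].sum())

def select_best(value_map: np.ndarray, candidates: list[np.ndarray]) -> int:
    """Return the index of the highest-scoring candidate trajectory."""
    scores = [score_trajectory(value_map, traj) for traj in candidates]
    return int(np.argmax(scores))
```

Since the paper's target images are rendered by a sampling-based model-predictive planner, a scoring rule of this shape closes the loop: the network predicts the value image offline-trained from real driving data, and the planner ranks its sampled trajectories against it at runtime.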

Authors (4)
  1. Sascha Rosbach (4 papers)
  2. Stefan M. Leupold (1 paper)
  3. Simon Großjohann (4 papers)
  4. Stefan Roth (97 papers)
