QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving (2404.01486v1)

Published 1 Apr 2024 in cs.RO, cs.AI, cs.CV, and cs.LG

Abstract: A self-driving vehicle must understand its environment to determine the appropriate action. Traditional autonomy systems rely on object detection to find the agents in the scene. However, object detection assumes a discrete set of objects and loses information about uncertainty, so any errors compound when predicting the future behavior of those agents. Alternatively, dense occupancy grid maps have been utilized to understand free-space. However, predicting a grid for the entire scene is wasteful since only certain spatio-temporal regions are reachable and relevant to the self-driving vehicle. We present a unified, interpretable, and efficient autonomy framework that moves away from cascading modules that first perceive, then predict, and finally plan. Instead, we shift the paradigm to have the planner query occupancy at relevant spatio-temporal points, restricting the computation to those regions of interest. Exploiting this representation, we evaluate candidate trajectories around key factors such as collision avoidance, comfort, and progress for safety and interpretability. Our approach achieves better highway driving quality than the state-of-the-art in high-fidelity closed-loop simulations.
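The abstract's core idea, querying occupancy only at the spatio-temporal points a candidate trajectory would visit and then scoring trajectories on collision, comfort, and progress, can be illustrated with a minimal sketch. This is not the paper's implementation: the occupancy query here is a hand-coded stand-in for the learned model, and all function names, weights, and the toy obstacle are illustrative assumptions.

```python
import numpy as np

def query_occupancy(points):
    # Stand-in for the learned occupancy model: returns an occupancy
    # probability in [0, 1] at each queried (x, y, t) point.
    # Toy scene: a static obstacle occupying x in [20, 30], |y| <= 2.
    x, y = points[:, 0], points[:, 1]
    return ((x >= 20) & (x <= 30) & (np.abs(y) <= 2)).astype(float)

def score_trajectory(traj, dt=0.5, w_col=10.0, w_comfort=1.0, w_prog=1.0):
    # traj: (T, 3) array of (x, y, t) samples along one candidate.
    # Occupancy is queried only at these T points, not over a dense grid.
    collision = query_occupancy(traj).sum()            # penalize occupied points
    accel = np.diff(traj[:, :2], n=2, axis=0) / dt**2  # finite-difference accel.
    comfort = np.square(accel).sum()                   # penalize harsh motion
    progress = traj[-1, 0] - traj[0, 0]                # reward forward progress
    return w_col * collision + w_comfort * comfort - w_prog * progress

# Two candidates: drive straight through the obstacle, or swerve around it.
t = np.arange(10) * 0.5
straight = np.stack([t * 8.0, np.zeros_like(t), t], axis=1)
swerve = np.stack([t * 8.0, 3.5 * np.sin(np.pi * t / t[-1]), t], axis=1)
best = min([straight, swerve], key=score_trajectory)   # swerve scores lower
```

The interpretability claim maps onto the separate cost terms: each candidate's score decomposes into named collision, comfort, and progress components that can be inspected individually.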
