End-to-end Autonomous Driving: Challenges and Frontiers (2306.16927v3)

Published 29 Jun 2023 in cs.RO, cs.AI, cs.CV, and cs.LG

Abstract: The autonomous driving community has witnessed a rapid growth in approaches that embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle motion plans, instead of concentrating on individual tasks such as detection and motion prediction. End-to-end systems, in comparison to modular pipelines, benefit from joint feature optimization for perception and planning. This field has flourished due to the availability of large-scale datasets, closed-loop evaluation, and the increasing need for autonomous driving algorithms to perform effectively in challenging scenarios. In this survey, we provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving. We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others. Additionally, we discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework. We maintain an active repository that contains up-to-date literature and open-source projects at https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving.

End-to-end Autonomous Driving: Challenges and Frontiers

The paper "End-to-end Autonomous Driving: Challenges and Frontiers" presents a comprehensive analysis of the emerging domain of end-to-end autonomous driving systems. These systems integrate raw sensor inputs to generate driving actions, contrasting with traditional modular architectures that separately handle perception, prediction, and planning tasks. This survey explores methodologies, challenges, and potential developments in designing these complex systems by reviewing over 250 related papers.

Key Methodologies

The authors categorize end-to-end driving approaches into two primary learning paradigms: imitation learning (IL) and reinforcement learning (RL). IL methods, such as behavior cloning, learn a driving policy from expert demonstrations; despite their simplicity, they suffer from issues such as covariate shift and causal confusion. RL techniques, though data-intensive, offer the flexibility of learning through interaction with the environment. Hybrid approaches that combine IL with RL are also emerging, leveraging prior knowledge from demonstrations to expedite policy refinement.
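
To make the imitation-learning side concrete, below is a minimal behavior-cloning sketch in PyTorch: a toy policy regresses future waypoints from a camera image under an L1 loss against expert demonstrations. The network, tensor shapes, and hyperparameters are illustrative placeholders, not any specific method from the survey.

```python
import torch
import torch.nn as nn

class DrivingPolicy(nn.Module):
    """Toy end-to-end policy: camera image -> short waypoint plan."""
    def __init__(self, num_waypoints=4):
        super().__init__()
        self.backbone = nn.Sequential(                # stand-in for a real encoder
            nn.Conv2d(3, 32, 5, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_waypoints * 2)  # (x, y) per waypoint
        self.num_waypoints = num_waypoints

    def forward(self, image):
        return self.head(self.backbone(image)).view(-1, self.num_waypoints, 2)

policy = DrivingPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One behavior-cloning step: regress the expert's future waypoints
# directly from the raw sensor input (dummy data for illustration).
images = torch.randn(8, 3, 128, 128)
expert_waypoints = torch.randn(8, 4, 2)

pred = policy(images)
loss = nn.functional.l1_loss(pred, expert_waypoints)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Training purely offline on logged demonstrations like this is exactly what exposes the policy to the covariate shift noted above, which motivates interactive data-aggregation schemes such as DAgger.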

Benchmarks and Evaluation

The paper emphasizes the significance of benchmarking through both closed-loop evaluation in simulation and open-loop evaluation on real-world datasets. Simulators offer controlled environments for testing diverse scenarios but may not capture real-world variability; rich real-world datasets such as nuScenes and Waymo complement them by providing logged driving data for assessing system robustness.
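
To illustrate the open-loop side of this evaluation, the snippet below computes the average and final displacement errors (ADE/FDE) between a planned trajectory and the logged human trajectory, the style of metric commonly reported on open-loop benchmarks; the trajectories are dummy data.

```python
import numpy as np

def displacement_errors(pred_traj, gt_traj):
    """ADE and FDE between a predicted plan and the logged ground truth.
    Both arrays have shape (T, 2): (x, y) waypoints in meters."""
    dists = np.linalg.norm(pred_traj - gt_traj, axis=-1)  # per-step L2 error
    return dists.mean(), dists[-1]

# Dummy 3-second plans sampled at 0.5 s (6 waypoints each).
pred = np.array([[1.0, 0.0], [2.1, 0.1], [3.0, 0.2],
                 [4.2, 0.4], [5.1, 0.5], [6.3, 0.8]])
gt = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.1],
               [4.0, 0.2], [5.0, 0.3], [6.0, 0.5]])

ade, fde = displacement_errors(pred, gt)
print(f"ADE: {ade:.2f} m, FDE: {fde:.2f} m")
```

Closed-loop evaluation, by contrast, rolls the policy out in a simulator and scores the resulting driving behavior itself (e.g., route completion and infractions), which cannot be reduced to a comparison against a single logged trajectory.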

Challenges

The paper provides an in-depth discussion on several critical challenges:

  • Multi-modal Fusion: Integrating data from heterogeneous sensors such as cameras and LiDAR remains complex. Effective feature extraction and alignment across modalities is necessary for improving planning accuracy (a minimal fusion sketch follows this list).
  • Robustness and Generalization: Issues like domain adaptation and covariate shift are pivotal. Systems must be trained to generalize across different driving environments, weather conditions, and geographic regions.
  • Causal Confusion: Models can latch onto spurious correlations, such as the ego vehicle's own past speed, leading to incorrect predictions. Addressing this requires dedicated architectural and training solutions.
  • Interpretability: Enhancing the transparency of these systems is essential, particularly through attention mechanisms, auxiliary tasks, and cost learning that provide insight into the decision-making process.
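
The fusion sketch referenced in the first challenge above is a minimal cross-attention block in which camera tokens attend to LiDAR tokens, loosely in the spirit of transformer-based fusion approaches such as TransFuser. The dimensions, token counts, and module itself are illustrative assumptions rather than a reproduction of any surveyed architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Camera tokens query LiDAR tokens via cross-attention,
    then the fused tokens are refined by a small feed-forward block."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, cam_tokens, lidar_tokens):
        # cam_tokens: (B, N_cam, dim); lidar_tokens: (B, N_lidar, dim)
        attended, _ = self.attn(query=cam_tokens, key=lidar_tokens,
                                value=lidar_tokens)
        fused = self.norm1(cam_tokens + attended)    # residual + norm
        return self.norm2(fused + self.ffn(fused))   # feed-forward refinement

fusion = CrossModalFusion()
cam_tokens = torch.randn(2, 64, 256)      # dummy image feature tokens
lidar_tokens = torch.randn(2, 128, 256)   # dummy BEV LiDAR feature tokens
fused = fusion(cam_tokens, lidar_tokens)  # (2, 64, 256), passed to the planner
```

In a full system the fused tokens would feed a planning head, and real fusion architectures additionally inject positional or geometric encodings so that camera and LiDAR features are spatially aligned.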

Implications and Future Developments

The paper outlines potential advancements that hold promise for the future of autonomous driving:

  • Large Foundation Models: Inspiration can be drawn from developments in large language models and large-scale visual pre-training to improve the capabilities of driving models.
  • Data-driven Simulation and Synthesis: Enhancing the realism and diversity of simulation environments will be crucial for training robust systems.
  • Zero-shot and Few-shot Learning: Enabling models to adapt to unseen scenarios with little or no task-specific data is crucial for real-world deployment.
  • Vehicle-to-Everything (V2X) Communication: Incorporating V2X data can enhance situational awareness and decision-making, especially in complex traffic scenarios (a rough aggregation sketch follows this list).
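
The V2X aggregation sketch referenced above is deliberately naive: the ego vehicle averages bird's-eye-view feature maps received from nearby agents (assumed here to be already warped into the ego frame) and mixes them with its own. Real cooperative-perception systems such as V2VNet use learned, graph-based aggregation; everything below is an illustrative assumption.

```python
import torch
import torch.nn as nn

class NaiveV2XAggregator(nn.Module):
    """Fuse the ego BEV feature map with maps shared by other agents.
    Assumes the shared maps were already warped into the ego frame."""
    def __init__(self, channels=64):
        super().__init__()
        # 1x1 conv mixes the averaged multi-agent features back into ego space.
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, ego_bev, received_bevs):
        # ego_bev: (C, H, W); received_bevs: list of (C, H, W) tensors.
        if received_bevs:
            shared = torch.stack(received_bevs).mean(dim=0)  # simple average
        else:
            shared = torch.zeros_like(ego_bev)               # no V2X messages
        fused = torch.cat([ego_bev, shared], dim=0)          # (2C, H, W)
        return self.mix(fused.unsqueeze(0)).squeeze(0)       # back to (C, H, W)

aggregator = NaiveV2XAggregator()
ego = torch.randn(64, 100, 100)
received = [torch.randn(64, 100, 100) for _ in range(2)]
fused_bev = aggregator(ego, received)  # (64, 100, 100), ready for planning
```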

Conclusion

The paper emphasizes that while significant strides have been made in end-to-end autonomous driving, many challenges remain. Collaborative research efforts are needed to address these challenges through innovative algorithms, high-quality datasets, and comprehensive simulation environments. This survey acts as a guiding document for researchers looking to further the field of autonomous driving systems, paving the way for advancements toward safer and more reliable self-driving vehicles.

Authors (6)
  1. Li Chen
  2. Penghao Wu
  3. Kashyap Chitta
  4. Bernhard Jaeger
  5. Andreas Geiger
  6. Hongyang Li