HPL-ViT: A Unified Perception Framework for Heterogeneous Parallel LiDARs in V2V (2309.15572v1)
Abstract: To develop the next generation of intelligent LiDARs, we propose a novel framework of parallel LiDARs and construct a hardware prototype on our experimental platform, DAWN (Digital Artificial World for Natural). The framework emphasizes the tight integration of physical and digital space in LiDAR systems, with networking as one of its core supported features. In autonomous driving, V2V (Vehicle-to-Vehicle) technology enables efficient information sharing between agents, which significantly promotes the development of LiDAR networks. However, current research assumes an idealized setting in which all vehicles are equipped with identical LiDARs, ignoring the diversity of LiDAR categories and operating frequencies. In this paper, we first use OpenCDA and RLS (Realistic LiDAR Simulation) to construct a novel heterogeneous LiDAR dataset named OPV2V-HPL. We then present HPL-ViT, a pioneering architecture designed for robust feature fusion in heterogeneous and dynamic scenarios. It uses a graph-attention Transformer to extract domain-specific features for each agent, coupled with a cross-attention mechanism for the final fusion. Extensive experiments on OPV2V-HPL demonstrate that HPL-ViT achieves state-of-the-art (SOTA) performance in all settings and exhibits strong generalization capabilities.
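The cross-attention fusion step mentioned in the abstract — letting the ego vehicle's features attend over features broadcast by other (possibly heterogeneous) agents — can be sketched as follows. This is a minimal single-head illustration in numpy, not the paper's implementation: the function name, the absence of learned projections, and the toy shapes are all assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(ego_feat, neighbor_feats):
    """Fuse per-agent features into the ego frame via cross-attention.

    ego_feat:       (N, d) ego-agent feature tokens (queries).
    neighbor_feats: list of (M_i, d) feature tokens from other agents
                    (keys/values), e.g. from LiDARs of different types
                    or operating frequencies.
    Returns fused (N, d) ego features.
    """
    d = ego_feat.shape[-1]
    # Stack ego and neighbor tokens as the key/value set.
    kv = np.concatenate([ego_feat] + list(neighbor_feats), axis=0)  # (sum M, d)
    # Scaled dot-product attention, single head, no learned projections.
    scores = ego_feat @ kv.T / np.sqrt(d)   # (N, sum M)
    attn = softmax(scores, axis=-1)         # rows sum to 1
    return attn @ kv                        # (N, d)

rng = np.random.default_rng(0)
ego = rng.normal(size=(4, 8))
others = [rng.normal(size=(6, 8)), rng.normal(size=(5, 8))]
fused = cross_attention_fuse(ego, others)
print(fused.shape)  # (4, 8)
```

In the full architecture, each agent's tokens would first pass through the graph-attention Transformer to extract domain-specific features before this fusion stage; here the inputs are random placeholders.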