Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving (2311.16754v3)

Published 28 Nov 2023 in cs.CV and cs.AI

Abstract: Collaborative perception has recently gained significant attention in autonomous driving, improving perception quality by enabling the exchange of additional information among vehicles. However, deploying collaborative perception systems can lead to domain shifts due to diverse environmental conditions and data heterogeneity among connected and autonomous vehicles (CAVs). To address these challenges, we propose a unified domain generalization framework to be utilized during the training and inference stages of collaborative perception. In the training phase, we introduce an Amplitude Augmentation (AmpAug) method to augment low-frequency image variations, broadening the model's ability to learn across multiple domains. We also employ a meta-consistency training scheme to simulate domain shifts, optimizing the model with a carefully designed consistency loss to acquire domain-invariant representations. In the inference phase, we introduce an intra-system domain alignment mechanism to reduce or potentially eliminate the domain discrepancy among CAVs prior to inference. Extensive experiments substantiate the effectiveness of our method in comparison with the existing state-of-the-art works.
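
The abstract does not spell out AmpAug's mechanics, but "augment[ing] low-frequency image variations" suggests a Fourier-domain amplitude manipulation in the spirit of Fourier Domain Adaptation (FDA; Yang and Soatto, CVPR 2020). The sketch below is a minimal, hypothetical illustration of that general idea, not the paper's implementation: the low-frequency amplitude spectrum of a source image is blended with that of a reference image from another domain, while the phase spectrum, which carries scene structure, is left untouched. The function name and the `beta` (window size) and `alpha` (blend weight) parameters are illustrative assumptions.

```python
import numpy as np

def amplitude_augment(img, ref, beta=0.1, alpha=None, rng=None):
    """Blend the low-frequency amplitude of `img` with that of `ref`.

    img, ref: float arrays of shape (H, W, C) with values in [0, 1].
    beta: fraction of the spectrum (per side) treated as "low frequency".
    alpha: blend weight for the reference amplitude; random in [0, 1] if None.
    """
    rng = rng or np.random.default_rng()
    if alpha is None:
        alpha = rng.uniform(0.0, 1.0)

    # 2-D FFT per channel; shift the zero-frequency bin to the center.
    fft_img = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    fft_ref = np.fft.fftshift(np.fft.fft2(ref, axes=(0, 1)), axes=(0, 1))

    amp_img, pha_img = np.abs(fft_img), np.angle(fft_img)
    amp_ref = np.abs(fft_ref)

    # Blend only a centered low-frequency window of the amplitude spectrum;
    # the phase (scene structure) is preserved unchanged.
    h, w = img.shape[:2]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    amp_aug = amp_img.copy()
    amp_aug[ch - bh:ch + bh, cw - bw:cw + bw] = (
        (1 - alpha) * amp_img[ch - bh:ch + bh, cw - bw:cw + bw]
        + alpha * amp_ref[ch - bh:ch + bh, cw - bw:cw + bw]
    )

    # Recombine the blended amplitude with the original phase and invert.
    fft_aug = amp_aug * np.exp(1j * pha_img)
    out = np.fft.ifft2(np.fft.ifftshift(fft_aug, axes=(0, 1)), axes=(0, 1)).real
    return np.clip(out, 0.0, 1.0)
```

Used as an on-the-fly training augmentation, each source image would be paired with a randomly drawn reference from a pool of differently styled images, exposing the segmentation model to varied low-frequency appearance (illumination, weather, sensor tone) without altering the labels.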
