LiDAR Data Synthesis with Denoising Diffusion Probabilistic Models (2309.09256v2)

Published 17 Sep 2023 in cs.CV and cs.RO

Abstract: Generative modeling of 3D LiDAR data is an emerging task with promising applications for autonomous mobile robots, such as scalable simulation, scene manipulation, and sparse-to-dense completion of LiDAR point clouds. While existing approaches have demonstrated the feasibility of image-based LiDAR data generation using deep generative models, they still struggle with fidelity and training stability. In this work, we present R2DM, a novel generative model for LiDAR data that can generate diverse and high-fidelity 3D scene point clouds based on the image representation of range and reflectance intensity. Our method is built upon denoising diffusion probabilistic models (DDPMs), which have shown impressive results among generative model frameworks in recent years. To effectively train DDPMs in the LiDAR domain, we first conduct an in-depth analysis of data representation, loss functions, and spatial inductive biases. Leveraging our R2DM model, we also introduce a flexible LiDAR completion pipeline based on the powerful capabilities of DDPMs. We demonstrate that our method surpasses existing methods in the generation task on the KITTI-360 and KITTI-Raw datasets, as well as in the completion task on the KITTI-360 dataset. Our project page can be found at https://kazuto1011.github.io/r2dm.
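As a concrete illustration of the DDPM framework the abstract builds on, the sketch below shows the standard noise-prediction training objective applied to a 2-channel range/reflectance image. This is a minimal sketch of the generic DDPM recipe (Ho et al., 2020), not the paper's exact configuration: the noise schedule, image resolution, and the `model` interface are illustrative assumptions.

```python
# Minimal DDPM training sketch for 2-channel LiDAR range/intensity images.
# The schedule, resolution, and model interface are assumptions for
# illustration, not R2DM's exact settings.
import torch
import torch.nn.functional as F

T = 1000                                    # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)       # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alphas_cumprod.to(x0.device)[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

def ddpm_loss(model, x0):
    """Standard epsilon-prediction objective."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)  # random timestep per sample
    noise = torch.randn_like(x0)
    x_t = q_sample(x0, t, noise)
    eps_pred = model(x_t, t)                # network predicts the added noise
    return F.mse_loss(eps_pred, noise)

# Usage: x0 is a batch of equirectangular LiDAR images with two channels
# (range, reflectance intensity), e.g. shape (B, 2, 64, 1024) for a
# 64-beam sensor; `model` is any noise-prediction network such as a U-Net.
```

The completion pipeline mentioned in the abstract can, in principle, reuse the same trained model: a RePaint-style sampler keeps observed pixels anchored by re-noising the known region at every reverse step while the model fills in the missing region, so no retraining is needed for completion.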
