DeFlow: Decoder of Scene Flow Network in Autonomous Driving (2401.16122v1)
Abstract: Scene flow estimation determines a scene's 3D motion field by predicting the motion of its points, a capability that aids many tasks in autonomous driving. To run in real time on large-scale point clouds, many networks voxelize the input into a pseudo-image, but voxelization often discards point-specific features, which are then difficult to recover for scene flow tasks. Our paper introduces DeFlow, which transitions from voxel-based features back to point features using Gated Recurrent Unit (GRU) refinement. To further improve scene flow estimation, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task show that DeFlow achieves state-of-the-art results on large-scale point cloud data, with better performance and efficiency than competing methods. The code is open-sourced at https://github.com/KTH-RPL/deflow.
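To make the core idea concrete, here is a minimal PyTorch sketch of a GRU-based voxel-to-point decoder in the spirit the abstract describes: each point gathers the coarse feature of the voxel it fell into, and a GRU iteratively refines that feature with the point-specific cue that voxelization discarded. The class name `GRUPointDecoder`, the layer sizes, and the iteration count are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of voxel-to-point GRU refinement; shapes and sizes
# are assumptions, not the DeFlow authors' actual architecture.
import torch
import torch.nn as nn


class GRUPointDecoder(nn.Module):
    """Refine per-point features from voxel (pseudo-image) features with a GRU."""

    def __init__(self, voxel_dim: int = 64, point_dim: int = 32, iters: int = 4):
        super().__init__()
        self.iters = iters
        # Per-point encoder for raw point attributes (x, y, z).
        self.point_encoder = nn.Sequential(nn.Linear(3, point_dim), nn.ReLU())
        # GRU cell: the hidden state starts from the coarse voxel feature;
        # the input is the point-specific feature lost during voxelization.
        self.gru = nn.GRUCell(input_size=point_dim, hidden_size=voxel_dim)
        # Regress a 3D flow vector per point from the refined state.
        self.flow_head = nn.Linear(voxel_dim, 3)

    def forward(self, points: torch.Tensor, voxel_feats: torch.Tensor,
                voxel_idx: torch.Tensor) -> torch.Tensor:
        # points:      (N, 3) raw point coordinates
        # voxel_feats: (V, voxel_dim) features from the pseudo-image backbone
        # voxel_idx:   (N,) index of the voxel each point falls into
        h = voxel_feats[voxel_idx]          # (N, voxel_dim), gathered per point
        x = self.point_encoder(points)      # (N, point_dim), point-specific cue
        for _ in range(self.iters):         # iterative GRU refinement
            h = self.gru(x, h)
        return self.flow_head(h)            # (N, 3) per-point scene flow
```

The abstract also mentions a loss that handles the imbalance between static and dynamic points. The sketch below shows one common way to realize that idea, averaging the endpoint error separately over the two groups so the far more numerous static points cannot dominate; the speed threshold and the equal per-group weighting are assumptions, not the paper's stated formulation.

```python
import torch


def balanced_flow_loss(pred: torch.Tensor, gt: torch.Tensor,
                       static_thresh: float = 0.5) -> torch.Tensor:
    # pred, gt: (N, 3) predicted and ground-truth flow vectors.
    err = torch.linalg.norm(pred - gt, dim=-1)            # per-point endpoint error
    dynamic = torch.linalg.norm(gt, dim=-1) > static_thresh
    loss = err[dynamic].mean() if dynamic.any() else err.new_zeros(())
    if (~dynamic).any():
        loss = loss + err[~dynamic].mean()                # equal weight per group
    return loss
```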