
DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Diffusion Model (2311.17456v4)

Published 29 Nov 2023 in cs.CV

Abstract: Scene flow estimation, which aims to predict per-point 3D displacements of dynamic scenes, is a fundamental task in computer vision. However, previous works commonly suffer from unreliable correlation caused by locally constrained search ranges, and struggle with inaccuracy that accumulates through the coarse-to-fine structure. To alleviate these problems, we propose a novel uncertainty-aware scene flow estimation network (DifFlow3D) based on the diffusion probabilistic model. Iterative diffusion-based refinement is designed to enhance correlation robustness and resilience to challenging cases such as dynamics, noisy inputs, and repetitive patterns. To restrain the generation diversity, three key flow-related features are leveraged as conditions in our diffusion model. Furthermore, we develop an uncertainty estimation module within diffusion to evaluate the reliability of the estimated scene flow. Our DifFlow3D achieves state-of-the-art performance, with 24.0% and 29.1% EPE3D reduction on the FlyingThings3D and KITTI 2015 datasets, respectively. Notably, our method achieves an unprecedented millimeter-level accuracy (0.0078 m in EPE3D) on the KITTI dataset. Additionally, our diffusion-based refinement paradigm can be readily integrated as a plug-and-play module into existing scene flow networks, significantly increasing their estimation accuracy. Code is released at https://github.com/IRMVLab/DifFlow3D.
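
The abstract reports its results in EPE3D, the 3D end-point error, commonly defined as the mean Euclidean distance between predicted and ground-truth per-point flow vectors. The following is a minimal sketch of that metric and of a relative reduction between two scores. The function names `epe3d` and `relative_reduction` and all sample values are hypothetical and chosen only for illustration; this is not taken from the released DifFlow3D code.

```python
import math


def epe3d(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    Both arguments are equal-length sequences of (dx, dy, dz) tuples in meters.
    """
    if len(pred_flow) != len(gt_flow) or len(pred_flow) == 0:
        raise ValueError("flows must be non-empty and of equal length")
    total = 0.0
    for p, g in zip(pred_flow, gt_flow):
        total += math.dist(p, g)  # per-point end-point error
    return total / len(pred_flow)


def relative_reduction(baseline, improved):
    """Fractional decrease of an error metric relative to a baseline value."""
    return (baseline - improved) / baseline


# Toy values (assumed, not from the paper): two points, one with a 3 mm error.
pred = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.0)]
gt = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.003)]
score = epe3d(pred, gt)
```

A percentage such as those quoted in the abstract corresponds to `relative_reduction(baseline, improved) * 100`, where the baseline is the EPE3D of the method being compared against.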
