
Instantaneous Perception of Moving Objects in 3D (2405.02781v1)

Published 5 May 2024 in cs.CV

Abstract: The perception of 3D motion of surrounding traffic participants is crucial for driving safety. While existing works primarily focus on general large motions, we contend that the instantaneous detection and quantification of subtle motions is equally important, as they indicate nuances in driving behavior that may be safety critical, such as behaviors near a stop sign or parking positions. We delve into this under-explored task, examining its unique challenges and developing our solution, accompanied by a carefully designed benchmark. Specifically, due to the lack of correspondences between consecutive frames of sparse Lidar point clouds, static objects might appear to be moving, the so-called swimming effect. This intertwines with the true object motion, making accurate estimation ambiguous, especially for subtle motions. To address this, we propose to leverage local occupancy completion of object point clouds to densify the shape cue and mitigate the impact of swimming artifacts. The occupancy completion is learned in an end-to-end fashion together with the detection of moving objects and the estimation of their motion, instantaneously as soon as objects start to move. Extensive experiments demonstrate superior performance compared to standard 3D motion estimation approaches, particularly highlighting our method's specialized treatment of subtle motions.
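
The swimming effect described in the abstract can be illustrated with a toy 1D sketch. This is not the paper's method: the function names, the 4 m edge, the beam counts, and the 0.4 phase offset are all hypothetical choices made for illustration. A static edge is sampled in two frames with slightly shifted beam positions, and a naive nearest-neighbor match reports a nonzero displacement even though nothing moved. Denser sampling stands in here, very loosely, for the densified shape cue that occupancy completion is meant to provide.

```python
def sample_static_edge(length_m, num_beams, phase_frac):
    """Hit positions of hypothetical Lidar beams on a static 1D edge.

    phase_frac shifts the beam pattern by a fraction of the beam spacing,
    mimicking how consecutive sweeps land on different surface points.
    """
    spacing = length_m / num_beams
    return [(i + phase_frac) * spacing for i in range(num_beams)]


def mean_nearest_neighbor_shift(frame_a, frame_b):
    """Average signed offset from each point in frame_a to its nearest
    neighbor in frame_b (a naive stand-in for correspondence-based motion)."""
    shifts = []
    for xa in frame_a:
        nearest = min(frame_b, key=lambda xb: abs(xb - xa))
        shifts.append(nearest - xa)
    return sum(shifts) / len(shifts)


def apparent_motion(length_m, num_beams, phase_frac):
    """Apparent displacement of a static edge between two sweeps."""
    frame_t0 = sample_static_edge(length_m, num_beams, 0.0)
    frame_t1 = sample_static_edge(length_m, num_beams, phase_frac)
    return mean_nearest_neighbor_shift(frame_t0, frame_t1)


# Assumed values: a 4 m edge, sparse (8 beams) versus dense (80 beams) sampling.
sparse_shift = apparent_motion(4.0, 8, 0.4)
dense_shift = apparent_motion(4.0, 80, 0.4)
print(f"sparse: {sparse_shift:.3f} m, dense: {dense_shift:.3f} m")
```

In this toy setting the spurious displacement scales with the beam spacing, so it is comparable in size to a subtle true motion when the sampling is sparse and shrinks as the shape is sampled more densely.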

