
AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis (2402.17483v1)

Published 27 Feb 2024 in cs.CV

Abstract: Neural implicit fields have become a de facto standard in novel view synthesis. Recently, several methods have explored fusing multiple modalities within a single field, aiming to share implicit features across modalities to enhance reconstruction performance. However, these modalities often exhibit misaligned behaviors: optimizing for one modality, such as LiDAR, can adversely affect another, like camera performance, and vice versa. In this work, we conduct comprehensive analyses of the multimodal implicit field for LiDAR-camera joint synthesis, revealing that the underlying issue lies in the misalignment of different sensors. We further introduce AlignMiF, a geometrically aligned multimodal implicit field with two proposed modules: Geometry-Aware Alignment (GAA) and Shared Geometry Initialization (SGI). These modules effectively align the coarse geometry across modalities, significantly enhancing the fusion of LiDAR and camera data. Through extensive experiments across various datasets and scenes, we demonstrate the effectiveness of our approach in facilitating better interaction between LiDAR and camera modalities within a unified neural field. Specifically, AlignMiF achieves remarkable improvements over recent implicit fusion methods (+2.01 and +3.11 image PSNR on the KITTI-360 and Waymo datasets) and consistently surpasses single-modality performance (13.8% and 14.2% reduction in LiDAR Chamfer Distance on the respective datasets).
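The abstract reports gains in two standard metrics: image PSNR (higher is better) and LiDAR Chamfer Distance (lower is better). As a reminder of what these metrics measure, here is a minimal NumPy sketch of both; this is an illustrative implementation of the generic definitions, not the paper's evaluation code.

```python
import numpy as np

def psnr(img_a, img_b, max_val=1.0):
    # Peak signal-to-noise ratio between two images with values in [0, max_val].
    # Higher PSNR means the rendered image is closer to the ground truth.
    mse = np.mean((img_a - img_b) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def chamfer_distance(p, q):
    # Symmetric Chamfer Distance between point clouds p (N, 3) and q (M, 3),
    # via brute-force nearest neighbours (fine for small clouds; real
    # evaluations typically use a KD-tree or GPU implementation).
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

For example, a uniform error of 0.1 on images in [0, 1] gives an MSE of 0.01 and hence a PSNR of 20 dB, so the reported +2.01 and +3.11 dB gains correspond to a substantial reduction in pixel-wise error.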

