Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views (2403.02063v1)
Abstract: Novel-view synthesis with sparse input views is important for real-world applications like AR/VR and autonomous driving. Recent methods have integrated depth information into NeRFs for sparse-input synthesis, leveraging depth priors for geometric and spatial understanding. However, most existing works overlook inaccuracies in the depth maps and suffer from low time efficiency. To address these issues, we propose a depth-guided robust and fast point cloud fusion NeRF for sparse inputs. We model the radiance field as an explicit voxel grid of features. A point cloud is constructed for each input view and characterized within the voxel grid using matrices and vectors. We accumulate the point clouds of all input views to construct the fused point cloud of the entire scene. Each voxel determines its density and appearance by referring to this scene-level point cloud. Through point cloud fusion and voxel grid fine-tuning, inaccurate depth values are refined or substituted by those from other views. Moreover, our method achieves faster reconstruction and greater compactness through effective vector-matrix decomposition. Experimental results underline the superior performance and time efficiency of our approach compared to state-of-the-art baselines.
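The two core ingredients described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes pinhole intrinsics `K`, camera-to-world poses `c2w`, and shows (a) back-projecting each view's depth map into a world-space point cloud and accumulating the clouds, and (b) reconstructing a dense 3D feature grid from one mode of a TensoRF-style vector-matrix decomposition (the function names and shapes here are illustrative assumptions).

```python
import numpy as np

def depth_to_points(depth, K, c2w):
    """Back-project a per-pixel depth map into a world-space point cloud.

    depth: (H, W) depth along the camera z-axis
    K:     (3, 3) pinhole intrinsics
    c2w:   (4, 4) camera-to-world pose
    returns: (H*W, 3) world-space points
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates (u, v, 1) for every pixel.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Unproject to camera space and scale each ray by its depth.
    cam = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)
    # Transform camera-space points into world space.
    cam_h = np.concatenate([cam, np.ones((cam.shape[0], 1))], axis=1)
    return (c2w @ cam_h.T).T[:, :3]

def fuse_views(depths, Ks, c2ws):
    """Accumulate the per-view point clouds into one fused scene cloud."""
    return np.concatenate(
        [depth_to_points(d, K, p) for d, K, p in zip(depths, Ks, c2ws)], axis=0
    )

def vm_reconstruct(v_z, m_xy):
    """Reconstruct a dense feature grid from one vector-matrix component:
    G[i, j, k] = sum_r m_xy[r, i, j] * v_z[r, k]. A full TensoRF-style
    decomposition sums such components over all three axis pairings; storing
    the factors instead of the dense grid is what yields the compactness.
    """
    return np.einsum('rk,rij->ijk', v_z, m_xy)
```

With identity intrinsics and pose, every point produced by `depth_to_points` simply lands at depth `z` in world space, which makes the sketch easy to sanity-check before plugging in real camera parameters.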