
Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis (2410.18822v2)

Published 24 Oct 2024 in cs.CV

Abstract: Novel view synthesis from sparse inputs is a vital yet challenging task in 3D computer vision. Previous methods explore 3D Gaussian Splatting with neural priors (e.g., depth priors) as additional supervision, demonstrating promising quality and efficiency compared to NeRF-based methods. However, the neural priors from 2D pretrained models are often noisy and blurry and struggle to precisely guide the learning of radiance fields. In this paper, we propose a novel method for synthesizing novel views from sparse views with Gaussian Splatting that does not require external priors as supervision. Our key idea lies in exploiting the self-supervision inherent in the binocular stereo consistency between each pair of binocular images constructed with disparity-guided image warping. To this end, we additionally introduce a Gaussian opacity constraint that regularizes Gaussian locations and avoids Gaussian redundancy, improving the robustness and efficiency of inferring 3D Gaussians from sparse views. Extensive experiments on the LLFF, DTU, and Blender datasets demonstrate that our method significantly outperforms the state-of-the-art methods.
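
To make the core self-supervision idea concrete, below is a minimal PyTorch sketch of disparity-guided image warping between a binocular pair, a photometric consistency loss, and an entropy-style opacity penalty. This is an illustrative assumption of how such losses are commonly implemented, not the authors' code: the function names, the stereo-baseline-derived disparity input, and the exact form of the opacity regularizer are all hypothetical.

```python
# Illustrative sketch (not the paper's implementation) of binocular
# consistency via disparity-guided warping, plus an opacity regularizer.
import torch
import torch.nn.functional as F

def warp_with_disparity(right_img, disparity):
    """Warp the right-view image into the left view using horizontal disparity.

    right_img: (B, C, H, W) image rendered at a shifted (right) viewpoint.
    disparity: (B, 1, H, W) per-pixel horizontal disparity in pixels, e.g.
               derived from rendered depth and an assumed stereo baseline.
    """
    B, _, H, W = right_img.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, device=right_img.device, dtype=right_img.dtype),
        torch.arange(W, device=right_img.device, dtype=right_img.dtype),
        indexing="ij",
    )
    # Shift the x sampling coordinates by the disparity.
    xs = xs.unsqueeze(0).expand(B, -1, -1) - disparity.squeeze(1)
    ys = ys.unsqueeze(0).expand(B, -1, -1)
    # Normalize to [-1, 1] for grid_sample (x first, then y).
    grid = torch.stack((2 * xs / (W - 1) - 1, 2 * ys / (H - 1) - 1), dim=-1)
    return F.grid_sample(right_img, grid, align_corners=True, padding_mode="border")

def binocular_consistency_loss(left_img, right_img, disparity):
    """L1 photometric loss between the left image and the warped right image."""
    warped = warp_with_disparity(right_img, disparity)
    return (left_img - warped).abs().mean()

def opacity_regularizer(opacities, eps=1e-6):
    """Entropy-style penalty pushing Gaussian opacities toward 0 or 1,
    discouraging redundant semi-transparent Gaussians (an assumed form,
    not necessarily the paper's exact constraint)."""
    o = opacities.clamp(eps, 1 - eps)
    return -(o * o.log() + (1 - o) * (1 - o).log()).mean()
```

In a training loop, the consistency loss would be added to the standard rendering loss for each constructed binocular pair, and the opacity term weighted by a small coefficient.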
