COLMAP-Free 3D Gaussian Splatting (2312.07504v2)
Abstract: While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses. To relax this constraint, multiple efforts have been made to train Neural Radiance Fields (NeRFs) without pre-processed camera poses. However, the implicit representations of NeRFs make it harder to optimize the 3D structure and the camera poses at the same time. The recently proposed 3D Gaussian Splatting, in contrast, offers new opportunities because of its explicit point cloud representation. This paper leverages both this explicit geometric representation and the continuity of the input video stream to perform novel view synthesis without any Structure-from-Motion (SfM) preprocessing. We process the input frames sequentially and progressively grow the set of 3D Gaussians by taking one input frame at a time, without the need to pre-compute the camera poses. Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes. Our project page is https://oasisyang.github.io/colmap-free-3dgs
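The frame-by-frame processing described in the abstract can be illustrated with a small sketch of how per-frame camera poses might be accumulated along a video. This is not the authors' implementation: the abstract does not say how each pose is obtained, so the sketch assumes that every new frame yields a relative transform with respect to the previous frame, and that poses are 4x4 world-to-camera matrices with the first frame as the world origin. The helper names (`identity`, `matmul`, `yaw_translation`, `accumulate_poses`) are hypothetical, and `yaw_translation` merely stands in for whatever estimator produces the relative transform.

```python
import math
from typing import List

Mat4 = List[List[float]]


def identity() -> Mat4:
    """4x4 identity: the first frame is taken as the world origin (assumption)."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a: Mat4, b: Mat4) -> Mat4:
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def yaw_translation(yaw_rad: float, tx: float) -> Mat4:
    """Hypothetical stand-in for an estimated relative pose:
    a rotation about the y axis followed by a shift along x."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, 0.0, s, tx],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def accumulate_poses(relative_poses: List[Mat4]) -> List[Mat4]:
    """Chain relative transforms, one frame at a time.

    Assumed convention: relative_poses[t-1] maps camera (t-1) coordinates
    to camera t coordinates, so pose_t = rel_t @ pose_(t-1).
    """
    poses = [identity()]
    for rel in relative_poses:
        poses.append(matmul(rel, poses[-1]))
    return poses


# Two consecutive frames, each rotated 45 degrees and shifted 1 unit
# relative to its predecessor.
poses = accumulate_poses([yaw_translation(math.pi / 4, 1.0)] * 2)
```

Because each pose is built from the previous one, errors in any relative estimate propagate to all later frames; the sketch only shows the bookkeeping, not how such drift would be corrected.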