GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis (2312.11458v2)

Published 18 Dec 2023 in cs.CV

Abstract: We propose a method that achieves state-of-the-art rendering quality and efficiency on monocular dynamic scene reconstruction using deformable 3D Gaussians. Implicit deformable representations commonly model motion with a canonical space and time-dependent backward-warping deformation field. Our method, GauFRe, uses a forward-warping deformation to explicitly model non-rigid transformations of scene geometry. Specifically, we propose a template set of 3D Gaussians residing in a canonical space, and a time-dependent forward-warping deformation field to model dynamic objects. Additionally, we tailor a 3D Gaussian-specific static component supported by an inductive bias-aware initialization approach which allows the deformation field to focus on moving scene regions, improving the rendering of complex real-world motion. The differentiable pipeline is optimized end-to-end with a self-supervised rendering loss. Experiments show our method achieves competitive results and higher efficiency than both previous state-of-the-art NeRF and Gaussian-based methods. For real-world scenes, GauFRe can train in ~20 mins and offer 96 FPS real-time rendering on an RTX 3090 GPU. Project website: https://lynl7130.github.io/gaufre/index.html

Summary

  • The paper introduces GauFRe, employing a time-dependent MLP to deform canonical Gaussians for dynamic scene reconstruction.
  • It separates dynamic and static regions into distinct Gaussian sets, improving accuracy and focusing deformation capacity on moving content.
  • Results show that GauFRe attains state-of-the-art quality with faster optimization and real-time rendering for interactive applications.

Introduction

Reconstructing 3D scenes from 2D images poses numerous challenges, particularly when the scene is dynamic and the only input is footage from a single moving camera. Previous methods address this problem with a diverse range of techniques, each carrying its own trade-offs among reconstruction quality, optimization speed, and rendering speed. GauFRe extends the efficiency of Gaussian splatting with deformable 3D Gaussians that accommodate scene motion, offering a favorable balance of high-quality reconstruction and real-time rendering.

Dynamic Scene Reconstruction Methodology

GauFRe differentiates itself by using a multi-layer perceptron (MLP) to define a time-dependent deformation field that warps a canonical set of Gaussians to represent motion and deformation within a scene. The method also acknowledges that natural scenes often contain large static regions: alongside the deformable dynamic Gaussians, a separate static set is left undeformed, so the MLP can concentrate its capacity on the moving parts of the scene. Both sets are optimized jointly with a self-supervised rendering loss, which improves final image quality while directing computation to where motion actually occurs, as sketched below.
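The following is a minimal PyTorch-style sketch of this idea. Everything here is an assumption for illustration rather than the authors' implementation: the names `DeformationField` and `deform_scene`, the layer sizes, and the use of raw (x, y, z, t) inputs without positional encoding are placeholders, and the real method may parameterize the offsets differently.

```python
import torch
import torch.nn as nn

class DeformationField(nn.Module):
    """Time-conditioned MLP predicting per-Gaussian offsets for position,
    rotation (quaternion delta), and scale. Sizes are illustrative assumptions."""

    def __init__(self, in_dim=4, hidden=256, depth=6):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, hidden), nn.ReLU(inplace=True)]
            d = hidden
        self.trunk = nn.Sequential(*layers)
        self.head_xyz = nn.Linear(hidden, 3)    # position offset
        self.head_rot = nn.Linear(hidden, 4)    # rotation offset
        self.head_scale = nn.Linear(hidden, 3)  # (log-)scale offset

    def forward(self, xyz, t):
        # xyz: (N, 3) canonical positions; t: scalar time in [0, 1]
        t_col = torch.full((xyz.shape[0], 1), float(t),
                           device=xyz.device, dtype=xyz.dtype)
        h = self.trunk(torch.cat([xyz, t_col], dim=-1))
        return self.head_xyz(h), self.head_rot(h), self.head_scale(h)


def deform_scene(static_g, dynamic_g, field, t):
    """Forward-warp only the dynamic Gaussians; the static set passes through
    unchanged. Gaussians are dicts of tensors keyed by attribute name."""
    dxyz, drot, dscale = field(dynamic_g["xyz"], t)
    warped = {
        "xyz": dynamic_g["xyz"] + dxyz,
        "rot": dynamic_g["rot"] + drot,
        "scale": dynamic_g["scale"] + dscale,
    }
    # Concatenate static and warped dynamic Gaussians before rasterization.
    # (Opacity and color attributes, omitted here, would be concatenated unchanged.)
    return {key: torch.cat([static_g[key], warped[key]], dim=0) for key in warped}
```

The design point mirrored here is that only the dynamic set passes through the field; the static set reaches the rasterizer unchanged, so static regions do not compete for deformation capacity.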

Performance and Validation

When evaluated against several baselines on both synthetic and real-world datasets, GauFRe achieves quality on par with state-of-the-art alternatives while offering faster optimization and real-time rendering: on real-world scenes it trains in roughly 20 minutes and renders at 96 FPS on an RTX 3090 GPU. These attributes suggest notable advantages in scenarios requiring rapid turnaround or interactive use. The deformation of Gaussians, coupled with the separate static Gaussian set, delivers higher-quality reconstructions at lower computational expense.
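Because the pipeline is differentiable end-to-end, optimization reduces to repeatedly rendering a training frame and backpropagating a photometric rendering loss. The sketch below, reusing `deform_scene` from the earlier snippet, shows one such step under assumed interfaces: `rasterize` stands in for a differentiable Gaussian-splatting renderer, the plain L1 loss is a simplification of whatever loss terms the paper actually uses, and the `frame` dictionary keys are invented for illustration.

```python
def l1_loss(pred, gt):
    # Simple photometric L1 loss between rendered and ground-truth images.
    return (pred - gt).abs().mean()

def train_step(static_g, dynamic_g, field, rasterize, optimizer, frame):
    """One self-supervised optimization step on a single training frame.
    `rasterize` is a stand-in for a differentiable Gaussian-splatting renderer;
    its exact interface is an assumption made for this sketch."""
    scene = deform_scene(static_g, dynamic_g, field, frame["time"])
    pred = rasterize(scene, frame["camera"])   # differentiable render at this viewpoint
    loss = l1_loss(pred, frame["image"])       # photometric supervision only, no 3D labels
    optimizer.zero_grad()
    loss.backward()                            # gradients reach the MLP and both Gaussian sets
    optimizer.step()                           # (assuming their tensors are in the optimizer)
    return loss.item()
```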

Conclusion on GauFRe's Implications

GauFRe makes a significant contribution: it provides a tool for rapid, high-quality reconstruction of dynamic scenes from monocular video, and it opens the door to applications in virtual reality, gaming, and video editing where real-world dynamic events must be recreated and manipulated with high fidelity and efficiency. The dynamic/static separation is a particularly effective design choice, focusing processing on movement within a scene and suggesting room for further adaptations and extensions of the method.
