
SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting (2312.13308v2)

Published 20 Dec 2023 in cs.CV

Abstract: Novel view synthesis has shown rapid progress recently, with methods capable of producing increasingly photorealistic results. 3D Gaussian Splatting has emerged as a promising method, producing high-quality renderings of scenes and enabling interactive viewing at real-time frame rates. However, it is limited to static scenes. In this work, we extend 3D Gaussian Splatting to reconstruct dynamic scenes. We model a scene's dynamics using dynamic MLPs, learning deformations from temporally-local canonical representations to per-frame 3D Gaussians. To disentangle static and dynamic regions, tuneable parameters weigh each Gaussian's respective MLP parameters, improving the dynamics modelling of imbalanced scenes. We introduce a sliding window training strategy that partitions the sequence into smaller manageable windows to handle arbitrary length scenes while maintaining high rendering quality. We propose an adaptive sampling strategy to determine appropriate window size hyperparameters based on the scene's motion, balancing training overhead with visual quality. Training a separate dynamic 3D Gaussian model for each sliding window allows the canonical representation to change, enabling the reconstruction of scenes with significant geometric changes. Temporal consistency is enforced using a fine-tuning step with self-supervising consistency loss on randomly sampled novel views. As a result, our method produces high-quality renderings of general dynamic scenes with competitive quantitative performance, which can be viewed in real-time in our dynamic interactive viewer.

Authors (7)
  1. Richard Shaw (25 papers)
  2. Jifei Song (18 papers)
  3. Arthur Moreau (11 papers)
  4. Michal Nazarczuk (9 papers)
  5. Sibi Catley-Chandar (10 papers)
  6. Helisa Dhamo (14 papers)
  7. Eduardo Perez-Pellitero (4 papers)
Citations (6)

Summary

  • The paper introduces SWinGS, an adaptive sliding-window technique that extends 3D Gaussian Splatting to dynamic scene reconstruction.
  • It employs per-window dynamic modelling with tuneable MLPs to separate static and dynamic scene elements and improve rendering quality.
  • The method renders in real time at over 71 FPS while enforcing temporal consistency through a self-supervised fine-tuning step.

Overview

The paper introduces SWinGS (Sliding Windows for Dynamic 3D Gaussian Splatting), a method for novel view synthesis of dynamic scenes. Whereas previous 3D Gaussian Splatting pipelines are restricted to static scenes, SWinGS extends the technique, already noted for its high-quality renderings and interactive frame rates, to reconstruct dynamic scenes for real-time interactive viewing.

Motivation and Challenges

View synthesis techniques built on 3D Gaussian Splatting struggle with dynamic scenes: changing topology, significant geometric modifications, and extended sequences can all lead to blurred results. Contemporary works that do offer dynamic reconstruction often either lack temporal consistency or carry heavy computational loads, making them impractical for real-time applications.

Innovative Approach

The introduced method tackles these challenges with several key innovations:

  1. Adaptive Window Sampling: The sequence is divided into windows of varying length based on motion intensity, enabling arbitrary-length scenes to be handled while maintaining high rendering quality (see the partitioning sketch after this list).
  2. Per-Window Dynamic 3D Gaussian Splatting: Each window uses its own set of dynamic 3D Gaussians deformed by a tuneable MLP (Multilayer Perceptron), whose per-Gaussian weighting concentrates capacity on dynamic regions and so disentangles static from dynamic scene parts (see the deformation sketch below).
  3. Temporal Consistency Enforcement: A fine-tuning step with a self-supervised consistency loss on randomly sampled novel views enforces consistency between adjacent windows, significantly reducing flickering and ensuring smooth transitions (see the loss sketch below).
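
To make the adaptive window sampling concrete, here is a minimal sketch of how a sequence could be partitioned so that windows shrink where motion is strong. It assumes motion has already been summarized as one scalar per frame (e.g., mean optical-flow magnitude between consecutive frames); the function name, the motion_budget threshold, and the length bounds are illustrative assumptions, not values from the paper.

```python
def adaptive_windows(motion_per_frame, motion_budget=5.0, min_len=4, max_len=60):
    """Split frame indices into windows whose accumulated motion stays under a budget."""
    windows, start, accumulated = [], 0, 0.0
    for i, motion in enumerate(motion_per_frame):
        accumulated += motion
        length = i - start + 1
        # Close the window once it has absorbed enough motion (or grown too long),
        # but never before it reaches a minimum length.
        if (accumulated >= motion_budget or length >= max_len) and length >= min_len:
            windows.append((start, i + 1))        # half-open frame range [start, i+1)
            start, accumulated = i + 1, 0.0
    if start < len(motion_per_frame):             # remaining frames form the last window
        windows.append((start, len(motion_per_frame)))
    return windows

# Example: fast motion in the middle of the clip yields shorter windows there.
motion = [0.5] * 20 + [3.0] * 10 + [0.5] * 20
print(adaptive_windows(motion))
```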
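
The per-window dynamics can be pictured with a small PyTorch module that maps a canonical Gaussian centre and a normalized time within the window to a deformed position. This is a simplified sketch under stated assumptions: the network width and depth are guesses, there is no positional encoding, and only positions are deformed, whereas the paper's dynamic MLPs also handle other Gaussian attributes and per-Gaussian tuneable weights.

```python
import torch
import torch.nn as nn

class DeformationMLP(nn.Module):
    """Maps (canonical Gaussian centre, time within window) -> deformed centre."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, canonical_xyz, t):
        # canonical_xyz: (N, 3) Gaussian centres in the window's canonical space
        # t:             scalar in [0, 1], normalized time within the window
        t_col = torch.full_like(canonical_xyz[:, :1], float(t))
        offset = self.net(torch.cat([canonical_xyz, t_col], dim=-1))
        return canonical_xyz + offset  # per-frame deformed centres

# Usage: one model per sliding window, queried at each frame's normalized time.
mlp = DeformationMLP()
xyz_canonical = torch.randn(10_000, 3)
xyz_at_t = mlp(xyz_canonical, t=0.25)
```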
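
The temporal-consistency fine-tuning can likewise be sketched as a loss that compares renders of adjacent window models at their shared frame from randomly sampled novel views, with the already-trained previous window acting as fixed pseudo-ground-truth. render and sample_random_pose are placeholder callables, and the plain L1 photometric term is an assumption standing in for the paper's actual consistency objective.

```python
import torch

def consistency_loss(model_prev, model_next, render, sample_random_pose,
                     t_overlap, num_views=4):
    """Penalize disagreement between adjacent window models at their shared frame."""
    loss = 0.0
    for _ in range(num_views):
        pose = sample_random_pose()                 # random novel viewpoint
        with torch.no_grad():                       # previous window stays frozen
            target = render(model_prev, pose, t_overlap)
        pred = render(model_next, pose, t_overlap)  # next window remains trainable
        loss = loss + (pred - target).abs().mean()  # simple L1 photometric term
    return loss / num_views
```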

Training and Implementation

For training, the authors use a PyTorch-based implementation of 3D Gaussian Splatting, starting from point cloud data produced by COLMAP. Training each window's model in parallel expedites the process, and each model is subsequently fine-tuned to enforce temporal consistency. This adaptation yields real-time frame-rate rendering with strong reconstruction quality on dynamic scenes, including those with significant motion such as flames.
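
Putting these steps together, the overall procedure described above might be organized as the following driver: independent per-window training in parallel, followed by a sequential consistency pass over adjacent pairs. train_window and finetune_with_consistency are hypothetical callables standing in for the authors' PyTorch Gaussian Splatting training and fine-tuning code.

```python
from concurrent.futures import ProcessPoolExecutor

def train_all_windows(windows, train_window, finetune_with_consistency):
    """Train one dynamic 3D Gaussian model per window, then fine-tune for consistency.
    Both callables are hypothetical stand-ins, not part of any released codebase."""
    # Windows are independent, so their models can be trained in parallel.
    with ProcessPoolExecutor() as pool:
        models = list(pool.map(train_window, windows))

    # Sequential pass: fine-tune each model against its predecessor's renders
    # at the shared frame to smooth transitions across window boundaries.
    for prev, nxt in zip(models, models[1:]):
        finetune_with_consistency(prev, nxt)
    return models
```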

Results and Contributions

The authors conducted extensive comparative studies, demonstrating that their method surpasses the current state-of-the-art techniques in rendering quality as assessed by PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index). Furthermore, it exhibits real-time performance at 71.51 FPS (Frames Per Second), a significant improvement over competing methods.
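
For reference, the PSNR figure quoted above is derived from the mean squared error between a rendered frame and the ground-truth image; a minimal implementation for images normalized to [0, 1] is shown below, while SSIM is typically taken from a library such as torchmetrics or scikit-image.

```python
import torch

def psnr(pred, target, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```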

Conclusion

SWinGS advances novel view synthesis by delivering high-fidelity, real-time interactive rendering of dynamic scenes that were previously out of reach for methods dependent on static models. Through its adaptive window sampling, per-window dynamic 3D Gaussian splatting, and consistency-driven fine-tuning, it sets a strong benchmark for future developments in the domain.