SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting (2312.13308v2)
Abstract: Novel view synthesis has shown rapid progress recently, with methods capable of producing increasingly photorealistic results. 3D Gaussian Splatting has emerged as a promising method, producing high-quality renderings of scenes and enabling interactive viewing at real-time frame rates. However, it is limited to static scenes. In this work, we extend 3D Gaussian Splatting to reconstruct dynamic scenes. We model a scene's dynamics using dynamic MLPs, learning deformations from temporally-local canonical representations to per-frame 3D Gaussians. To disentangle static and dynamic regions, learnable tuneable parameters weight each Gaussian's respective MLP parameters, improving the dynamics modelling of imbalanced scenes. We introduce a sliding-window training strategy that partitions the sequence into smaller, manageable windows to handle scenes of arbitrary length while maintaining high rendering quality. We propose an adaptive sampling strategy to determine appropriate window-size hyperparameters based on the scene's motion, balancing training overhead against visual quality. Training a separate dynamic 3D Gaussian model for each sliding window allows the canonical representation to change, enabling the reconstruction of scenes with significant geometric changes. Temporal consistency is enforced via a fine-tuning step with a self-supervised consistency loss on randomly sampled novel views. As a result, our method produces high-quality renderings of general dynamic scenes with competitive quantitative performance, which can be viewed in real time in our dynamic interactive viewer.
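To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch, not the authors' released implementation: it partitions a sequence into overlapping sliding windows, uses a per-window deformation MLP that maps canonical Gaussian centres plus a time input to positional offsets, and blends those offsets with a learnable per-Gaussian static/dynamic weight. All names, shapes, and hyperparameters (`partition_windows`, `DeformationMLP`, window size 20, etc.) are illustrative assumptions; rotation/scale deformation, the adaptive window-size sampling, and the consistency fine-tuning are omitted.

```python
# Hedged sketch of the sliding-window dynamic Gaussian idea; all names
# and hyperparameters are hypothetical, not from the paper's codebase.
import torch
import torch.nn as nn


def partition_windows(num_frames: int, window: int, overlap: int):
    """Split frame indices 0..num_frames-1 into overlapping windows."""
    step = window - overlap
    return [list(range(start, min(start + window, num_frames)))
            for start in range(0, num_frames - overlap, step)]


class DeformationMLP(nn.Module):
    """Maps (canonical centre, time) -> xyz offset for one window."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, xyz: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        t_col = t.expand(xyz.shape[0], 1)      # broadcast scalar time
        return self.net(torch.cat([xyz, t_col], dim=-1))


num_frames, window, overlap = 60, 20, 2
windows = partition_windows(num_frames, window, overlap)

n_gaussians = 1000
canonical_xyz = torch.randn(n_gaussians, 3)              # toy canonical centres
dyn_logits = nn.Parameter(torch.zeros(n_gaussians, 1))   # static/dynamic weight (pre-sigmoid)

# A separate model (and canonical representation) is trained per window,
# which lets the geometry change substantially across windows.
for frames in windows[:1]:                               # demo: first window only
    mlp = DeformationMLP()
    for f in frames:
        t = torch.tensor(f / (num_frames - 1))
        offset = mlp(canonical_xyz, t)
        # Gaussians with weight near 0 stay at their canonical positions
        # (static regions); weight near 1 follows the predicted motion.
        frame_xyz = canonical_xyz + torch.sigmoid(dyn_logits) * offset
        # ...rasterise `frame_xyz` with a Gaussian splatting renderer and
        # optimise against the training views (omitted here).
```

The per-window design choice is the key point of the sketch: because each window gets its own canonical representation, the method is not forced to deform a single global canonical scene across the whole sequence, which is what enables large geometric changes over time.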
Authors: Richard Shaw, Jifei Song, Arthur Moreau, Michal Nazarczuk, Sibi Catley-Chandar, Helisa Dhamo, Eduardo Perez-Pellitero