Overview of "Motion Blender Gaussian Splatting for Dynamic Reconstruction"
The paper “Motion Blender Gaussian Splatting for Dynamic Reconstruction” presents Motion Blender Gaussian Splatting (MB-GS), a framework for reconstructing dynamic scenes from video. Its core contribution is an explicit motion representation built on motion graphs, addressing a limitation of existing methods that encode motion implicitly in neural networks or dense parameters and therefore make the reconstructed motions hard to manipulate.
Challenges in Current Methods
Existing Gaussian splatting-based approaches, such as 4DGaussians and Deformable-GS, rely primarily on implicit motion representations, which makes it difficult to control or manipulate motions beyond replaying them. These methods encode motion in dense per-Gaussian parameters or neural networks, limiting interpretability and editability. The fundamental challenge is to achieve an explicit, sparse motion representation without sacrificing the ability to reconstruct complex dynamic scenes.
Methodology
MB-GS addresses these challenges by representing motion with explicit motion graphs, in one of two forms: kinematic trees and deformable graphs. Kinematic trees suit articulated objects, capturing joint-based movement through per-joint rotations and translations propagated down the tree (see the sketch below). Deformable graphs drop the tree constraint and capture general non-rigid deformations, making them well suited to modeling soft-body dynamics.
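To make the kinematic-tree case concrete, here is a minimal NumPy sketch of standard forward kinematics: per-joint local rotations are propagated down the tree to yield world-space poses for each link. The function name, array layout, and the topological-ordering assumption are our own illustrative choices, not code from the paper.

```python
import numpy as np

def forward_kinematics(parents, joint_offsets, joint_rots, root_trans):
    """Propagate local joint rotations down a kinematic tree.

    parents[i]       -- parent index of joint i (-1 for the root)
    joint_offsets[i] -- joint i's position in its parent's frame, shape (3,)
    joint_rots[i]    -- joint i's local 3x3 rotation
    root_trans       -- global translation of the root, shape (3,)
    Returns world rotations (K, 3, 3) and world positions (K, 3).
    Assumes joints are topologically ordered: parents[i] < i.
    """
    K = len(parents)
    R = np.empty((K, 3, 3))
    p = np.empty((K, 3))
    for i in range(K):
        if parents[i] < 0:                         # root joint
            R[i], p[i] = joint_rots[i], root_trans
        else:
            j = parents[i]
            R[i] = R[j] @ joint_rots[i]            # compose rotations
            p[i] = R[j] @ joint_offsets[i] + p[j]  # offset lives in parent frame
    return R, p

# Example: a 2-joint chain whose second joint is rotated 90 degrees about z.
parents = [-1, 0]
joint_offsets = np.array([[0., 0., 0.], [1., 0., 0.]])
joint_rots = np.stack([np.eye(3),
                       np.array([[0., -1., 0.],
                                 [1.,  0., 0.],
                                 [0.,  0., 1.]])])
root_trans = np.zeros(3)
R, p = forward_kinematics(parents, joint_offsets, joint_rots, root_trans)
print(p)  # joint 1 sits at [1, 0, 0]; its frame is rotated about z
```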
The motion of graph links is propagated to individual Gaussians through dual quaternion skinning, so that each Gaussian’s movement is driven by a sparse set of links rather than dense per-Gaussian motion parameters. A learnable weight painting function determines how strongly each link influences each Gaussian, which adds interpretability and provides control at the object or segment level; the blending step is sketched below.
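The following sketch illustrates the blending itself. Each link’s rigid transform is packed into a unit dual quaternion, per-Gaussian weights (the role played by MB-GS’s learned weight painting function) average the dual quaternions, and the normalized result moves the point. All names and conventions here are our own illustration of dual quaternion skinning, not MB-GS’s actual code.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def quat_conj(q):
    """Quaternion conjugate."""
    return q * np.array([1., -1., -1., -1.])

def quat_to_mat(q):
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return np.array([[1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
                     [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
                     [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]])

def dqs(points, weights, link_rots, link_trans):
    """Dual quaternion skinning.

    points     -- (N, 3) rest positions (e.g. Gaussian centers)
    weights    -- (N, L) skinning weights per point and link
    link_rots  -- (L, 4) unit quaternions, one rigid rotation per link
    link_trans -- (L, 3) translations, one per link
    """
    out = np.empty_like(points)
    for i in range(len(points)):
        real, dual = np.zeros(4), np.zeros(4)
        for l in range(len(link_rots)):
            qr = link_rots[l]
            if np.dot(qr, link_rots[0]) < 0:   # keep all quats in one hemisphere
                qr = -qr
            # Dual part encodes translation: qd = 0.5 * t * qr.
            qd = 0.5 * quat_mul(np.array([0., *link_trans[l]]), qr)
            real += weights[i, l] * qr
            dual += weights[i, l] * qd
        n = np.linalg.norm(real)
        real, dual = real / n, dual / n        # renormalize -> rigid blend
        t = 2.0 * quat_mul(dual, quat_conj(real))[1:]
        out[i] = quat_to_mat(real) @ points[i] + t
    return out

# Example: blend an identity link with a 90-degree rotation plus translation.
rots = np.array([[1., 0., 0., 0.],
                 [np.cos(np.pi/4), 0., 0., np.sin(np.pi/4)]])
trans = np.array([[0., 0., 0.], [1., 0., 0.]])
pts = np.array([[1., 0., 0.]])
w = np.array([[0.5, 0.5]])                     # equal influence of both links
print(dqs(pts, w, rots, trans))
```

Unlike linear blend skinning, the normalization step keeps the blended transform rigid, which avoids the volume-collapsing “candy wrapper” artifacts that matrix averaging produces.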
Numerical Results and Comparisons
The proposed MB-GS achieves notable results, outperforming state-of-the-art methods in several settings: it surpasses Shape of Motion on the iPhone dataset and delivers rendering quality competitive with 4DGaussians on the HyperNeRF dataset. On the iPhone dataset the paper reports a 2% improvement in LPIPS, a metric of perceptual rendering quality where lower is better, indicating sharper and more visually faithful dynamic scene reconstructions.
Implications and Applications
MB-GS has significant implications for dynamic scene reconstruction, particularly in animation and robotics. Its ability to generate novel object motions and to synthesize robot demonstrations from ordinary videos demonstrates its practicality, and the framework supports efficient dataset creation for robot learning, potentially accelerating training. Moreover, because the motion graphs are explicit, novel animations can be produced by directly editing the underlying motion parameters (a toy example follows below), which is particularly valuable for content creation and virtual reality applications.
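As a toy illustration of such an edit, reusing forward_kinematics from the sketch above, one can perturb a single joint rotation and re-pose the graph; the joint index and angle here are arbitrary:

```python
import numpy as np

# Bend one joint by an extra 30 degrees about z and recompute world poses.
theta = np.deg2rad(30.0)
extra = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
joint_rots[1] = joint_rots[1] @ extra   # joint 1 is illustrative
R, p = forward_kinematics(parents, joint_offsets, joint_rots, root_trans)
# Re-skinning the Gaussians with the new link poses then renders the edit.
```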
Limitations and Future Directions
While exhibiting strong adaptability and controllability, MB-GS still shows rendering artifacts under rapid camera movement in some HyperNeRF scenes. Future work may integrate deformation networks that predict motion graph parameters, improving adaptability while preserving control and interpretability. Addressing remaining limitations, such as imperfect motion graph learning and sensitivity to noisy initialization, could further refine the method’s performance.
In conclusion, Motion Blender Gaussian Splatting rethinks dynamic scene reconstruction by combining an explicit, editable motion representation with high-quality rendering. It offers promising applications across several fields, and the remaining gaps point to clear directions for future research.