
MoE-GS for Dynamic Gaussian Splatting

Updated 23 October 2025
  • The paper presents MoE-GS, a framework that integrates multiple specialized 3D Gaussian Splatting models using adaptive routing to improve dynamic scene reconstruction fidelity.
  • It employs a novel Volume-aware Pixel Router to project Gaussian-level gating weights into pixel space, ensuring coherent expert blending across spatial and temporal dimensions.
  • Efficiency strategies like single-pass rendering, gate-aware pruning, and knowledge distillation optimize computational cost while maintaining high visual quality.

Mixture of Experts for Dynamic Gaussian Splatting (MoE-GS) is an advanced framework designed to address the limitations of traditional dynamic scene reconstruction by integrating multiple specialized 3D Gaussian Splatting (3DGS) models via a novel adaptive routing mechanism. In the context of dynamic 3D scene synthesis and rendering, MoE-GS improves reconstruction fidelity and robustness to diverse scene characteristics, dynamically blending the strengths of several expert models to accommodate spatial and temporal variability. The core technical contribution is the Volume-aware Pixel Router, which projects learned Gaussian-level gating weights into pixel space using differentiable weight splatting, enabling spatially and temporally coherent expert blending. The framework further introduces single-pass multi-expert rendering, gate-aware Gaussian pruning, and a distillation strategy, ensuring competitive efficiency despite the increased capacity of MoE architectures (Jin et al., 22 Oct 2025).

1. Motivation and Limitations of Previous Approaches

Dynamic Gaussian Splatting methods have become central for real-time, high-fidelity dynamic 3D scene reconstruction, but prior models exhibit three primary deficiencies: lack of consistent performance across heterogeneous scenes; spatial inconsistency, where differing regions require different modeling strengths; and inadequate handling of temporal fluctuation, as frame-by-frame dynamics vary substantially. Existing methods typically rely on a single 3DGS model or statically partition scene components, which leads to either underfitting (in complex, rapidly changing zones) or model redundancy (in more static regions) and consequent inefficiency in both training and inference (Guo et al., 18 Mar 2024, Lee et al., 21 Oct 2024).

MoE-GS addresses these issues by leveraging multiple experts—each tailored to specific deformation, motion profiles, or appearance regimes—and a dynamic, volumetric-to-pixel gating process to blend their outputs contextually.

2. MoE-GS Architecture and Volume-aware Pixel Router

The principal innovation of MoE-GS is the Volume-aware Pixel Router, a differentiable mechanism for mapping per-Gaussian routing weights into pixel space. Each expert is a separately optimized 3DGS model specializing in particular spatial–temporal features (e.g., rapid non-rigid motion, static background, fine texture detail).

Per-Gaussian weights encode both intrinsic properties (position, rotation, scale, opacity) and contextual dependencies (directional/view cue, time signal):

w_i^{\mathrm{per}} = [w_i,\ w_i^{\mathrm{dir}},\ (t \cdot w_i^{\mathrm{time}})]^T

These weights are rasterized onto the image plane using differentiable splatting. The resulting pixel-level weights w_{2D} are refined via a lightweight MLP with directional and temporal embedding:

R' = w_{2D} + \Phi(w_{2D}^{\mathrm{dir}}, w_{2D}^{\mathrm{time}}, r)

where r denotes the viewing direction. The final expert selection probabilities at each pixel are computed as:

G'_k = \mathrm{Softmax}(R'_k)

The MoE-GS output at pixel u is then

I_{\text{MoE}}(u) = \sum_k G'_k(u) \cdot I_{E_k}(u)

This design ensures that the gating decision is informed by volumetric properties and projected onto the appropriate image pixels, adapting to both spatial region and dynamic context.
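The gating and blending steps above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the splatted weights w_2D and the output of the refinement MLP Φ are taken as precomputed arrays (in the real model, Φ is conditioned on view direction and time), and all array shapes and names are assumptions.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the expert axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def blend_experts(w2d, refinement, expert_images):
    """Fuse K expert renderings with per-pixel gating.

    w2d:           (K, H, W) splatted routing weights w_2D.
    refinement:    (K, H, W) precomputed output of the MLP Phi.
    expert_images: (K, H, W, 3) per-expert RGB renderings I_{E_k}.
    Returns the fused image I_MoE of shape (H, W, 3).
    """
    R = w2d + refinement                   # R' = w_2D + Phi(...)
    G = softmax(R, axis=0)                 # G'_k = Softmax(R'_k), per pixel
    return (G[..., None] * expert_images).sum(axis=0)

# Toy usage: two experts on a 4x4 image.
rng = np.random.default_rng(0)
w2d = rng.normal(size=(2, 4, 4))
phi = rng.normal(size=(2, 4, 4))
imgs = rng.uniform(size=(2, 4, 4, 3))
fused = blend_experts(w2d, phi, imgs)
```

Because the gates form a convex combination at every pixel, the fused color always lies between the expert outputs, which is what keeps expert transitions spatially smooth.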

3. Efficiency Strategies: Single-Pass Rendering and Gate-aware Pruning

Since MoE architectures typically incur greater computational cost than single-expert models, MoE-GS incorporates two optimizations.

Single-Pass Multi-Expert Rendering: All Gaussians are processed in one pass, tagged via a one-hot expert identity. The rendering equation avoids repeated rasterization:

C_k(u) = \sum_j T_j(u) \cdot \alpha_j(u) \cdot c_j \cdot (e_j)_k

where T_j(u) is the cumulative transmittance, α_j(u) the opacity, c_j the color, and (e_j)_k selects expert k's Gaussians. This approach computes all expert outputs in parallel, separating them only at the alpha-blending stage.
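A minimal sketch of this single-pass compositing, assuming a drastically simplified rasterizer in which each Gaussian covers exactly one pixel (a real renderer splats a footprint of pixels); the function name and argument layout are illustrative, not from the paper:

```python
import numpy as np

def single_pass_render(order, alpha, color, expert_id, K, H, W, pix):
    """Composite all experts' Gaussians in one front-to-back pass.

    order:     Gaussian indices sorted front-to-back for this view.
    alpha:     (N,) per-Gaussian opacity alpha_j.
    color:     (N, 3) per-Gaussian RGB c_j.
    expert_id: (N,) integer expert tag; (e_j)_k is its one-hot form.
    pix:       (N, 2) pixel hit by each Gaussian (simplified footprint).
    Returns (K, H, W, 3) per-expert color buffers C_k.
    """
    C = np.zeros((K, H, W, 3))
    T = np.ones((H, W))                       # cumulative transmittance T_j(u)
    for j in order:
        y, x = pix[j]
        w = T[y, x] * alpha[j]                # T_j(u) * alpha_j(u)
        C[expert_id[j], y, x] += w * color[j] # routed to buffer k via (e_j)_k
        T[y, x] *= (1.0 - alpha[j])           # one shared transmittance update
    return C
```

Note that the transmittance is accumulated once over the shared front-to-back ordering, so adding experts does not repeat the rasterization work; only the destination buffer differs per Gaussian.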

Gate-aware Gaussian Pruning: To mitigate model redundancy, the router accumulates the gradient of gating weights with respect to per-Gaussian parameters across the dataset:

\mathcal{E}_i = \frac{1}{|\mathcal{D}|} \sum_{v \in \mathcal{D}} \left\| \frac{\partial G'_k(v)}{\partial w_i^{\mathrm{per}}(v)} \right\|

Gaussians whose importance E_i falls below a threshold τ are pruned during training, which preserves fidelity while reducing memory and compute.
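Once the importance scores E_i have been accumulated (by autograd during training, which is not shown here), the pruning step itself reduces to a threshold mask. A minimal sketch with hypothetical names:

```python
import numpy as np

def gate_aware_prune(importance, params, tau):
    """Drop Gaussians whose accumulated routing sensitivity E_i < tau.

    importance: (N,) dataset-averaged gradient norms E_i (assumed
                precomputed by the training loop; hypothetical here).
    params:     (N, D) per-Gaussian parameter rows to filter.
    Returns the surviving rows and the boolean keep mask.
    """
    keep = importance >= tau
    return params[keep], keep

# Toy usage: four Gaussians, two fall below the threshold.
imp = np.array([0.01, 0.5, 0.2, 0.001])
params = np.arange(8).reshape(4, 2).astype(float)
pruned, keep = gate_aware_prune(imp, params, tau=0.05)
```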

4. Knowledge Distillation for Lightweight Deployment

MoE-GS employs a distillation procedure that enables the transfer of fusion performance to individual experts, supporting lightweight inference without changes to expert architectures. After MoE optimization, the MoE-rendered image I_MoE acts as the pseudo-ground truth for each expert. Confidence weights G'_k are used as attention for supervised training. The distillation loss per expert is:

\mathcal{L}_k^{\mathrm{KD}} = \lambda \cdot \mathcal{L}(G'_k \cdot I_{E_k},\ G'_k \cdot I_{\text{GT}}) + (1-\lambda) \cdot \mathcal{L}((1-G'_k) \cdot I_{E_k},\ (1-G'_k) \cdot I_{\text{MoE}})

This formulation encourages each expert to match the ground truth in regions where its gate is confident and the MoE fusion elsewhere, allowing real-time deployment of single experts with minimal fidelity loss.
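The loss can be sketched as follows, with a plain L2 image loss standing in for the unspecified loss L from the paper; all names and the default λ are illustrative assumptions:

```python
import numpy as np

def kd_loss(I_expert, I_gt, I_moe, G_k, lam=0.5):
    """Confidence-weighted distillation loss L_k^KD for expert k.

    I_expert, I_gt, I_moe: (H, W, 3) expert, ground-truth, and MoE images.
    G_k: (H, W) per-pixel confidence G'_k for this expert.
    lam: blending factor lambda (value assumed, not from the paper).
    """
    G = G_k[..., None]                                 # broadcast over RGB
    hi = np.mean((G * I_expert - G * I_gt) ** 2)       # confident -> match GT
    lo = np.mean(((1 - G) * I_expert - (1 - G) * I_moe) ** 2)  # else -> MoE
    return lam * hi + (1 - lam) * lo
```

The loss vanishes exactly when the expert already reproduces the ground truth in its confident regions and the MoE fusion elsewhere, which is the deployment target for a distilled single expert.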

5. Experimental Evaluation and Benchmark Performance

MoE-GS is validated on standard dynamic scene datasets (N3V, Technicolor), demonstrating consistent improvements over individual expert models and previous state-of-the-art 3DGS frameworks. Quantitative metrics include PSNR, SSIM, and LPIPS:

  • On N3V, MoE-GS configurations (2/3/4 experts) achieve higher PSNR than STG, Ex4DGS, and 4DGaussians baselines.
  • On Technicolor, 3-expert MoE-GS ranks highest in PSNR, SSIM, and LPIPS across multiple scenes.
  • Efficiency optimizations (single-pass rendering, pruning) improve FPS and memory usage while preserving or increasing visual fidelity.
  • Qualitative results show sharper reconstructions and robust temporal consistency, attributed to adaptive expert blending via the volume-aware pixel router.

6. Relation to Prior Work

MoE-GS draws on and extends previous mixture-of-experts work in implicit neural representations (Ben-Shabat et al., 29 Oct 2024), uncertainty-aware motion enhancement (Guo et al., 18 Mar 2024), explicit static/dynamic separation and interpolation strategies (Lee et al., 21 Oct 2024), and hybrid expert routing/gating mechanisms. The router's volumetric-to-pixel mapping is distinct from prior per-Gaussian or per-pixel gating.

A plausible implication is that gating strategies incorporating optical flow cues, region complexity, or frequency analysis (Guo et al., 18 Mar 2024, Zhou et al., 7 Aug 2025) could further refine MoE expert specialization. Ongoing research may address challenges of expert transition smoothness, routing stability, and extension to higher-dimensional dynamics.

The modular framework is extensible—new expert models can be added or trained in parallel, and advanced gating functions could leverage learned, scene-dependent features. The distillation strategy presents a pathway to real-time, resource-constrained deployment with MoE-quality reconstructions.

7. Objective Assessment and Open Questions

MoE-GS represents the first mixture-of-experts formulation optimized for dynamic Gaussian splatting. Adaptive expert blending and routing provide robustness to scene and temporal variation that single-expert 3DGS models struggle to attain. The added capacity of the MoE design carries inherent memory and rendering-speed costs, but the efficiency strategies and distillation mitigate these concerns.

Open questions include optimizing the router for minimal expert boundary artifacts, scaling to larger expert ensembles with hierarchical gating, and rigorous exploration of the trade-off between routing complexity and reconstruction gain. The generalizability of MoE-GS to extreme dynamic scenes, occluded geometry, or sparse data remains an active research direction.
