
FixingGS: Enhancing Sparse-View 3DGS Reconstruction

Updated 24 September 2025
  • FixingGS is a training-free framework that enhances sparse-view 3D Gaussian Splatting reconstructions using continuous score distillation from a pre-trained diffusion model.
  • It applies an adaptive progressive enhancement scheme to target unreliable regions, ensuring cross-view coherence and effective artifact inpainting.
  • The method achieves higher PSNR and SSIM with lower LPIPS compared to prior approaches, demonstrating robust removal of view-dependent artifacts.

FixingGS is a training-free framework for enhancing 3D Gaussian Splatting (3DGS) reconstructions, targeting the persistent artifacts and incompleteness induced by extremely sparse viewpoint sampling. Central to FixingGS is a continuous score distillation mechanism that leverages a pre-trained diffusion model to provide cross-view coherent guidance, combined with an adaptive progressive enhancement process that further addresses under-constrained regions. The approach enables effective and automated inpainting as well as artifact removal, producing higher-fidelity, visually consistent 3D reconstructed scenes—even in the challenging regime of minimal input views (Wang et al., 23 Sep 2025).

1. Challenges of Sparse-View 3DGS and Artifact Formation

3D Gaussian Splatting remains a leading method for efficient 3D scene reconstruction and novel view synthesis, especially because of its explicit and differentiable nature. However, when the input comprises only sparse camera viewpoints, the underlying scene geometry and appearance are severely underconstrained. The optimization overfits the observed views: artifacts such as view-dependent noise, blurred regions, or implausible geometry appear, particularly in disoccluded or unobserved parts of the scene. Previous efforts to regularize 3DGS in this regime have employed generative priors, often via 2D image diffusion models or by incorporating patch-based or text-conditioned inpainting. While such techniques can fill missing content or remove noise, they typically struggle to maintain multi-view consistency, often causing geometric ambiguity and diminished structural detail. These issues are intrinsic to the separation between 2D prior guidance and the globally coupled 3D scene representation.

2. Continuous, Training-Free Score Distillation: Methodological Core

FixingGS introduces a continuous, training-free score distillation process to address these limitations. The method builds on concepts from Score Distillation Sampling (SDS) but departs from earlier approaches that only periodically update diffusion priors or require fine-tuning of generative models. In FixingGS, a differentiable rendering operator $g(\theta, c)$ produces a 2D image for a virtual camera pose $c$, given current 3DGS parameters $\theta$. For each such "extra view," the framework computes the image-residual distillation loss

$$L_{\mathrm{distillation}} = \bigg\| \omega(t_0) \cdot \Big( g(\theta, c) - \mathcal{D}_\phi\big(g(\theta, c); t_0, y\big) \Big) \bigg\|_2^2$$

where $\mathcal{D}_\phi(\cdot)$ is a fixed, pre-trained diffusion model applied at timestep $t_0$ (set to 199), with conditioning $y$ derived from reference images, and $\omega(t_0) = 0.5$. This direct image-residual formulation (in contrast to DreamFusion's noise-residual loss) yields more stable, consistently informative gradients. The continuous nature of this process, i.e., applying updated diffusion constraints throughout the optimization, mitigates stale or incoherent guidance, ensuring that the evolving 3DGS parameters are always aligned with the generative prior.
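The loss above reduces to a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: `fixed` stands in for the output of the frozen diffusion model $\mathcal{D}_\phi$, which is computed elsewhere and treated here as an opaque input.

```python
import numpy as np

def distillation_loss(render, fixed, omega=0.5):
    """Image-residual score distillation loss.

    render: rendering g(theta, c) of the current 3DGS scene at extra view c.
    fixed:  diffusion-refined image D_phi(render; t0, y) produced by the
            frozen prior at timestep t0 = 199 (opaque placeholder here).
    omega:  timestep weight omega(t0), set to 0.5 in the paper.
    """
    residual = omega * (render - fixed)
    return float(np.sum(residual ** 2))  # squared L2 norm of the residual

# Toy usage: a uniform offset of 1 over a 4x4 RGB image.
render = np.zeros((4, 4, 3))
fixed = np.ones((4, 4, 3))
loss = distillation_loss(render, fixed)  # 0.5^2 * 48 elements = 12.0
```

In an actual pipeline this residual would be backpropagated through the differentiable rasterizer to update $\theta$, with the diffusion output held fixed as a constant target during each gradient step.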

3. Adaptive Progressive Enhancement in Under-Constrained Regions

Despite the advantages of continuous distillation, the diffusion prior’s reliability degrades as the rendered view diverges from those present in the training set. Substantial view shifts can cause the diffusion model to deliver unreliable or hallucinated corrections. FixingGS therefore introduces an Adaptive Progressive Enhancement (APE) scheme:

  • For each extra (under-constrained) camera view, it measures the Peak Signal-to-Noise Ratio (PSNR) between the current rendering and its diffusion-guided "fixed" version.
  • If PSNR < 25 dB (threshold $\eta$), the view is considered unreliable.
  • For each such view, three observed training images close in pose (using combined translation and rotation distance) are selected.
  • Using a pose-shifting operator, these reference views are moved toward the unreliable target viewpoint; their renders are enhanced via the diffusion model to expand effective coverage.
  • The improved images are integrated into the training set, recursively refining the reconstruction in problematic regions.

This protocol adaptively densifies the supervision in areas where the diffusion prior is most uncertain, leading to sharper, less ambiguous synthesis for extreme novel views.
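The reliability test and reference selection steps of the APE scheme can be sketched as follows. This is a simplified illustration under stated assumptions: the $\eta = 25$ dB threshold and the choice of three nearest references follow the description above, but the equal weighting of translation distance and rotation angle (`lam`) is a hypothetical choice, and the pose-shifting and diffusion-enhancement steps are omitted.

```python
import numpy as np

def psnr(img_a, img_b, max_val=1.0):
    """Peak Signal-to-Noise Ratio between two images with values in [0, max_val]."""
    mse = np.mean((img_a - img_b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def is_unreliable(render, fixed, eta=25.0):
    """A view is flagged unreliable when its render disagrees with its
    diffusion-guided 'fixed' version by more than the PSNR threshold eta."""
    return psnr(render, fixed) < eta

def pose_distance(pose_a, pose_b, lam=1.0):
    """Combined translation + rotation distance between two camera poses.

    Each pose is (t, R): translation t of shape (3,), rotation matrix R (3, 3).
    lam weights the rotation geodesic angle (equal weighting is an assumption).
    """
    t_dist = np.linalg.norm(pose_a[0] - pose_b[0])
    cos_angle = (np.trace(pose_a[1].T @ pose_b[1]) - 1.0) / 2.0
    r_dist = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return t_dist + lam * r_dist

def nearest_references(target_pose, train_poses, k=3):
    """Indices of the k observed training views closest in pose to the target."""
    dists = [pose_distance(target_pose, p) for p in train_poses]
    return list(np.argsort(dists)[:k])
```

The selected references would then be pose-shifted toward the unreliable view, enhanced by the diffusion model, and appended to the training set, as described above.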

4. Relation to Prior Generative Prior Methods and Advantages

Earlier approaches for hallucinating missing or unreliable views in 3DGS include distillation pipelines built around generative diffusion models, such as GenFusion or Difix3D. These typically employ periodic updates of priors or require task/scene-specific fine-tuning. FixingGS bypasses those limitations by being completely training-free with respect to the generative model: the diffusion model is never updated or retrained; rather, its knowledge is distilled into the scene renderings in a continuous, optimization-loop-integrated way. This results in diffusion guidance that is persistently relevant to the evolving 3DGS parameters and avoids the lag or inconsistency seen in approaches where generative priors are static or infrequently synchronized.

Moreover, the APE protocol represents a unique innovation in supervised coverage—unreliable regions are not simply post-processed but are triaged and revisited with shifted, gradually more relevant synthetic views, all of which are regularized via the diffusion prior.

5. Experimental Validation and Performance

FixingGS is evaluated quantitatively and qualitatively on standard benchmarks including DL3DV-10K and Mip-NeRF 360. Relative to prior approaches (traditional 3DGS, FSGS, GenFusion, Difix3D), FixingGS achieves:

  • Higher PSNR and SSIM (improved fidelity and structural similarity)
  • Lower LPIPS (superior perceptual quality)
  • More distinct high-frequency details and reduction in view-dependent artifacts

Visual inspection highlights effective inpainting and coherent structure across challenging, sparsely observed scenes. Ablation studies confirm that both (i) the continuous, image-residual-based distillation and (ii) the adaptive progressive enhancement are jointly required for optimal novel view synthesis.

| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Hallucinated Detail | Consistency across Views |
|---|---|---|---|---|---|
| 3DGS | Low | Low | High | Low | Weak |
| GenFusion/Difix3D | Mid | Mid | Mid | Moderate | Moderate |
| FixingGS | High | High | Low | High | Strong |

6. Limitations and Future Directions

A noted limitation of FixingGS is that, despite improved overall coherence, regions seen only from highly divergent novel viewpoints or under extreme occlusion can still receive unreliable diffusion guidance in early optimization phases. The adaptive progressive enhancement mitigates, but does not eliminate, the risk of local oversmoothing or hallucination in these extremely under-constrained cases. Proposed future directions include adopting more specialized diffusion models (potentially domain-specific text/image priors), further optimizing the distillation process, and generalizing the method to other explicit 3D representations. A plausible implication is that integrating semantic or depth conditioning into the diffusion prior could improve artifact repair in ambiguous geometric regions.

7. Significance and Impact

FixingGS establishes a new paradigm for artifact repair and completion in sparse-view 3D scene reconstruction, demonstrating that pretrained diffusion priors can be continuously distilled into explicit 3D representations without retraining or periodic prior refresh. It enables cross-view coherent, high-fidelity reconstructions and sets a new standard in practical, training-free enhancement for Gaussian Splatting pipelines. The methodology is applicable to a broad range of real-world and research scenarios, especially those involving limited input data or where training resources for generative models are unavailable.
