PAGaS: Pixel-Aligned 1DoF Gaussian Splatting for Depth Refinement

Published 24 Apr 2026 in cs.CV and cs.RO | (2604.22129v1)

Abstract: Gaussian Splatting (GS) has emerged as an efficient approach for high-quality novel view synthesis. While early GS variants struggled to accurately model the scene's geometry, recent advancements constraining the Gaussians' spread and shapes, such as 2D Gaussian Splatting, have significantly improved geometric fidelity. In this paper, we present Pixel-Aligned 1DoF Gaussian Splatting (PAGaS) that adapts the GS representation from novel view synthesis to the multi-view stereo depth task. Our key contribution is modeling a pixel's depth using one-degree-of-freedom (1DoF) Gaussians that remain tightly constrained during optimization. Unlike existing approaches, our Gaussians' positions and sizes are restricted by the back-projected pixel volumes, leaving depth as the sole degree of freedom to optimize. PAGaS produces highly detailed depths, as illustrated in Figure 1. We quantitatively validate these improvements on top of reference geometric and learning-based multi-view stereo baselines on challenging 3D reconstruction benchmarks. Code: davidrecasens.github.io/pagas


Authors (7)

Summary

The paper introduces a method in which each Gaussian's only optimizable parameter is depth, reducing the number of optimized parameters per Gaussian from 59 to 1.
It employs pixel-aligned Gaussians with fixed color and analytical scale, enabling precise refinement of high-frequency surface details in multi-view setups.
The occlusion-aware rasterizer and sequential optimization strategy deliver measurable quantitative gains and detailed qualitative improvements over existing methods.

Motivation and Context

Gaussian Splatting (GS) has gained prominence as an explicit, efficient alternative to Neural Radiance Fields (NeRFs) for novel-view synthesis, but substantial challenges persist in recovering accurate surface geometry from GS. Standard photometric losses typically induce overfitting, resulting in overpopulated Gaussian clouds with excessive transparency and poor geometric fidelity. Existing solutions, such as 2DGS and PGSR, partially address these issues by constraining Gaussian parameters; however, most GS-based geometry pipelines optimize a global set of Gaussians with multiple degrees of freedom, limiting their capacity for fine-grained surface detail.

"PAGaS: Pixel-Aligned 1DoF Gaussian Splatting for Depth Refinement" (2604.22129) reframes depth map refinement by minimizing the number of optimizable parameters per Gaussian. Each Gaussian is pixel-aligned, and its depth along the back-projected ray is its sole degree of freedom. The approach targets high-resolution multi-view stereo, emphasizing scalability and preservation of image fidelity, both of which are critical for high-fidelity 3D reconstruction pipelines.

Methodology

PAGaS introduces a fundamentally minimalist GS parametrization:

Pixel-Aligned Gaussians: For each valid pixel in a target view, a single Gaussian is initialized along its camera-to-pixel ray at the position corresponding to the initial coarse depth. The Gaussian is spherical, so its rotation is irrelevant, and its scale is set analytically so that it fills its target pixel, which limits overlap with neighboring Gaussians.
Parameter Reduction: All parameters (mean, opacity, scale, rotation) are conditioned on depth and analytically set or fixed. Depth remains the only optimizable parameter (reducing the per-Gaussian optimization count from 59 to 1).
Appearance and Opacity: Gaussian color is locked to observed pixel color, and opacity is fixed at 1, precluding view-dependent effects and further reducing degrees of freedom.
Sequential Optimization: Each view is refined independently, using N context views for photometric constraints. The refined depth maps are then merged across views with TSDF fusion for global consistency.
Occlusion-Aware Rasterizer: PAGaS proposes a new rasterization logic where radius and depth thresholds are applied to ignore Gaussians that should be occluded during alpha-blending, preventing erroneous blending of out-of-surface Gaussians.
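
The "59 to 1" figure in the Parameter Reduction item can be checked with a short tally. The breakdown below follows the standard 3DGS parametrization with degree-3 spherical harmonics, which is the usual way to arrive at 59; the summary itself does not spell out the breakdown, so the dictionary keys are illustrative names rather than identifiers from the PAGaS code.

```python
# Per-Gaussian optimizable parameters in standard 3DGS (assumed SH degree 3).
SH_DEGREE = 3
sh_coeffs_per_channel = (SH_DEGREE + 1) ** 2  # 16 coefficients per color channel

params_3dgs = {
    "mean": 3,                             # 3D position
    "scale": 3,                            # anisotropic scale
    "rotation_quaternion": 4,              # orientation
    "opacity": 1,
    "sh_color": 3 * sh_coeffs_per_channel, # 48 view-dependent color coefficients
}

# In PAGaS, everything except depth is fixed or derived analytically from depth.
params_pagas = {"depth": 1}

total_3dgs = sum(params_3dgs.values())
total_pagas = sum(params_pagas.values())
print(f"3DGS: {total_3dgs} parameters, PAGaS: {total_pagas} parameter")
```
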
Figure 1: Depth refinement with Pixel-Aligned Gaussian Splatting; Gaussians propagate only along their back-projected rays with analytical scale and fixed color.
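
As a rough illustration of the pixel-aligned parametrization, the sketch below back-projects a pixel through a pinhole camera and derives an isotropic scale from the pixel footprint at that depth. The `Pinhole` class, the function names, the use of z-depth (rather than ray length), and the `fill` constant are all assumptions made for this example; the summary only states that the scale is set analytically to fill the pixel, not the exact formula.

```python
from dataclasses import dataclass


@dataclass
class Pinhole:
    """Hypothetical pinhole intrinsics, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(cam: Pinhole, u: float, v: float, depth: float) -> tuple:
    """Camera-frame 3D point of pixel (u, v) at z-depth `depth`."""
    x = (u - cam.cx) / cam.fx * depth
    y = (v - cam.cy) / cam.fy * depth
    return (x, y, depth)


def gaussian_for_pixel(cam: Pinhole, u: float, v: float, depth: float,
                       fill: float = 0.5) -> tuple:
    """Mean and isotropic standard deviation of the Gaussian tied to (u, v).

    `fill` is an assumed constant relating the pixel footprint to sigma.
    """
    mean = backproject(cam, u, v, depth)
    footprint = depth / (0.5 * (cam.fx + cam.fy))  # metric width of one pixel
    return mean, fill * footprint


cam = Pinhole(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
mean, sigma = gaussian_for_pixel(cam, u=420.0, v=240.0, depth=2.0)
_, sigma_far = gaussian_for_pixel(cam, u=420.0, v=240.0, depth=4.0)
print(mean, sigma, sigma_far)
```

Because both the mean and the scale are functions of depth alone, moving the Gaussian along its ray rescales it automatically: doubling the depth doubles the footprint, so the splat keeps covering one pixel in the target view.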

Figure 2: Occlusion and disocclusion handling; radius and depth thresholds exclude occluded and disoccluded Gaussians in non-target views.

Figure 3: Occlusion-Aware 3DGS Rasterizer logic; truncated cone volumetric filtering in pixel and depth spaces.
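
One plausible reading of the radius and depth thresholds is sketched below: for a pixel in a context view, only splats that project within a pixel radius are considered, and among those only the ones close in depth to the front-most splat are kept for blending. This is a simplified CPU illustration, not the paper's rasterizer; `visible_gaussians`, `radius_px`, and `depth_tol` are hypothetical names, and an absolute depth tolerance is an assumption.

```python
def visible_gaussians(splats, pixel, radius_px, depth_tol):
    """Indices of splats kept for alpha-blending at `pixel`.

    splats: list of (u, v, z) tuples, i.e. projected centers and depths
            of Gaussians in a context view.
    """
    pu, pv = pixel
    # Pixel-space test: keep splats whose center falls within the radius.
    near = [
        i for i, (u, v, z) in enumerate(splats)
        if z > 0 and (u - pu) ** 2 + (v - pv) ** 2 <= radius_px ** 2
    ]
    if not near:
        return []
    # Depth-space test: discard anything too far behind the front-most splat.
    z_front = min(splats[i][2] for i in near)
    return [i for i in near if splats[i][2] <= z_front + depth_tol]


splats = [
    (10.2, 10.1, 1.00),  # front surface
    (10.4, 9.8, 1.02),   # same surface, slightly deeper
    (9.9, 10.3, 2.50),   # background seen through the same pixel
    (30.0, 30.0, 0.50),  # projects elsewhere in the image
]
kept = visible_gaussians(splats, pixel=(10.0, 10.0), radius_px=1.0, depth_tol=0.05)
print(kept)
```

Without the depth test, the background splat at index 2 would be blended into a pixel that should only see the front surface.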

Loss functions combine multi-view photometric consistency ($L_1$ and SSIM losses with masking for disocclusions) and normal smoothness (encouraging spatial normal coherence only in low color-gradient regions) for robust convergence. Optimization proceeds in a resolution pyramid, beginning at coarse scales and refining up to full resolution.
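
A hedged way to write the combined objective is given below. The symbols are assumptions introduced for illustration: $I_c$ and $\hat{I}_c$ are the observed and rendered images in context view $c$, $M_c$ is the disocclusion mask, $\mathbf{n}$ is the normal map derived from the refined depth, $\lambda$ and $\lambda_n$ are weights, and $\tau$ is a color-gradient threshold. The summary does not give the weights or the exact form of the smoothness term; the $(1-\lambda)$ / $\lambda$ split simply mirrors the convention common in GS pipelines.

```latex
\mathcal{L} \;=\; \sum_{c=1}^{N} \mathcal{L}_{\text{photo}}^{(c)} \;+\; \lambda_n\,\mathcal{L}_{\text{normal}},
\qquad
\mathcal{L}_{\text{photo}}^{(c)} \;=\; (1-\lambda)\,
  \frac{\lVert M_c \odot (\hat{I}_c - I_c) \rVert_1}{\lVert M_c \rVert_1}
  \;+\; \lambda\,\bigl(1 - \mathrm{SSIM}_{M_c}(\hat{I}_c, I_c)\bigr),
\qquad
\mathcal{L}_{\text{normal}} \;=\; \sum_{p}
  \mathbb{1}\!\left[\lVert \nabla I(p) \rVert < \tau\right]\,
  \lVert \nabla \mathbf{n}(p) \rVert_1
```

The indicator in the last term expresses the gating described above: normals are smoothed only where the image has low color gradient, so edges and textured regions remain free to develop fine detail.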

Experimental Results

The evaluation spans DTU, Tanks-and-Temples (TnT), ActorsHQ, and BlendedMVS. Metrics include Chamfer distance (DTU) and F1-score (TnT). PAGaS is applied as a post-processing depth refiner to multiple baselines: MVSAnywhere (learning-based), 2DGS, and PGSR (optimization-based).
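
For readers unfamiliar with the two metrics, the following brute-force sketch shows their usual definitions on small point sets: Chamfer distance as the mean of accuracy (prediction to ground truth) and completeness (ground truth to prediction), and F1 as the harmonic mean of precision and recall at a distance threshold. It omits everything the official DTU and TnT evaluation protocols add on top (masking, downsampling, alignment, outlier handling), and the function names are illustrative.

```python
import math


def nn_dists(src, dst):
    """Distance from each point in `src` to its nearest neighbor in `dst`."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer(pred, gt):
    """Mean of accuracy (pred -> gt) and completeness (gt -> pred)."""
    accuracy = sum(nn_dists(pred, gt)) / len(pred)
    completeness = sum(nn_dists(gt, pred)) / len(gt)
    return 0.5 * (accuracy + completeness)


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold `tau`."""
    precision = sum(d <= tau for d in nn_dists(pred, gt)) / len(pred)
    recall = sum(d <= tau for d in nn_dists(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1)]
cd = chamfer(pred, gt)
f1 = f_score(pred, gt, tau=0.05)
print(cd, f1)
```

Lower is better for Chamfer distance and higher is better for F1, which is why refinement gains are reported as a decrease on DTU and an increase on TnT.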

Quantitative Gains: PAGaS consistently improves global metrics for most baselines, with especially strong gains for compact scenes (DTU) where finer voxel sizes are feasible and high-frequency detail is preserved.
Qualitative Enhancement: Normal maps and 3D meshes demonstrate pronounced fine-grained surface detail after PAGaS refinement—details missed by both geometry-driven and learning-based methods.
Resource Efficiency: PAGaS is capable of refining depths at full image resolution with modest computational requirements and no need for pretraining or Structure-from-Motion initialization.
Figure 4: Refinements by PAGaS; normals and meshes exhibit enhanced surface detail across small and large scenes.

Figure 5: ActorsHQ meshes and normals; PAGaS recovers micro-geometry at eyelashes, fabric, and sandal soles.

Discussion and Implications

PAGaS is distinguished by its ability to scale with image resolution and camera count—crucial for dense multi-view capture modalities (e.g., ActorsHQ with 160 views at 2990×4088 px). The radical reduction in optimization variables enables efficient, memory-conscious refinement of high-frequency geometry, addressing a persistent gap in the literature: the loss of pixel-scale surface structure in global or learned MVS reconstructions.
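
A back-of-the-envelope calculation gives a feel for why the parameter reduction matters at this resolution. The sketch assumes one Gaussian per pixel with every pixel valid, float32 storage, and counts only optimizable parameters (no gradients, optimizer state, or fixed attributes). The 59-parameter row is a hypothetical comparison point for a one-Gaussian-per-pixel setup with full 3DGS parameters, not a configuration reported in the paper.

```python
# One ActorsHQ view, as quoted above: 2990 x 4088 pixels.
pixels_per_view = 2990 * 4088
BYTES_PER_FLOAT32 = 4

# Optimizable parameters only, one Gaussian per pixel (assumption).
bytes_pagas = pixels_per_view * 1 * BYTES_PER_FLOAT32
bytes_full = pixels_per_view * 59 * BYTES_PER_FLOAT32  # hypothetical comparison

print(f"{pixels_per_view:,} Gaussians per view")
print(f"1 parameter each:   {bytes_pagas / 1e6:.1f} MB")
print(f"59 parameters each: {bytes_full / 1e9:.2f} GB")
```

Under these assumptions the optimizable state for one view is roughly 49 MB with a single depth value per Gaussian, versus close to 2.9 GB if each per-pixel Gaussian carried all 59 parameters.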

The Occlusion-Aware Rasterizer is a significant structural innovation, allowing pixel-level ambiguity filtering without increasing the number of Gaussians or their complexity. Ablation studies confirm that restricting optimization strictly to depth, setting the scale analytically for pixel coverage, and fixing both opacity and color are necessary for convergence and detail fidelity. Introducing additional parameters (e.g., view-dependent color) yields marginal accuracy gains but substantially increases resource demands.

Potential limitations include sensitivity to exposure and illumination that are inconsistent across views, and an inability to handle semi-transparency or view-dependent appearance. Addressing these issues would require relaxing pixel alignment and introducing additional appearance modeling, which would inherently raise computational costs.

Future Directions

PAGaS serves as a modular refinement post-process within existing 3D reconstruction pipelines, and its methodology could be extended toward:

Hybrid Data-Driven Splatting: Integrating learned priors for appearance or transparency modeling, while preserving constrained geometry optimization.
Dynamic Scene Adaptation: Extending pixel-aligned depth refinement to the reconstruction of temporally varying scenes.
Semi-Transparent Surface Modeling: Introducing controlled opacity optimization for non-opaque surfaces, balancing geometric and photometric constraints.
Global Consistency Fusion: Developing pipelines that couple local refinement with global geometric regularization in large-scale outdoor environments.

Conclusion

Pixel-Aligned 1DoF Gaussian Splatting (PAGaS) establishes a new paradigm for post-processing depth refinement. By analytically fixing every parameter except depth and aligning each Gaussian strictly with its pixel, PAGaS efficiently recovers high-frequency surface details missed by global and learned MVS methods, while maintaining scalability and practicality. The Occlusion-Aware Rasterizer further enables precise alpha-blending, eliminating out-of-surface artifacts. The principles behind PAGaS may inform future work in scalable, high-fidelity 3D reconstruction, especially in high-resolution, densely captured setups.