DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

Published 23 Apr 2026 in eess.IV and cs.CV | (2604.21518v1)

Abstract: Neural representations (NRs), such as neural fields and 3D Gaussians, effectively model volumetric data in computed tomography (CT) but suffer from severe artifacts under sparse-view settings. To address this, we propose DiffNR, a novel framework that enhances NR optimization with diffusion priors. At its core is SliceFixer, a single-step diffusion model designed to correct artifacts in degraded slices. We integrate specialized conditioning layers into the network and develop tailored data curation strategies to support model finetuning. During reconstruction, SliceFixer periodically generates pseudo-reference volumes, providing auxiliary 3D perceptual supervision to fix underconstrained regions. Compared to prior methods that embed CT solvers into time-consuming iterative denoising, our repair-and-augment strategy avoids frequent diffusion model queries, leading to better runtime performance. Extensive experiments show that DiffNR improves PSNR by 3.99 dB on average, generalizes well across domains, and maintains efficient optimization.


Authors (5)

Summary

The paper introduces a repair-and-augment paradigm that combines neural representations with a single-step conditioned diffusion model (SliceFixer) for effective artifact correction in sparse-view CT.
It reports an average PSNR gain of 3.99 dB together with efficient runtime, outperforming traditional iterative reconstruction methods on benchmarks such as ToothFairy and LUNA16.
The approach enhances volumetric consistency and downstream segmentation accuracy, paving the way for safer, low-dose CT imaging and hybrid inverse imaging solutions.

Introduction

DiffNR addresses an acute challenge in sparse-view computed tomography (CT): accurate volumetric reconstruction from limited X-ray projections without sacrificing anatomical fidelity or efficiency. Traditional iterative reconstruction methods and neural representations (NRs) are both highly susceptible to artifacts under severe data sparsity. Earlier diffusion-based methods have advanced artifact suppression, yet they suffer from computational inefficiency and limited volumetric coherence. DiffNR proposes a repair-and-augment paradigm that uses a single-step conditioned diffusion model (SliceFixer) for artifact correction, improving the optimization and generalization of NRs in sparse-view 3D CT.

Figure 1: DiffNR schematic, demonstrating (a) the cone-beam CT geometry, (b) the pipeline overview, and (c) comparison with baseline NR methods.

Methodology

SliceFixer: Conditioned Single-Step Diffusion for Slice Repair

SliceFixer is a single-step diffusion model built on SD-Turbo and adapted via LoRA and zero-convolution layers to correct artifacts in NR-reconstructed CT slices. Conditioning integrates biplanar X-ray projections and text prompts through RAD-DINO encoders for enhanced global structural guidance. The training objective combines $L_2$, LPIPS, CLIP, GAN, and SSIM losses, with the SSIM term explicitly targeting perceptual quality and structural coherence.
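
One way to read this objective is as a weighted sum of the five listed terms. The weights $\lambda_i$ below are placeholders introduced here for illustration; the summary does not state their values or whether every term is active at every training step.

$$
\mathcal{L}_{\text{SliceFixer}} = \lambda_{2}\, L_2 + \lambda_{\text{LPIPS}}\, L_{\text{LPIPS}} + \lambda_{\text{CLIP}}\, L_{\text{CLIP}} + \lambda_{\text{GAN}}\, L_{\text{GAN}} + \lambda_{\text{SSIM}}\, L_{\text{SSIM}}, \qquad L_{\text{SSIM}} = 1 - \mathrm{SSIM}(\hat{x}, x)
$$

Here $\hat{x}$ denotes the repaired slice and $x$ the clean target slice (both symbols are assumptions of this sketch). Writing the SSIM term as $1 - \mathrm{SSIM}$ is a common convention, not something the summary confirms.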

Figure 2: SliceFixer architecture showing the integration of CT slice latents, projection encodings, and text-based conditioning.

Data Curation and Diversity

SliceFixer finetuning leverages synthetic artifacted and clean slice pairs generated by underfitting various NR backbones (neural fields, 3D Gaussians) with randomized sparse-view distributions. This strategy enforces robustness to diverse artifact patterns and prevents overfitting to specific NR-induced degradations.
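
As a rough illustration of what "randomized sparse-view distributions" could look like in practice, the sketch below samples curation jobs, each pairing an NR backbone with a random view count, a random angular offset, and a short training budget so that the fit stays underconverged. All names, ranges, and the backbone labels (`sample_view_angles`, `sample_curation_jobs`, 10 to 60 views, 500 to 3000 steps) are hypothetical and not taken from the paper.

```python
import random


def sample_view_angles(rng, min_views=10, max_views=60, full_circle_deg=360.0):
    """Draw one sparse-view configuration: evenly spaced angles with a random offset."""
    n_views = rng.randint(min_views, max_views)  # inclusive on both ends
    step = full_circle_deg / n_views
    offset = rng.uniform(0.0, step)  # random start so samples do not share angles
    return [(offset + i * step) % full_circle_deg for i in range(n_views)]


def sample_curation_jobs(n_jobs, backbones=("neural_field", "gaussians"), seed=0):
    """Return job descriptions; each job would yield one artifacted volume to slice."""
    rng = random.Random(seed)  # seeded for reproducible curation runs
    jobs = []
    for _ in range(n_jobs):
        jobs.append({
            "backbone": rng.choice(backbones),      # hypothetical backbone label
            "angles_deg": sample_view_angles(rng),  # randomized sparse-view geometry
            "train_steps": rng.randint(500, 3000),  # short budget, i.e. an underfit NR
        })
    return jobs


if __name__ == "__main__":
    for job in sample_curation_jobs(3):
        print(job["backbone"], len(job["angles_deg"]), job["train_steps"])
```

Each job would then be run through the chosen NR backbone, and the resulting degraded slices paired with the corresponding clean slices; that step depends on the reconstruction code and is left out here.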

DiffNR Optimization Pipeline

DiffNR interleaves conventional NR optimization (with $L_1$, SSIM, and TV losses) with periodic augmentation by SliceFixer, which produces pseudo-reference volumes for regularization. To mitigate slice jitter and diffusion hallucination, the pipeline applies 3D SSIM supervision (across axial, sagittal, and coronal planes) between the current volume and the SliceFixer-augmented reference, substantially enhancing volumetric consistency.
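
The tri-planar supervision can be sketched as an SSIM loss averaged over slices taken along each of the three volume axes. This is a simplified illustration, not the paper's implementation: it uses whole-slice statistics instead of the usual sliding Gaussian window, stores volumes as nested lists indexed `vol[z][y][x]`, and the function names are hypothetical. Which axis corresponds to the axial, sagittal, or coronal plane depends on the volume orientation.

```python
def _ssim_global(a, b, data_range=1.0):
    """SSIM computed from whole-slice statistics; a and b are flat lists of equal length."""
    n = len(a)
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    num = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    den = (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den


def _slices(vol, axis):
    """Yield flattened 2D slices of vol[z][y][x] perpendicular to the given axis."""
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    if axis == 0:
        for z in range(nz):
            yield [vol[z][y][x] for y in range(ny) for x in range(nx)]
    elif axis == 1:
        for y in range(ny):
            yield [vol[z][y][x] for z in range(nz) for x in range(nx)]
    else:
        for x in range(nx):
            yield [vol[z][y][x] for z in range(nz) for y in range(ny)]


def triplanar_ssim_loss(volume, reference, data_range=1.0):
    """Return 1 minus the mean of the three per-axis average slice SSIMs."""
    per_axis = []
    for axis in (0, 1, 2):
        scores = [
            _ssim_global(s, r, data_range)
            for s, r in zip(_slices(volume, axis), _slices(reference, axis))
        ]
        per_axis.append(sum(scores) / len(scores))
    return 1.0 - sum(per_axis) / len(per_axis)
```

Averaging per axis first keeps the three planes equally weighted even when the volume is not cubic. In an actual training loop the same quantity would be computed on tensors with a differentiable, windowed SSIM so that gradients reach the NR parameters.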

Figure 3: DiffNR pipeline showing two-stage optimization, with periodic pseudo-reference generation and SSIM-based perceptual regularization.

Experimental Results

Quantitative and Qualitative Comparisons

DiffNR achieves substantial gains across ToothFairy and LUNA16 benchmarks, improving NR PSNR scores by 2.19 dB (NAF backbone) and 5.79 dB (R$^2$-Gaussian backbone). On out-of-distribution (OOD) datasets, DiffNR maintains superior artifact suppression and generalization. Notably, DiffNR's runtime is an order of magnitude lower than prior diffusion-based iterative methods, owing to its efficient repair-and-augment design.
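
To put the dB figures in perspective, the short sketch below converts a PSNR gain into the corresponding reduction factor in mean squared error and checks that the simple mean of the two backbone gains is 3.99 dB. That matches the average quoted in the abstract, assuming the average is taken over these two backbones. The function names are illustrative, and the MSE interpretation is exact only for a single volume, not for PSNR values averaged over a dataset.

```python
import math


def psnr(mse, data_range=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(data_range ** 2 / mse)


def mse_ratio_from_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (gain_db / 10.0)


# Per-backbone PSNR gains as reported in the text above.
gains_db = {"NAF": 2.19, "R2-Gaussian": 5.79}
mean_gain = sum(gains_db.values()) / len(gains_db)

if __name__ == "__main__":
    print(f"mean gain: {mean_gain:.2f} dB")
    for name, gain in gains_db.items():
        print(f"{name}: MSE reduced by a factor of {mse_ratio_from_gain(gain):.2f}")
```

Under these definitions, a 2.19 dB gain corresponds to an MSE roughly 1.7 times smaller, and a 5.79 dB gain to an MSE roughly 3.8 times smaller.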

Figure 4: Qualitative reconstruction results showing superior detail recovery and artifact suppression from DiffNR on multi-view CT slices.

Figure 5: DiffNR’s qualitative performance on challenging OOD datasets, demonstrating robust artifact correction and anatomical fidelity.

Downstream Task Performance

DiffNR-enhanced volumes facilitate improved downstream medical segmentation, with lung Dice scores up to 93.74 (36-view). The integration of SSIM loss and biplanar projection conditioning in SliceFixer is empirically validated as critical for volumetric quality and segmentation stability.
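
For reference, the Dice score used in segmentation evaluation measures the overlap between a predicted and a ground-truth mask. The sketch below is a minimal version for binary masks given as flat sequences. The function name and the empty-mask convention are assumptions of this example, and the reported 93.74 is presumably on a 0 to 100 scale, which would correspond to 0.9374 here.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"Dice: {dice_score(predicted, reference):.4f}")
```

Because Dice depends only on voxel overlap, streak or blur artifacts that shift organ boundaries in the reconstruction translate directly into a lower score.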

Figure 6: Downstream lung segmentation, ablation analyses of SliceFixer, and comparison of standalone post-processing versus deep pipeline integration.

Ablation Analyses

Key architectural choices include SliceFixer resolution upsampling, SSIM loss inclusion, and biplanar conditioning, all of which yield positive PSNR/SSIM shifts. Integrating SliceFixer directly into NR optimization outperforms standalone post-processing, as shown in ablations. Augmentation via slice supervision, rather than projection supervision, is found to be more effective for CT volumes, given the cumulative nature of projection errors.

Implications and Future Directions

Practically, DiffNR enables safer CT imaging by reducing exposure while preserving diagnostically relevant structures. Theoretically, the repair-and-augment strategy offers a generalized pathway for integrating powerful diffusion priors with global neural representations across inverse problems. Its demonstrated generalization across anatomical and OOD datasets, combined with computational efficiency, sets a precedent for future hybrid reconstruction paradigms. Further developments should explore cross-modal conditioning, direct 3D diffusion frameworks, and adaptive data curation tailored to non-medical domains and novel scanning geometries.

Conclusion

DiffNR presents a principled, efficient, and generalizable approach for sparse-view CT reconstruction by bridging neural representations and conditional diffusion priors. Its architecture, data synthesis, and optimization strategies collectively suppress artifacts, enforce volumetric consistency, and accelerate inference without succumbing to hallucination or slice jitter. The repair-and-augment methodology introduced by DiffNR is likely to inspire subsequent research in hybrid optimization and inverse imaging, particularly for regimes where measurement constraints necessitate learned regularization and computational scalability.
