AR3D-R1: Advances in 3D Imaging

Updated 12 December 2025

AR3D-R1 is a framework that unifies unsupervised ring-artifact reduction in CBCT, RED-based array SAR imaging, and neural relightable 3D reconstruction.
It leverages principled inverse-problem formulations and data-driven regularizers to achieve high reconstruction fidelity, as measured by metrics such as PSNR and SSIM, across these domains.
The practical applications span medical imaging, remote sensing, and computer graphics, demonstrating robust performance under sparse and noisy conditions.

AR3D-R1 refers to state-of-the-art approaches in three distinct domains: (1) unsupervised ring-artifact reduction in 3D X-ray CBCT, (2) array SAR 3D sparse imaging based on Regularization by Denoising (RED), and (3) neural relightable 3D appearance reconstruction. Each instantiation of AR3D-R1 addresses critical challenges in high-dimensional imaging via principled inverse problem formulations and data-driven regularization methodologies. The following sections synthesize the technical foundations, algorithms, performance characteristics, and limitations across these recent works (Wu et al., 8 Dec 2024, Wang et al., 9 May 2024, Feng et al., 16 Nov 2024).

1. Multi-Parameter Inverse Problem in 3D X-ray CBCT

AR3D-R1, also termed "Riner", reframes ring artifact reduction in 3D cone-beam CT as a multi-parameter inverse problem centered on a physical model of non-ideal detector responses. Measurements are described by the Lambert–Beer law,

$I(\theta,s) = \alpha_s I_0 \exp\left(-\int_{L(\theta,s)} \mu(x)\,dx\right),$

where $\mu(x)$ is the clean attenuation field and $\alpha_s$ the per-detector response. The forward model incorporates both valid and defective detectors using a binary mask $m(s)$, yielding a sinogram entry,

$\rho(\theta,s) = \left[-\ln(\max(\alpha_s,0)+\epsilon_{\text{const}}) + \sum_{x\in L(\theta,s)} \mu(x)\,\Delta x \right] \cdot m(s).$
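The step from the Lambert–Beer law to the sinogram entry can be made explicit. Taking the negative logarithm of the normalized intensity and discretizing the line integral gives, for a valid detector ($m(s) = 1$) and ignoring the numerical safeguards $\max(\cdot, 0)$ and $\epsilon_{\text{const}}$ (which are read here as protection against a non-positive argument of the logarithm; that reading is an assumption),

$\rho(\theta,s) = -\ln\frac{I(\theta,s)}{I_0} = -\ln\alpha_s + \int_{L(\theta,s)} \mu(x)\,dx \approx -\ln\alpha_s + \sum_{x\in L(\theta,s)} \mu(x)\,\Delta x.$

The detector response therefore enters as an additive offset $-\ln\alpha_s$ that does not depend on $\theta$. An uncorrected offset of this kind appears as a stripe along the angular axis of the sinogram and as a ring in the reconstructed volume.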

The inverse objective jointly estimates the implicit neural field $\mu(x)$ (an MLP with parameters $\Phi$ and an Instant-NGP hash-grid encoding) and the detector responses $\alpha = [\alpha_1, \dots, \alpha_S]$ by minimizing

$L(\Phi,\alpha) = \sum_{\theta\in\Theta} \sum_{s\in S_{\text{sub}}} \left\|\rho(\theta, s) - \hat\rho(\theta, s;\Phi, \alpha) \right\|_1,$

without explicit regularization, relying instead on the spectral bias of neural fields. Mini-batch ray-based optimization scales linearly with the number of rays and samples, facilitating memory-efficient joint inference over large 3D volumes with no external training data (Wu et al., 8 Dec 2024).
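The forward model and the $\ell_1$ data term above can be sketched for a mini-batch of rays as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `predict_sinogram` and `l1_loss`, the array layout, and the default `eps` are assumptions, and the attenuation samples are passed in as a plain array rather than queried from a neural field.

```python
import numpy as np

def predict_sinogram(mu_samples, delta_x, alpha, mask, eps=1e-6):
    """Discretized forward model for a batch of R rays (hypothetical helper).

    mu_samples: (R, K) attenuation values sampled at K points along each ray
    delta_x:    spacing between consecutive samples along a ray
    alpha:      (R,) response of the detector element hit by each ray
    mask:       (R,) 1.0 for valid detectors, 0.0 for defective ones
    eps:        small constant keeping the logarithm finite (assumed value)
    """
    line_integral = mu_samples.sum(axis=1) * delta_x
    response_term = -np.log(np.maximum(alpha, 0.0) + eps)
    return (response_term + line_integral) * mask

def l1_loss(rho_measured, rho_predicted):
    """Sum of absolute residuals over the rays in the mini-batch."""
    return np.abs(rho_measured - rho_predicted).sum()
```

In a full pipeline, `mu_samples` would come from evaluating the neural field at the sample points, and both the field parameters and `alpha` would be updated by a gradient-based optimizer in an automatic-differentiation framework. The cost of one evaluation is proportional to R times K, which is the linear scaling in rays and samples mentioned above.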

2. Regularization by Denoising in 3D Array SAR Imaging

AR3D-R1 also designates an array SAR 3D sparse imaging framework leveraging RED, which substitutes traditional handcrafted priors with explicit state-of-the-art denoising operators. The SAR forward model is

$y = A x + n,$

where $A$ is the measurement operator encapsulating spatial phase delays. The RED cost function is

$J(x) = \frac{1}{2} \|Ax - y\|_2^2 + \frac{\lambda}{2} x^T(x - D(x)),$

with $D(\cdot)$ a denoiser such as NLM, BM3D, DnCNN, or IRCNN. Two operator-splitting solvers are employed:

RED-ADMM (RADMM): Alternately updates $x$ via linear solves and $v$ via denoising-based fixed-point iterations, with dual variable updates for convergence.
RED-GAP (RGAP): Applies explicit data-consistency projections and view-pooling.
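The RADMM updates described above can be sketched on a small real-valued toy problem. The code below follows the generic RED-ADMM scheme (linear solve for $x$, fixed-point denoising iterations for $v$, scaled dual update) and is not taken from the cited work: the penalty parameter `beta`, the iteration counts, and the 3-tap moving-average `smooth_denoiser` are assumptions chosen for illustration, standing in for denoisers such as BM3D or DnCNN. Real array SAR data are complex-valued, which would require conjugate transposes in place of `A.T`.

```python
import numpy as np

def smooth_denoiser(x):
    """Toy denoiser (assumption): symmetric 3-tap moving average, circular boundary."""
    return 0.25 * np.roll(x, 1) + 0.5 * x + 0.25 * np.roll(x, -1)

def red_cost(A, y, x, lam, denoise=smooth_denoiser):
    """RED objective J(x) = 0.5*||Ax - y||^2 + 0.5*lam*x^T(x - D(x))."""
    return 0.5 * np.sum((A @ x - y) ** 2) + 0.5 * lam * x @ (x - denoise(x))

def red_admm(A, y, lam=0.5, beta=1.0, denoise=smooth_denoiser,
             n_outer=100, n_inner=5):
    """Generic RED-ADMM sketch with splitting x = v and scaled dual variable u."""
    n = A.shape[1]
    x, v, u = np.zeros(n), np.zeros(n), np.zeros(n)
    lhs = A.T @ A + beta * np.eye(n)   # system matrix of the x-update
    Aty = A.T @ y
    for _ in range(n_outer):
        # x-update: minimize 0.5*||Ax - y||^2 + 0.5*beta*||x - v + u||^2
        x = np.linalg.solve(lhs, Aty + beta * (v - u))
        # v-update: fixed-point iterations on lam*(v - D(v)) + beta*(v - x - u) = 0
        for _ in range(n_inner):
            v = (lam * denoise(v) + beta * (x + u)) / (lam + beta)
        # dual update enforcing the constraint x = v
        u = u + x - v
    return x
```

A typical call would pass an undersampled measurement matrix, for example `A` with half as many rows as columns, and compare `red_cost` at the returned estimate with its value at the zero vector. For the linear, symmetric denoiser used here the objective is a convex quadratic, so the result can also be checked against a direct linear solve.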

Under conditions where $D$ is cyclically-nonexpansive, theoretical guarantees ensure convexity and convergence. Experimental benchmarks demonstrate superior quantitative fidelity (e.g., $48.2$ dB PSNR, $0.976$ SSIM at $50\%$ sampling rate) and robustness to severe undersampling and noise, outperforming non-learning and plug-and-play baselines (Wang et al., 9 May 2024).
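For orientation, the gradient that drives RED solvers can be written out. Under the assumptions commonly made in the general RED literature (local homogeneity of $D$ and a symmetric Jacobian; these conditions are stated here as background and are not taken from the cited work), the gradient of the regularizer $\frac{\lambda}{2} x^T(x - D(x))$ reduces to the denoising residual, so that

$\nabla J(x) = A^T(Ax - y) + \lambda\,(x - D(x)).$

A stationary point therefore balances the back-projected data residual $A^T(Ax - y)$ against the scaled denoising residual $\lambda\,(x - D(x))$, and no derivative of the denoiser itself is needed. For complex-valued SAR data, $A^T$ would be replaced by the conjugate transpose.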

3. Neural Relightable 3D Appearance Reconstruction

In the context of sparse-view 3D appearance reconstruction, AR3D-R1 architectures enable explicit decoupling of geometry and appearance to solve for relightable, physically-based rendering (PBR) maps over UV space. The pipeline, named ARM, comprises three modules:

GeoRM: Transformer-triplane feature extraction and MLP density decoding for geometry, followed by differentiable Marching Cubes mesh extraction.
GlossyRM: Predicts per-vertex roughness and metalness on fixed meshes.
InstantAlbedo: Fuses six back-projected measurement UV maps via U-Net and FFC (Fast Fourier Convolution) modules, outputting both baked-lighting color and diffuse albedo.

Disentanglement of illumination vs. material properties is achieved by integrating a material-aware encoder (DINO ViT, pretrained on segmentation datasets), which is back-projected into UV space alongside raw colors to inform the network. Optimization exploits multi-scale semantic cues to suppress baked-in highlights and enhance robustness under sparse observations (Feng et al., 16 Nov 2024).

4. Experimental Evaluation and Key Performance Metrics

Rigorous empirical comparisons substantiate the efficacy of AR3D-R1 methodologies:

In ring-artifact reduction, AR3D-R1 achieves $38.93$ dB PSNR and $0.965$ SSIM on DeepLesion test slices, surpassing both supervised and unsupervised SOTA baselines (Wu et al., 8 Dec 2024).
For array SAR imaging, RED-based approaches yield up to $+4$ dB PSNR improvement over matched-filter or convex-prior baselines, with stable artifact suppression at extreme undersampling (sampling rate SR $= 15\%$) and low SNR (Wang et al., 9 May 2024).
For relightable 3D reconstruction, ARM achieves $0.968$ F-Score, $0.049$ Chamfer Distance, $21.69$ dB PSNR, and $0.880$ SSIM, outperforming MeshFormer and others. Relit images maintain $21.750$ dB PSNR-A and $0.171$ LPIPS-A (Feng et al., 16 Nov 2024).

A table summarizing core metrics across domains:

Domain	Key Metrics	AR3D-R1 Performance
3D X-ray CBCT (ring-artifact reduction)	PSNR [dB], SSIM	38.93, 0.965
Array SAR 3D Imaging (50% sampling rate)	PSNR [dB], SSIM	48.2, 0.976
Relightable 3D Reconstruction	F-Score, PSNR [dB], SSIM	0.968, 21.69, 0.880
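Since PSNR is the one metric shared by all three domains, a short reference computation may help in reading the figures above. The sketch below uses the standard definition $\text{PSNR} = 10\log_{10}(\text{MAX}^2/\text{MSE})$; the function name `psnr` and the default `data_range` of 1.0 are assumptions, and the cited works may normalize their data differently.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for arrays of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Because the scale is logarithmic, a gain of 4 dB corresponds to a reduction of the mean squared error by a factor of $10^{0.4} \approx 2.5$ at a fixed data range. PSNR values are only comparable when they are computed over the same data range and test set, so the rows of the table should not be ranked against each other.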

5. Scalability, Generalization, and Algorithmic Limitations

AR3D-R1 frameworks are designed with scalability and generalization in mind:

CBCT RAR generalizes across both fan-beam and cone-beam geometries and diverse detector types without paired training data, leveraging the spectral bias of neural implicit fields to regularize ill-posedness (Wu et al., 8 Dec 2024).
Array SAR RED imaging is robust to high-dimensional data, few observations, and noise due to adaptive denoising priors and operator-splitting solvers with provable convergence (Wang et al., 9 May 2024).
ARM-based 3D appearance reconstruction isolates geometry and appearance learning, but faces potential inconsistencies in upstream multi-view synthesis and discrete atlas unwrapping artifacts (Feng et al., 16 Nov 2024).

Remaining challenges include per-case optimization overhead (e.g., $\sim15$ min/volume for CBCT), which faster representations such as K-planes or splatting might reduce; the lack of explicit regularizers for detector responses; and the reliability of material segmentation.

6. Future Directions and Research Opportunities

Emergent AR3D-R1 methods prompt several research avenues:

For ring artifact reduction: integrating explicit regularizers on detector responses, jointly modeling measurement noise, and extending inverse solvers to time-varying or spectral CT.
For array SAR imaging: designing adaptive denoiser selection strategies, exploring deeper CNN models, and optimizing penalty parameters for convergence speed vs. reconstruction fidelity.
For relightable 3D reconstruction: joint refinement of unwrapping and texture inference, learnable view aggregation for conflict resolution, and incorporation of real multi-illumination datasets for enhanced priors.

A plausible implication is that the multi-parameter inverse problem paradigm, when integrated with neural representations and explicit denoising-based regularizers, can generalize to other volumetric imaging and inverse rendering tasks in scientific and industrial domains.