
LRDUN: Low-Rank Deep Unfolding for SCI

Updated 30 November 2025
  • LRDUN is a deep unfolding network that incorporates low-rank decomposition to reconstruct high-dimensional hyperspectral images from compressed measurements.
  • It alternates between optimizing compact spectral and spatial representations using proximal gradient descent, significantly reducing computational burden.
  • The framework leverages a Generalized Feature Unfolding Mechanism to decouple the physical rank from feature dimensionality, enhancing reconstruction accuracy.

A Low-Rank Deep Unfolding Network (LRDUN) is an efficient architecture for spectral compressive imaging (SCI) reconstruction that integrates low-rank decomposition within the deep unfolding paradigm. In contrast to conventional deep unfolding networks (DUNs) that operate iteratively on the full 3D hyperspectral image (HSI) cube, LRDUN alternately optimizes compact low-dimensional spectral and spatial representations under a proximal gradient descent (PGD) scheme. The network leverages a Generalized Feature Unfolding Mechanism (GFUM) to decouple the physical rank of the forward model from the learned prior’s feature dimensionality, enabling state-of-the-art SCI reconstruction quality with reduced computational and memory burden (Huang et al., 23 Nov 2025).

1. Background and Motivation

Spectral compressive imaging (SCI) seeks to reconstruct high-dimensional HSI data cubes $X \in \mathbb{R}^{H \times W \times B}$ from highly compressed 2D coded measurements $Y \in \mathbb{R}^{H \times W'}$. Deep unfolding networks (DUNs) have become the principal framework for this inverse problem due to their ability to combine model-based iterations with learnable priors. However, existing DUNs are typically derived from full-cube imaging models and propagate the 3D HSI at every stage. This direct approach suffers from:

  • Ill-posedness: Every iteration must infer $H \cdot W \cdot B$ unknowns from only $H \cdot W'$ measurements, a severely under-determined problem.
  • Computational redundancy: Full-cube processing in each stage, often implemented via 3D convolutions or attention, incurs significant floating-point operations (FLOPs) and memory usage.

LRDUN addresses these issues by replacing the full-cube update with explicit low-rank modeling, drastically reducing the problem dimensionality and improving both tractability and accuracy (Huang et al., 23 Nov 2025).

2. Low-Rank Decomposition in SCI

LRDUN introduces a low-rank forward model:

$X = A \times_3 E$

where $E \in \mathbb{R}^{B \times k}$ collects $k \ll B$ spectral basis vectors, and $A \in \mathbb{R}^{HW \times k}$ contains the corresponding $k$ subspace images. The recovery task is split into:

  • Spectral basis estimation ($E$-problem): Optimize $E$ via PGD updates.
  • Subspace image estimation ($A$-problem): Optimize $A$ via PGD updates, alternating with $E$.

The measurement operators for each subproblem are:

  • For the $E$-problem: $\Phi_{A^i} = \Phi \cdot (I_B \otimes A^i) \in \mathbb{R}^{HW' \times Bk}$
  • For the $A$-problem: $\Phi_{E^{i+1}} = \Phi \cdot (E^{i+1} \otimes I_{HW}) \in \mathbb{R}^{HW' \times kHW}$

This decomposition cuts the number of unknowns from $H \cdot W \cdot B$ to $k \cdot (HW + B)$, significantly mitigating the ill-posedness of SCI (Huang et al., 23 Nov 2025).
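
The sketch below builds the decomposition and the two low-rank measurement operators on toy dimensions with NumPy. It is an illustrative assumption: the dense random sensing matrix Phi and all variable names stand in for the actual CASSI operator and the paper's notation.

import numpy as np

# Toy dimensions (illustrative only; real SCI cubes are far larger).
H, W, B, k = 4, 5, 8, 3
HW, M = H * W, 12                        # M stands in for the HW' measurements

rng = np.random.default_rng(0)
A = rng.standard_normal((HW, k))         # k subspace images, flattened to HW x k
E = rng.standard_normal((B, k))          # k spectral basis vectors, B x k

# Low-rank model X = A x_3 E: every pixel spectrum is a combination of the k basis vectors.
X_mat = A @ E.T                          # HW x B unfolding of the HSI cube
X = X_mat.reshape(H, W, B)               # full cube

# Generic sensing matrix acting on the column-major vectorization of X_mat.
Phi = rng.standard_normal((M, HW * B))
y = Phi @ X_mat.flatten(order="F")

# Equivalent low-rank measurement operators for the two subproblems.
Phi_E = Phi @ np.kron(E, np.eye(HW))     # acts on vec(A),    M x (k*HW)
Phi_A = Phi @ np.kron(np.eye(B), A)      # acts on vec(E^T),  M x (B*k)
assert np.allclose(Phi_E @ A.flatten(order="F"), y)
assert np.allclose(Phi_A @ E.T.flatten(order="F"), y)

print("unknowns:", HW * B, "->", k * (HW + B))   # 160 -> 84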

3. Unfolded Proximal Gradient Descent Workflow

The optimization is unfolded into a trainable network structure, alternating $E$- and $A$-updates at each stage:

  • $E$-update:

$e^{i+1} = \operatorname{prox}_{\lambda_e, \rho_e}\!\left(e^i - \rho_e\, \Phi_{A^i}^\top (\Phi_{A^i} e^i - y)\right)$

  • $A$-update:

$a^{i+1} = \operatorname{prox}_{\lambda_a, \rho_a}\!\left(a^i - \rho_a\, \Phi_{E^{i+1}}^\top (\Phi_{E^{i+1}} a^i - y)\right)$

Each proximal mapping is implemented as a learnable “ProxyNet” module. In the vanilla setting, these operate on $k$-dimensional feature spaces ($k \sim 10$–$20$), which severely limits the expressive power of the learned priors in modeling non-trivial spatial–spectral dependencies required for top-tier reconstruction (Huang et al., 23 Nov 2025).
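
A minimal sketch of one unfolded stage in this vanilla (pre-GFUM) setting, reusing the toy operators from the previous snippet; the learned ProxyNets are replaced here by simple soft-thresholding, and the step sizes and threshold are arbitrary illustrative values.

import numpy as np

def soft_threshold(x, tau):
    # Stand-in for a learned ProxyNet: elementwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def lrdun_stage(e, a, y, Phi, B, HW, k, rho_e=1e-3, rho_a=1e-3, tau=1e-4):
    # One PGD stage: E-update with A fixed, then A-update with the updated E.
    A = a.reshape(HW, k, order="F")
    Phi_A = Phi @ np.kron(np.eye(B), A)
    e = soft_threshold(e - rho_e * Phi_A.T @ (Phi_A @ e - y), tau)

    E = e.reshape(k, B, order="F").T             # e stores vec(E^T)
    Phi_E = Phi @ np.kron(E, np.eye(HW))
    a = soft_threshold(a - rho_a * Phi_E.T @ (Phi_E @ a - y), tau)
    return e, a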

4. Generalized Feature Unfolding Mechanism (GFUM)

GFUM is central to LRDUN’s efficiency and accuracy. It decouples:

  • Physical rank $k$: Controls the data fidelity, tied to the physical subspace constraint.
  • Feature dimension $C$: Sets the learnable feature capacity in the prior modules, with $C \geq k$.

At each stage, the feature tensors are split:

  • $E_{\text{feat}}^{i} = [\,E_{\text{phys}}^{i}\;\; E_{\text{aux}}^{i}\,] \in \mathbb{R}^{B \times C}$
  • The first $k$ channels are physical ($E_{\text{phys}}^{i}$); the remaining $C - k$ are auxiliary ($E_{\text{aux}}^{i}$).

PGD updates are applied solely to physical channels, while auxiliary channels are propagated unchanged. The updated feature is concatenated and passed through a ProxyNet:

  • $E_{\text{feat}}^{i+1} = \text{ProxyNet}_e([E_{\text{phys}}^{i+1/2};\, E_{\text{aux}}^{i+1/2}])$

This structure enables:

  • Rich priors: ProxyNets operate on a $C$-dimensional feature space, hence can capture complex relations when $C \gg k$.
  • Flexible rank configuration: $k$ can be selected for optimal physical modeling without restricting prior capacity.

Empirical results show that increasing $C$ (for fixed $k$) boosts PSNR up to a plateau, while $C = k$ gives poorer reconstructions. For instance, with $k = 11$, raising $C$ from 11 to 16 improves PSNR by $> 1$ dB (e.g., from $\approx 38.2$ dB to $\approx 39.4$ dB), while the computation increases only moderately (Huang et al., 23 Nov 2025).

5. Architectural Implementation and Hyperparameter Choices

The typical LRDUN block for each subproblem involves:

  1. Reshape/slice: Separate the feature tensor (e.g., $B \times C$ for $E$) into physical channels ($0 \ldots k-1$) and auxiliary channels ($k \ldots C-1$).
  2. Data-fidelity block: Apply the physical imaging operator (e.g., $\Phi_{A^i}$) only to the physical channels via matrix multiplications or small convolutions.
  3. Concatenate: Merge updated physical and untouched auxiliary channels.
  4. ProxyNet: Employ a 1D CNN for $E$, or a U-Net with spatial convolutional attention blocks (SCAB) for $A$.

Pseudocode for one $E$-problem stage:

import numpy as np

# One GFUM E-stage: gradient step on the k physical channels, identity on the auxiliary ones.
E_phys = E_feat[:, :k]                          # B x k physical channels
E_aux  = E_feat[:, k:]                          # B x (C-k) auxiliary channels
residual = Phi_Ai @ E_phys.T.flatten(order="F") - y        # Phi_Ai acts on vec(E_phys^T)
grad = (Phi_Ai.T @ residual).reshape(k, -1, order="F").T   # gradient, reshaped back to B x k
E_phys_half = E_phys - rho_e * grad             # PGD gradient step (physical channels only)
E_aux_half  = E_aux                             # auxiliary channels pass through unchanged
E_feat_half = np.concatenate([E_phys_half, E_aux_half], axis=1)  # B x C
E_feat_next = ProxyNet_E(E_feat_half)           # learned proximal mapping over all C channels

The $A$-problem stage is analogous, employing a spatial U-Net as its ProxyNet (Huang et al., 23 Nov 2025).
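
For concreteness, one plausible realization of $\text{ProxyNet}_e$ is sketched below as a small residual 1D CNN over the spectral dimension in PyTorch; the layer configuration, hidden width, and class name are assumptions for illustration, not the paper's exact architecture.

import torch
import torch.nn as nn

class ProxyNetE(nn.Module):
    # Hypothetical learned proximal operator for the E-problem: a residual 1D CNN
    # that mixes the C feature channels along the spectral (B) dimension.
    def __init__(self, C, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(C, hidden, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(hidden, C, kernel_size=3, padding=1),
        )

    def forward(self, e_feat):
        # e_feat: (batch, C, B) -- transpose the B x C feature to channels-first before calling.
        return e_feat + self.body(e_feat)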

Hyperparameter studies indicate:

  • The optimal $k$ matches the true spectral rank of the data (e.g., $k = 11$ for KAIST/CAVE).
  • $C$ should be set so that $C - k$ is at least 30–50% of $C$, exposing a substantial auxiliary pathway.

If $k \rightarrow C$, the improvement from auxiliary channels vanishes; if $k$ is too small, the physics is underfit.
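
A small helper, sketched below, encodes one heuristic reading of this guideline (the exact rule and the function are illustrative assumptions, not prescribed by the paper):

def choose_feature_dim(k, aux_fraction=0.4):
    # Choose C so that (C - k) / C is roughly aux_fraction, with at least one auxiliary channel.
    C = int(round(k / (1.0 - aux_fraction)))
    return max(C, k + 1)

# e.g. choose_feature_dim(11) -> 18; the paper's C = 16 at k = 11 gives (16 - 11) / 16 ≈ 31%.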

6. Empirical Performance and Feature Analysis

LRDUN achieves state-of-the-art SCI reconstruction performance:

  • Experiments on simulated and real datasets show high PSNR and SSIM with notably reduced computational cost relative to full-cube DUNs.
  • Empirical ablation confirms that adding GFUM (increasing $C > k$) produces a $> 1$ dB PSNR gain on a 3-stage LRDUN at only a modest FLOP increase.
  • Visualization of feature channels reveals that physical channels capture dominant structures and material signatures, whereas auxiliary channels encode finer details, background texture, or mask-dependent context.

GFUM’s role as an information carrier is further evidenced by preservation of noise residuals, mask cues, and adaptive step-size representations in the auxiliary features (Huang et al., 23 Nov 2025).

7. Extensions, Design Considerations, and Adaptations

Best practices for adopting LRDUN and GFUM include:

  • For other inverse problems (e.g., video SCI, dynamic MRI), identify an analogous low-dimensional physical subspace and then augment with auxiliary feature dimensions for the prior.
  • Ensure $C - k$ is at least 30–50% of $C$ for effective use of the auxiliary pathway.
  • The GFUM mechanism is orthogonal to the ProxyNet architecture; one could substitute U-Nets with transformers and retain the benefits of decoupled feature capacity versus physics rank.

A plausible implication is that LRDUN’s framework generalizes beyond SCI, providing a template for efficient, interpretable, and high-capacity deep unfolding in a broad range of high-dimensional inverse problems (Huang et al., 23 Nov 2025).
