LRDUN: Low-Rank Deep Unfolding for SCI
- LRDUN is a deep unfolding network that incorporates low-rank decomposition to reconstruct high-dimensional hyperspectral images from compressed measurements.
- It alternates between optimizing compact spectral and spatial representations using proximal gradient descent, significantly reducing computational burden.
- The framework leverages a Generalized Feature Unfolding Mechanism to decouple the physical rank from feature dimensionality, enhancing reconstruction accuracy.
A Low-Rank Deep Unfolding Network (LRDUN) is an efficient architecture for spectral compressive imaging (SCI) reconstruction that integrates low-rank decomposition within the deep unfolding paradigm. In contrast to conventional deep unfolding networks (DUNs) that operate iteratively on the full 3D hyperspectral image (HSI) cube, LRDUN alternates explicit optimization of compact low-dimensional spectral and spatial representations under a proximal gradient descent (PGD) scheme. The network leverages a Generalized Feature Unfolding Mechanism (GFUM) to decouple the physical rank of the forward model from the learned prior’s feature dimensionality, enabling state-of-the-art SCI reconstruction quality with reduced computational and memory burden (Huang et al., 23 Nov 2025).
1. Background and Motivation
Spectral compressive imaging (SCI) seeks to reconstruct high-dimensional HSI data cubes $X \in \mathbb{R}^{H \times W \times B}$ from highly compressed 2D coded measurements $y$. Deep unfolding networks (DUNs) have become the principal framework for this inverse problem due to their ability to combine model-based iterations with learnable priors. However, existing DUNs are typically derived from full-cube imaging models and propagate the 3D HSI at every stage. This direct approach suffers from:
- Ill-posedness: Every iteration must infer all $HWB$ unknowns of the cube from a single 2D measurement, especially challenging given the severe under-determinacy inherent in SCI.
- Computational redundancy: Full-cube processing in each stage, often implemented via 3D convolutions or attention, incurs significant floating-point operations (FLOPs) and memory usage.
LRDUN addresses these issues by replacing the full-cube update with explicit low-rank modeling, drastically reducing the problem dimensionality and improving both tractability and accuracy (Huang et al., 23 Nov 2025).
2. Low-Rank Decomposition in SCI
LRDUN introduces a low-rank forward model,
$$y = \Phi(EA) + n, \qquad X = EA,$$
where $E \in \mathbb{R}^{B \times k}$ represents spectral basis vectors and $A \in \mathbb{R}^{k \times HW}$ contains the corresponding subspace images. The recovery task is split into:
- Spectral basis estimation ($E$-problem): Optimize $E$ via PGD updates.
- Subspace image estimation ($A$-problem): Optimize $A$ via PGD updates, alternating with the $E$-problem.
The measurement operators for each subproblem are:
- For the $E$-problem: $\Phi_A(E) := \Phi(EA)$, with $A$ held fixed.
- For the $A$-problem: $\Phi_E(A) := \Phi(EA)$, with $E$ held fixed.
This decomposition cuts the number of unknowns from $HWB$ to $k(HW + B)$, significantly mitigating the ill-posedness of SCI (Huang et al., 23 Nov 2025).
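To make the accounting concrete, the following NumPy sketch (illustrative sizes, not taken from the paper) builds a rank-$k$ cube $X = EA$ and compares the unknown counts of the two formulations:

```python
import numpy as np

# Illustrative sizes: B spectral bands, H×W spatial grid, physical rank k.
B, H, W, k = 28, 256, 256, 12

E = np.random.randn(B, k)        # spectral basis vectors
A = np.random.randn(k, H * W)    # subspace images, flattened spatially
X = E @ A                        # rank-k HSI cube, B × (H·W)

full_unknowns    = B * H * W         # full-cube formulation
lowrank_unknowns = k * (H * W + B)   # after the low-rank decomposition
print(f"{lowrank_unknowns / full_unknowns:.2f}x unknowns")  # ≈ 0.43x here
```

The ratio is roughly $k/B$ once $HW \gg B$, so the savings grow with the spectral dimensionality of the cube.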
3. Unfolded Proximal Gradient Descent Workflow
The optimization is unfolded into a trainable network structure, alternating $E$- and $A$-updates at each stage:
- $E$-update: $E^{(t+\frac{1}{2})} = E^{(t)} - \rho_e \Phi_A^{\top}\big(\Phi_A E^{(t)} - y\big)$, followed by $E^{(t+1)} = \mathrm{prox}_E\big(E^{(t+\frac{1}{2})}\big)$.
- $A$-update: $A^{(t+\frac{1}{2})} = A^{(t)} - \rho_a \Phi_E^{\top}\big(\Phi_E A^{(t)} - y\big)$, followed by $A^{(t+1)} = \mathrm{prox}_A\big(A^{(t+\frac{1}{2})}\big)$.
Each proximal mapping is implemented as a learnable “ProxyNet” module. In the vanilla setting, these operate on $k$-dimensional feature spaces, which severely limits the expressive power of the learned priors in modeling the non-trivial spatial-spectral dependencies required for top-tier reconstruction (Huang et al., 23 Nov 2025).
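The alternation can be sketched end-to-end in a few lines. The fragment below is framework-free and uses a generic sensing matrix M in place of the coded-aperture operator, with identity placeholders standing in for the learned ProxyNets; all names are illustrative:

```python
import numpy as np

def lrdun_alternation(y, M, B, HW, k, stages=3, rho_e=1e-3, rho_a=1e-3):
    """Unfolded E/A alternation for y = M @ vec(E @ A) (toy stand-in model)."""
    rng = np.random.default_rng(0)
    E = rng.standard_normal((B, k))   # spectral basis iterate
    A = rng.standard_normal((k, HW))  # subspace-image iterate
    prox_E = prox_A = lambda Z: Z     # identity placeholders for ProxyNets

    for _ in range(stages):
        # E-update: PGD step on the data term with A held fixed
        G = (M.T @ (M @ (E @ A).ravel() - y)).reshape(B, HW)
        E = prox_E(E - rho_e * G @ A.T)
        # A-update: same data term with the refreshed E held fixed
        G = (M.T @ (M @ (E @ A).ravel() - y)).reshape(B, HW)
        A = prox_A(A - rho_a * E.T @ G)
    return E, A
```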
4. Generalized Feature Unfolding Mechanism (GFUM)
GFUM is central to LRDUN’s efficiency and accuracy. It decouples:
- Physical rank $k$: Controls the data fidelity, tied to the physical subspace constraint.
- Feature dimension $C$: Sets the learnable feature capacity in the prior modules, with $C \geq k$.
At each stage, the feature tensors are split:
- The first $k$ channels are physical ($F_{\text{phys}}$); the remaining $C - k$ are auxiliary ($F_{\text{aux}}$).
PGD updates are applied solely to the physical channels, while auxiliary channels are propagated unchanged. The updated feature is concatenated and passed through a ProxyNet:
$$F^{(t+1)} = \mathrm{ProxyNet}\big(\big[\,F_{\text{phys}}^{(t+\frac{1}{2})},\; F_{\text{aux}}^{(t)}\,\big]\big).$$
This structure enables:
- Rich priors: ProxyNets operate on a $C$-dimensional feature space and can therefore capture complex relations, since $C > k$.
- Flexible rank configuration: $k$ can be selected for optimal physical modeling without restricting prior capacity.
Empirical results show that increasing $C$ (for fixed $k$) boosts PSNR up to a plateau, while $C = k$ gives poorer reconstructions. For instance, with $k$ fixed, raising $C$ from 11 to 16 improves PSNR while increasing computation only moderately (Huang et al., 23 Nov 2025).
5. Architectural Implementation and Hyperparameter Choices
The typical LRDUN block for each subproblem involves:
- Reshape/slice: Separate the feature tensor (e.g., $E_{\text{feat}} \in \mathbb{R}^{B \times C}$ for the $E$-problem) into physical (first $k$ channels) and auxiliary (remaining $C - k$ channels).
- Data-fidelity block: Apply the physical imaging operator (e.g., $\Phi_A$ for the $E$-problem) only to the physical channels via matrix multiplications or small convolutions.
- Concatenate: Merge updated physical and untouched auxiliary channels.
- ProxyNet: Employ a 1D CNN for the $E$-problem, or a U-Net with spatial convolutional attention blocks (SCAB) for the $A$-problem.
Pseudocode for one $E$-problem stage:
```python
E_phys = E_feat[:, :k]                     # physical channels, B × k
E_aux  = E_feat[:, k:]                     # auxiliary channels, B × (C - k)
grad = Phi_Ai.T @ (Phi_Ai @ E_phys - y)    # data-fidelity gradient, B × k
E_phys_half = E_phys - rho_e * grad        # PGD step on physical channels only
E_aux_half  = E_aux                        # auxiliary channels pass through
E_feat_half = concatenate(E_phys_half, E_aux_half, axis=1)  # B × C
E_feat_next = ProxyNet_E(E_feat_half)      # learned proximal refinement, B × C
```
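To run the fragment end-to-end, here is a hedged PyTorch version with a small 1D CNN standing in for ProxyNet_E (the paper specifies the operator family, not this exact layer stack; all sizes and tensors are illustrative):

```python
import torch
import torch.nn as nn

B, C, k = 28, 16, 12      # bands, feature width, physical rank (illustrative)
m = 64                    # toy measurement size (the real y comes from the coded aperture)

E_feat = torch.randn(B, C)     # stage input features
Phi_Ai = torch.randn(m, B)     # toy stand-in for the stage-wise operator Phi_A
y      = torch.randn(m, k)     # toy measurement in the subspace
rho_e  = 0.01                  # step size (learnable in the real network)

# Minimal ProxyNet_E: a 1D CNN over the spectral axis (placeholder layer stack).
ProxyNet_E = nn.Sequential(
    nn.Conv1d(C, 2 * C, kernel_size=3, padding=1),
    nn.GELU(),
    nn.Conv1d(2 * C, C, kernel_size=3, padding=1),
)

E_phys, E_aux = E_feat[:, :k], E_feat[:, k:]
grad = Phi_Ai.T @ (Phi_Ai @ E_phys - y)                          # B × k
E_feat_half = torch.cat([E_phys - rho_e * grad, E_aux], dim=1)   # B × C
# Conv1d expects (batch, channels, length): features as channels, bands as length.
E_feat_next = ProxyNet_E(E_feat_half.T.unsqueeze(0)).squeeze(0).T  # B × C
```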
Hyperparameter studies indicate:
- Optimal $k$ matches the true spectral rank of the data (e.g., on the KAIST/CAVE benchmarks).
- $C$ should be chosen so that the auxiliary width $C - k$ is at least 30–50% of $k$, exposing a substantial auxiliary pathway.
If $C = k$, the improvement from auxiliary channels vanishes; if $k$ is too small, the physics is underfit.
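A tiny validator for this guideline (the helper name and messages are ours, not the paper's):

```python
def check_rank_config(k: int, C: int) -> str:
    """Flag a (k, C) choice against the 30-50% auxiliary-width rule of thumb."""
    if C == k:
        return "C == k: no auxiliary channels, GFUM reduces to vanilla unfolding"
    aux_ratio = (C - k) / k
    if aux_ratio < 0.3:
        return f"thin auxiliary pathway: C - k = {C - k} ({aux_ratio:.0%} of k)"
    return f"ok: C - k = {C - k} auxiliary channels ({aux_ratio:.0%} of k)"

print(check_rank_config(11, 16))  # ok: C - k = 5 auxiliary channels (45% of k)
```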
6. Empirical Performance and Feature Analysis
LRDUN achieves state-of-the-art SCI reconstruction performance:
- Experiments on simulated and real datasets show high PSNR and SSIM with notably reduced computational cost relative to full-cube DUNs.
- Empirical ablation confirms that adding GFUM (raising $C$ above $k$) produces a clear PSNR gain on a 3-stage LRDUN at only a modest FLOP increase.
- Visualization of feature channels reveals that physical channels capture dominant structures and material signatures, whereas auxiliary channels encode finer details, background texture, or mask-dependent context.
GFUM’s role as an information carrier is further evidenced by preservation of noise residuals, mask cues, and adaptive step-size representations in the auxiliary features (Huang et al., 23 Nov 2025).
7. Extensions, Design Considerations, and Adaptations
Best practices for adopting LRDUN and GFUM include:
- For other inverse problems (e.g., video SCI, dynamic MRI), identify an analogous low-dimensional physical subspace and then augment with auxiliary feature dimensions for the prior.
- Ensure is at least 30–50% of for effective use of the auxiliary pathway.
- The GFUM mechanism is orthogonal to the ProxyNet architecture; one could replace U-Nets with transformers and retain the benefits of decoupling feature capacity from physics rank, as the sketch below illustrates.
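In code, the decoupling amounts to a stage wrapper that accepts any channel-preserving prior network. The PyTorch sketch below (the class name, grad_fn callback, and learnable step size are our scaffolding, not the paper's API) makes the point:

```python
import torch
import torch.nn as nn

class GFUMStage(nn.Module):
    """PGD step on the first k (physical) channels, pass-through for the rest,
    then a joint refinement by an arbitrary C-in/C-out prior network."""

    def __init__(self, prior_net: nn.Module, k: int, rho: float = 0.01):
        super().__init__()
        self.prior_net = prior_net                    # 1D CNN, U-Net, transformer, ...
        self.k = k
        self.rho = nn.Parameter(torch.tensor(rho))    # learnable step size

    def forward(self, feat: torch.Tensor, grad_fn) -> torch.Tensor:
        # grad_fn supplies the data-term gradient for the physical channels,
        # e.g. lambda P: Phi_A.T @ (Phi_A @ P - y) for the E-subproblem.
        phys, aux = feat[:, :self.k], feat[:, self.k:]
        phys = phys - self.rho * grad_fn(phys)
        return self.prior_net(torch.cat([phys, aux], dim=1))
```

Swapping prior_net changes only the learned prior; the data-fidelity path and the physical rank $k$ stay fixed, which is exactly the decoupling GFUM formalizes.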
A plausible implication is that LRDUN’s framework generalizes beyond SCI, providing a template for efficient, interpretable, and high-capacity deep unfolding in a broad range of high-dimensional inverse problems (Huang et al., 23 Nov 2025).