HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging (2203.02149v2)

Published 4 Mar 2022 in eess.IV and cs.CV

Abstract: The rapid development of deep learning provides a better solution for the end-to-end reconstruction of hyperspectral image (HSI). However, existing learning-based methods have two major defects. Firstly, networks with self-attention usually sacrifice internal resolution to balance model performance against complexity, losing fine-grained high-resolution (HR) features. Secondly, even if the optimization focusing on spatial-spectral domain learning (SDL) converges to the ideal solution, there is still a significant visual difference between the reconstructed HSI and the truth. Therefore, we propose a high-resolution dual-domain learning network (HDNet) for HSI reconstruction. On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features. On the other hand, frequency domain learning (FDL) is introduced for HSI reconstruction to narrow the frequency domain discrepancy. Dynamic FDL supervision forces the model to reconstruct fine-grained frequencies and compensate for excessive smoothing and distortion caused by pixel-level losses. The HR pixel-level attention and frequency-level refinement in our HDNet mutually promote HSI perceptual quality. Extensive quantitative and qualitative evaluation experiments show that our method achieves SOTA performance on simulated and real HSI datasets. Code and models will be released at https://github.com/caiyuanhao1998/MST

Citations (117)

Summary

The paper introduces an innovative high-resolution spatial-spectral attention module that improves pixel-level feature extraction and preserves fine spectral details.
It employs a frequency domain learning approach with dynamic 2D DFT supervision to address smoothing and frequency discrepancies in hyperspectral images.
Experimental results report an average PSNR of 34.34 dB and SSIM of 0.9572 for HDNet, ahead of the compared methods; potential application areas include remote sensing, medical imaging, and environmental monitoring.

Analysis of HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging

The paper "HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging" introduces HDNet, a dual-domain learning network for hyperspectral image (HSI) reconstruction. It targets two limitations of existing learning-based methods: self-attention networks that sacrifice internal resolution to limit complexity, and spatial-spectral domain learning that leaves a visible gap between the reconstruction and the ground truth even when optimization converges.

Key Contributions

HDNet makes three main contributions to HSI reconstruction:

High-Resolution Spatial-Spectral Attention: The paper presents a spatial-spectral attention module that combines high-resolution (HR) spectral and spatial attention to improve pixel-level feature extraction. Unlike conventional self-attention designs, which reduce internal resolution to limit complexity, it maintains high internal resolution through efficient feature fusion.
Frequency Domain Learning (FDL): FDL addresses the frequency domain discrepancy that spatial-spectral domain learning (SDL) alone leaves open. With a dynamic FDL supervision mechanism, HDNet focuses on reconstructing fine-grained frequencies, compensating for the excessive smoothing and distortion caused by pixel-level losses.
Efficient Feature Fusion Mechanism: A grouped depthwise-separable convolution lets the spectral and spatial attention features interact and be combined at low computational cost.
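To make the efficiency argument for depthwise-separable convolution concrete, the following sketch counts weights for a standard convolution and for a depthwise-plus-pointwise pair. It is a generic illustration, not HDNet's actual fusion layer: the function names and the channel widths (64 in, 64 out, 3x3 kernel) are assumptions chosen for the example, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 64, 3
print(standard_conv_params(c_in, c_out, k))        # 36864
print(depthwise_separable_params(c_in, c_out, k))  # 4672
```

For these sizes the separable variant needs roughly one eighth of the weights, which is the usual motivation for using it when fusing several feature streams.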

Methodology

HDNet learns in two domains: the spatial-spectral domain and the frequency domain. A ResNet serves as the lightweight baseline, to which the following components are added:

Spatial-Spectral Domain Learning: A module integrates HR spectral and spatial attention with an efficient feature fusion strategy, keeping internal features at high resolution and improving model performance at a lower parameter cost.
Frequency Domain Learning: Using the 2D Discrete Fourier Transform (DFT), HDNet applies frequency-level supervision that adapts dynamically to the spectrum discrepancy between the reconstructed image and the ground truth. This pulls the frequency content of the reconstruction toward that of the ground truth.
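The frequency-level supervision described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the paper's exact loss: the summary does not give the weighting formula, so the function name `frequency_loss`, the exponent `alpha`, and the choice to weight each frequency by its normalized error magnitude are all hypothetical.

```python
import numpy as np


def frequency_loss(pred: np.ndarray, target: np.ndarray, alpha: float = 1.0) -> float:
    """Dynamically weighted frequency-domain loss (illustrative sketch).

    pred, target: arrays of shape (bands, height, width).
    """
    # 2D DFT over the two spatial axes of every spectral band.
    f_pred = np.fft.fft2(pred, axes=(-2, -1), norm="ortho")
    f_true = np.fft.fft2(target, axes=(-2, -1), norm="ortho")

    # Per-frequency distance between the two complex spectra.
    dist = np.abs(f_pred - f_true)

    # Assumed dynamic weighting: frequencies with larger error get larger
    # weight, normalized to [0, 1] by the current maximum error.
    peak = dist.max()
    weight = (dist / peak) ** alpha if peak > 0 else np.zeros_like(dist)

    return float(np.mean(weight * dist ** 2))


rng = np.random.default_rng(0)
truth = rng.random((4, 16, 16))                       # toy cube, not real data
blurred = (truth + np.roll(truth, 1, axis=-1)) / 2    # smoothing removes detail
print(frequency_loss(truth, truth))                   # 0.0
print(frequency_loss(blurred, truth) > 0)             # True
```

In a training framework the weights would normally be treated as constants (no gradient flows through them), so that they only steer the optimizer toward the frequencies that are currently reconstructed worst.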

Experimental Insights

Experiments on the CAVE and KAIST benchmark datasets show the following:

HDNet reaches an average PSNR of 34.34 dB and SSIM of 0.9572, outperforming existing methods such as DGSMP.
The qualitative comparisons highlight HDNet's superiority in preserving structural details and consistency across spectral dimensions, which is critical for applications requiring precise spectral information.
FDL reduces the frequency domain discrepancy between reconstruction and ground truth, which supports the case for frequency-focused learning as a route to high perceptual quality in HSI reconstruction.
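For readers unfamiliar with the PSNR metric quoted above, a minimal NumPy sketch follows. It computes PSNR over a whole array with intensities assumed to lie in [0, 1]; the exact evaluation protocol used in the paper (for example, per-band or per-scene averaging) is not stated in this summary, so treat the function as illustrative.

```python
import numpy as np


def psnr(pred: np.ndarray, target: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, data_range]."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs: no noise at all
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Toy example: a constant error of 0.1 gives an MSE of 0.01, i.e. about 20 dB.
target = np.zeros((4, 8, 8))
pred = np.full((4, 8, 8), 0.1)
print(round(psnr(pred, target), 2))  # 20.0
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold reduction in mean squared error.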

Implications and Future Directions

HDNet has implications for both practical applications and further research:

Practical Applications: By improving the accuracy of HSI reconstruction, HDNet could benefit applications in remote sensing, medical imaging, and environmental monitoring, where capturing fine spectral details is essential.
Theoretical Developments: The dual-domain approach opens avenues for future exploration. Potential directions include enhancing dynamic weighting mechanisms within FDL and investigating broader applications involving multi-modal data fusion.

In conclusion, HDNet advances HSI reconstruction by pairing high-resolution spatial-spectral attention with frequency domain learning, and it reports state-of-the-art performance on simulated and real HSI datasets. The work also provides a foundation for further research on dual-domain learning frameworks.

GitHub

GitHub - caiyuanhao1998/MST: A toolbox for spectral compressive imaging reconstruction including MST (CVPR 2022), CST (ECCV 2022), DAUHST (NeurIPS 2022), BiSCI (NeurIPS 2023), HDNet (CVPR 2022), MST++ (CVPRW 2022), etc. (506 stars)