QMambaBSR: Burst Image Super-Resolution with Query State Space Model (2408.08665v2)

Published 16 Aug 2024 in cs.CV

Abstract: Burst super-resolution aims to reconstruct high-resolution images with higher quality and richer details by fusing the sub-pixel information from multiple burst low-resolution frames. In BurstSR, the key challenge lies in extracting sub-pixel details that complement the base frame's content while simultaneously suppressing high-frequency noise. Existing methods attempt to extract sub-pixels by modeling inter-frame relationships frame by frame, overlooking the mutual correlations among multiple current frames and neglecting intra-frame interactions, which leads to inaccurate and noisy sub-pixels for base frame super-resolution. Further, existing methods mainly employ static upsampling with fixed parameters to improve spatial resolution for all scenes; they fail to perceive differences in sub-pixel distribution across frames and cannot balance the fusion weights of different frames, resulting in over-smoothed details and artifacts. To address these limitations, we introduce a novel Query Mamba Burst Super-Resolution (QMambaBSR) network, which incorporates a Query State Space Model (QSSM) and an Adaptive Up-sampling module (AdaUp). Specifically, based on the observation that sub-pixels have a consistent spatial distribution while random noise is inconsistently distributed, a novel QSSM is proposed to efficiently extract sub-pixels through inter-frame querying and intra-frame scanning while mitigating noise interference in a single step. Moreover, AdaUp is designed to dynamically adjust the upsampling kernel based on the spatial distribution of multi-frame sub-pixel information in different burst scenes, thereby facilitating the reconstruction of the spatial arrangement of high-resolution details. Extensive experiments on four popular synthetic and real-world benchmarks demonstrate that our method achieves new state-of-the-art performance.

Summary

The paper presents QMambaBSR, which utilizes a novel Query State Space Model to efficiently extract sub-pixel details and suppress noise in burst frames.
It employs an Adaptive Up-sampling module that dynamically adjusts kernels to preserve texture and prevent artifacts during image reconstruction.
Experiments on four synthetic and real-world benchmarks show state-of-the-art PSNR and SSIM results.

QMambaBSR: Burst Image Super-Resolution with Query State Space Model

The paper introduces QMambaBSR, a network for Burst Image Super-Resolution (BurstSR), the task of reconstructing a high-resolution image from multiple low-resolution frames. Existing BurstSR methods struggle to extract sub-pixel details and suppress high-frequency noise, largely because they model inter-frame relationships one frame at a time and upsample with fixed, static parameters. QMambaBSR combines a Query State Space Model (QSSM) with an Adaptive Up-sampling module (AdaUp) to address both problems.
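To make the contrast between static and adaptive upsampling concrete, here is a minimal NumPy sketch. It is not the paper's AdaUp module: the function names, the 3x3 neighbourhood, and the idea of passing the per-pixel kernels in as an argument are all assumptions made for illustration. In a real network those kernels would be predicted from the fused multi-frame features.

```python
import numpy as np

def static_upsample(img, scale=2):
    """Nearest-neighbour upsampling: one fixed rule for every pixel and scene."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def adaptive_upsample(img, kernels, scale=2):
    """Content-dependent upsampling (hypothetical illustration).

    img:     (H, W) low-resolution feature map.
    kernels: (H, W, scale*scale, 9) per-pixel weights over the 3x3
             neighbourhood, one weight vector per output sub-pixel.
             Assumed to come from a predictor network, not modelled here.
    """
    H, W = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Gather the 3x3 neighbourhood of every pixel: shape (H, W, 9).
    patches = np.stack(
        [padded[dy:dy + H, dx:dx + W] for dy in range(3) for dx in range(3)],
        axis=-1,
    )
    # Weighted sum per output sub-pixel: shape (H, W, scale*scale).
    sub = np.einsum("hwk,hwsk->hws", patches, kernels)
    # Rearrange sub-pixels into the high-resolution grid (pixel shuffle).
    sub = sub.reshape(H, W, scale, scale)
    return sub.transpose(0, 2, 1, 3).reshape(H * scale, W * scale)
```

If every kernel puts all of its weight on the centre tap, the adaptive version reduces to the static one; varying the kernels per pixel is what lets the upsampling follow local content.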

Key Innovations

Query State Space Model (QSSM): QSSM is designed to efficiently extract sub-pixel information through inter-frame querying and intra-frame scanning. By leveraging the consistent spatial distribution of sub-pixels and the inconsistent distribution of random noise, QSSM enhances the extraction process, reducing noise interference in a single step. This model enables simultaneous querying of all current frames by the base frame, thereby improving the interaction between frames and enabling more accurate sub-pixel extraction.
Adaptive Up-sampling (AdaUp): AdaUp dynamically adjusts the upsampling kernel to align with the spatial distribution of sub-pixel information across different scenes. This adaptability ensures better reconstruction of high-resolution image details compared to static upsampling methods. By perceiving how sub-pixels are distributed in the feature space, AdaUp avoids the over-smoothing of details and the introduction of artifacts commonly seen in previous methods.
Multi-scale Fusion Module: This module is designed to fuse sub-pixel information across different scales and incorporate both local and global features. It combines a Convolutional Neural Network (CNN), a State Space Model (SSM) with multiple scanning directions, and a channel Transformer. This combination enhances the network’s ability to integrate sub-pixel information, leading to more comprehensive reconstruction of image details.
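The querying idea behind QSSM can be sketched as a toy state space recurrence. This is one plausible reading, not the authors' implementation: it assumes the current-frame tokens write into a hidden state while the base-frame tokens supply the readout vector that queries it. The names `query_ssm_scan`, `w_b`, `w_c`, and the scalar `decay` are hypothetical.

```python
import numpy as np

def query_ssm_scan(base_tokens, current_tokens, w_b, w_c, decay=0.9):
    """Toy query-driven state space scan (illustrative assumption only).

    base_tokens:    (T, D) base-frame features, used only for the readout.
    current_tokens: (T, D) current-frame features in scan order.
    w_b, w_c:       (D, N) hypothetical projection weights.
    Returns:        (T, D) features read out of the state by the base frame.
    """
    T, D = current_tokens.shape
    N = w_b.shape[1]
    state = np.zeros((D, N))
    out = np.empty((T, D))
    for t in range(T):
        b_t = current_tokens[t] @ w_b   # input projection from the current frames
        c_t = base_tokens[t] @ w_c      # readout (query) from the base frame
        # Linear recurrence: decay the old state, then write the new token.
        state = decay * state + np.outer(current_tokens[t], b_t)
        # The base frame decides what is read back out of the state.
        out[t] = state @ c_t
    return out
```

In this sketch a zero query returns nothing from the state, which mirrors the intent that only content relevant to the base frame is extracted. A full model would use learned, input-dependent dynamics and several scan directions.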

Experimental Results

The proposed QMambaBSR achieves state-of-the-art performance across four popular BurstSR benchmarks, both synthetic and real-world: Synthetic BurstSR, Real BurstSR, RealBSR-RAW, and RealBSR-RGB. Extensive experiments demonstrate:

Superior Quantitative Performance: QMambaBSR consistently outperforms existing methods, achieving significant improvements in PSNR and SSIM metrics. For instance, it achieved a PSNR improvement of 0.29 dB compared to the state-of-the-art method Burstormer on the Synthetic BurstSR dataset and similar enhancements on real-world datasets.
Enhanced Visual Quality: The network demonstrates superior reconstruction of textures and detailed image regions, effectively mitigating artifacts and noise. The visual comparisons indicate that QMambaBSR can reconstruct finer details and produce visually pleasing high-resolution images.
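For readers less familiar with the metric, PSNR is defined as 10 log10(MAX^2 / MSE), so a fixed gain in dB corresponds to a fixed ratio of mean squared errors. The sketch below assumes images normalized to [0, 1]; the benchmarks' exact evaluation protocols (bit depth, cropping, alignment) are not described here and may differ. The function names are illustrative.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)
```

By this formula, a 0.29 dB gain corresponds to an MSE ratio of about 0.935, that is, roughly 6.5% lower mean squared error.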

Implications and Future Directions

The introduction of QSSM and AdaUp in burst image processing holds significant implications for both practical applications and theoretical advancements:

Practical Applications: High-resolution image reconstruction is critical in fields such as mobile photography, satellite imaging, and medical imaging. The enhanced quality of images produced by QMambaBSR can significantly improve the usability and reliability of visual data in these domains.
Theoretical Advancements: The integration of state space models with query mechanisms presents a novel approach to handling multi-frame data. This methodology can be extended to other areas of computer vision where multi-frame data needs to be processed, such as video super-resolution and multi-view reconstruction.

Future work could build on the principles established by QMambaBSR. More efficient and adaptive fusion and upsampling techniques may yield more robust image reconstruction. Extending the approach to real-time applications and integrating it with hardware accelerators could also enable faster and more energy-efficient implementations.

Conclusion

QMambaBSR introduces significant advancements in Burst Image Super-Resolution by addressing the critical challenges of sub-pixel extraction and adaptive upsampling. The innovative use of QSSM and AdaUp modules leads to superior performance both quantitatively and qualitatively. This research not only sets a new benchmark in burst image processing but also opens up new avenues for advanced multi-frame data handling techniques.