MOSAIC: Masked Optimisation with Selective Attention for Image Reconstruction (2306.00906v1)

Published 1 Jun 2023 in cs.CV and eess.IV

Abstract: Compressive sensing (CS) reconstructs images from sub-Nyquist measurements by solving a sparsity-regularized inverse problem. Traditional CS solvers use iterative optimizers with hand crafted sparsifiers, while early data-driven methods directly learn an inverse mapping from the low-dimensional measurement space to the original image space. The latter outperforms the former, but is restrictive to a pre-defined measurement domain. More recent, deep unrolling methods combine traditional proximal gradient methods and data-driven approaches to iteratively refine an image approximation. To achieve higher accuracy, it has also been suggested to learn both the sampling matrix, and the choice of measurement vectors adaptively. Contrary to the current trend, in this work we hypothesize that a general inverse mapping from a random set of compressed measurements to the image domain exists for a given measurement basis, and can be learned. Such a model is single-shot, non-restrictive and does not parametrize the sampling process. To this end, we propose MOSAIC, a novel compressive sensing framework to reconstruct images given any random selection of measurements, sampled using a fixed basis. Motivated by the uneven distribution of information across measurements, MOSAIC incorporates an embedding technique to efficiently apply attention mechanisms on an encoded sequence of measurements, while dispensing the need to use unrolled deep networks. A range of experiments validate our proposed architecture as a promising alternative for existing CS reconstruction methods, by achieving the state-of-the-art for metrics of reconstruction accuracy on standard datasets.

Summary

The paper presents MOSAIC, which uses masked optimization and selective attention to accurately reconstruct images from compressed measurements.
It recasts CS reconstruction as a masked-learning task over a fixed measurement basis, so a single model can handle any random subset of measurements.
Experiments show MOSAIC reaching state-of-the-art PSNR and SSIM on standard datasets without learned or adaptive sampling matrices.

MOSAIC: Advancements in Compressive Sensing via Masked Optimization and Selective Attention

The paper "MOSAIC: Masked Optimisation with Selective Attention for Image Reconstruction" by Somarathne et al. addresses image recovery from compressed measurements, a core problem in compressive sensing (CS). It introduces MOSAIC, an attention-based framework for reconstructing images from sub-Nyquist sampled data in a single shot, without parametrizing the sampling process.

Overview of Compressive Sensing

Compressive sensing (CS) is employed to recover signals from a limited number of measurements, often fewer than dictated by the Nyquist sampling theorem. The essence of CS lies in solving an inverse problem, typically with a sparsity prior, to accurately reconstruct the original signal. Traditional CS approaches entail iterative optimization with hand-crafted regularizers, while data-driven methods aim to directly map measurements to signal space, albeit often limited to specific sampling matrices.
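To make the "iterative optimization with a sparsity prior" baseline concrete, here is a minimal sketch of ISTA (iterative soft-thresholding), one classical proximal gradient solver. It is not taken from the paper: the random Gaussian matrix `A`, the problem sizes, and the values of `lam` and `n_iter` are all illustrative assumptions, and the signal is assumed sparse in the identity basis.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam=0.01, n_iter=500):
    """Minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 with proximal gradient steps."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)            # gradient of the data-fidelity term
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Toy problem (all sizes are assumptions): 64 measurements of a 5-sparse, 128-dim signal.
rng = np.random.default_rng(0)
n, m, k = 128, 64, 5
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)    # random sampling matrix, m < n
y = A @ x_true                              # compressed measurements

x_hat = ista(A, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The hand-crafted part here is the choice of the L1 penalty and its weight `lam`; the solver needs hundreds of iterations per signal, which is the cost that single-shot learned mappings aim to avoid.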

Deep unrolling methods combine the two approaches, pairing traditional proximal gradient iterations with learned components to refine an image estimate step by step. A separate line of work pursues higher accuracy by also learning the sampling matrix and the choice of measurement vectors adaptively. MOSAIC departs from this trend: it hypothesizes that, for a fixed measurement basis, a general inverse mapping from any random set of compressed measurements to the image domain exists and can be learned.

Proposed Method: MOSAIC Framework

MOSAIC keeps the measurement basis fixed and uses attention to reconstruct images from an arbitrary random subset of measurements. Its central move is to recast CS reconstruction as a 'measurement filling' task, analogous to the masked-learning strategy used by models such as BERT in natural language processing.
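The 'measurement filling' setup can be sketched as follows. This is an illustration under stated assumptions rather than the paper's pipeline: the orthonormal DCT-II basis, the 25% sampling ratio, and the helper names `dct_basis` and `random_measurements` are hypothetical choices made for the example, and the summary does not specify which fixed basis the authors use.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II matrix; rows act as the fixed measurement vectors (assumed basis)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] /= np.sqrt(2.0)  # rescale the constant row so all rows have unit norm
    return basis

def random_measurements(x, basis, ratio, rng):
    """Keep a random subset of the full measurement vector and zero-fill the rest."""
    n = basis.shape[0]
    m = max(1, int(round(ratio * n)))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=m, replace=False)] = True  # True = measurement observed
    y_full = basis @ x                                 # all n measurements
    y_masked = np.where(mask, y_full, 0.0)             # unobserved slots are blanked
    return y_masked, mask, y_full

rng = np.random.default_rng(0)
n = 64                                   # e.g. a flattened 8x8 patch (assumption)
basis = dct_basis(n)
x = rng.random(n)
y_masked, mask, y_full = random_measurements(x, basis, ratio=0.25, rng=rng)

# If a model filled every blanked slot perfectly, inverting the basis would recover x.
x_from_full = basis.T @ y_full
```

A new mask can be drawn for every training example while `basis` never changes, which is what allows one model to serve any random selection of measurements. The pair (`y_masked`, `mask`) plays the role of the masked input, much like a sentence with blanked tokens.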

Rather than unrolling an iterative solver, MOSAIC applies attention to an encoded sequence of measurements, which lets the model weight the most informative measurements more heavily. The authors motivate this by the uneven distribution of information across measurements. An embedding step encodes the measurements into that sequence before attention is applied, which keeps the attention computation efficient.
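As a rough illustration of attention over a sequence of measurement tokens, the sketch below runs one head of scaled dot-product self-attention in which only observed measurements may serve as keys. Every detail is an assumption made for the example (the linear value embedding `w_val`, the shared `mask_token`, the per-index `pos_emb`, random untrained weights, and the sizes); the summary does not describe MOSAIC's actual embedding or attention layers.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def embed(y_masked, mask, w_val, pos_emb, mask_token):
    """One token per basis index: observed slots encode their value, the rest share a mask token."""
    value_part = np.where(mask[:, None], y_masked[:, None] * w_val[None, :], mask_token[None, :])
    return value_part + pos_emb              # pos_emb tells the model which basis vector each slot is

def self_attention(tokens, mask, Wq, Wk, Wv):
    """Single-head attention; every slot queries, but only observed slots are attended to."""
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = np.where(mask[None, :], scores, -np.inf)  # hide unobserved keys
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n, d = 16, 8                                 # sequence length and embedding width (assumptions)
mask = np.zeros(n, dtype=bool)
mask[rng.choice(n, size=4, replace=False)] = True
y_masked = np.where(mask, rng.normal(size=n), 0.0)

w_val, mask_token = rng.normal(size=d), rng.normal(size=d)
pos_emb = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

tokens = embed(y_masked, mask, w_val, pos_emb, mask_token)
out, weights = self_attention(tokens, mask, Wq, Wk, Wv)
```

Each row of `weights` shows how strongly one slot draws on each observed measurement. In a trained model these weights would be learned, which is one way a network could come to favour the more informative measurements.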

Experimental Success and Implications

Experiments on standard datasets show MOSAIC achieving state-of-the-art PSNR and SSIM relative to existing methods. Notably, it does so without learned sampling matrices, which challenges the prevailing view that sampling must be adaptive or data-driven to reach high accuracy.
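For readers unfamiliar with the first metric, PSNR is a log-scaled mean squared error between a reference image and its reconstruction. The sketch below uses the standard definition; the function name `psnr`, the `data_range` default of 1.0 (images scaled to [0, 1]), and the toy images are assumptions for illustration and are not results from the paper.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: a constant error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. about 20 dB.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)
score = psnr(reference, estimate)
```

SSIM is more involved because it compares local luminance, contrast, and structure statistics, so it is not reproduced here; scikit-image provides both metrics in `skimage.metrics` for practical use.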

More broadly, MOSAIC shows that attention mechanisms tailored to CS can work well with a fixed measurement basis for general image reconstruction. This could simplify deployment in practical settings where adapting the measurement process is not feasible.

Future Directions

Several research directions follow from this work. Natural next steps include testing other types of fixed bases and extending MOSAIC to handle different noise models. Combining its attention mechanisms with other emerging CS paradigms could also lead to more robust imaging solutions.

Further exploration into this attention-based reconstruction could also pave the way for advancements in related fields, such as medical imaging, remote sensing, and any domain where information is often acquired in sparse or compressed forms.

In conclusion, MOSAIC shifts the focus in compressive sensing from learning the parameters of the measurement process to improving the reconstruction pathway through attention, a perspective with promising theoretical and practical implications.
