Learnable Subsampling Framework

Updated 16 October 2025
  • Learnable subsampling is a data-driven method to select measurement indices in compressive sensing, maximizing signal energy retention from training data.
  • The framework optimizes indices using average-case and worst-case energy maximization with modular and submodular objectives to ensure efficient, near-optimal selection.
  • Applications in medical imaging, Fourier optics, and sensor networks demonstrate improved reconstruction quality and reduced computational effort compared to traditional methods.

A learnable subsampling framework in the context of compressive sensing refers to a methodology wherein the selection of a subsampling operator (specifically, the index set $\Omega$ used to pick a subset of measurements from orthonormally transformed signals) is itself learned from training data, rather than being chosen at random or via fixed, heuristic-driven schemes. The framework studied in "Learning-based Compressive Subsampling" (Baldassarre et al., 2015) provides a mathematically rigorous, optimization-driven approach for constructing $\Omega$ so that the information (e.g., signal energy) captured by the selected measurements is maximized, thereby enabling more accurate signal recovery from a reduced set of observations, as required by applications in medical imaging, Fourier optics, and related domains.

1. Problem Setting and Subsampling Operator

Compressive sensing seeks to recover a structured (often sparse or compressible) signal $\mathbf{x} \in \mathbb{C}^p$ from a set of linear, dimensionality-reduced observations $\mathbf{b} = \mathbf{A}\mathbf{x}$. For practical and physical reasons, the measurement matrix $\mathbf{A}$ is often constrained to have the structure $$\mathbf{A} = \mathbf{P}_{\Omega} \boldsymbol{\Psi},$$ where $\boldsymbol{\Psi}$ is a $p \times p$ orthonormal basis (such as a Fourier, Hadamard, or wavelet transform) and $\mathbf{P}_\Omega$ is the subsampling operator selecting the $n$ rows indexed by $\Omega$ (with $|\Omega| = n \ll p$). Formally, $\mathbf{P}_\Omega$ is an $n \times p$ matrix that retains only the entries specified by $\Omega$, thus reducing the dimensionality of the measurements. The choice of the index set $\Omega$ is fundamental: it governs which components of $\mathbf{x}$ (in the basis $\boldsymbol{\Psi}$) are actually observed and retained during acquisition.
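
As a concrete illustration, here is a minimal NumPy/SciPy sketch of this acquisition model; the DCT stands in for $\boldsymbol{\Psi}$, and $\Omega$ is a random placeholder (learned selection of $\Omega$ is sketched in Section 3):

```python
import numpy as np
from scipy.fft import dct

p, n = 64, 16                        # ambient dimension and number of measurements
rng = np.random.default_rng(0)

# Orthonormal basis Psi: here the DCT-II matrix (Fourier, Hadamard, or wavelet
# transforms play the same role). Transforming the columns of the identity
# yields the transform matrix, so Psi @ x computes the DCT of x.
Psi = dct(np.eye(p), norm="ortho", axis=0)

# Index set Omega and the subsampling operator P_Omega (n rows of the identity).
Omega = rng.choice(p, size=n, replace=False)    # random placeholder, not yet learned
P_Omega = np.eye(p)[Omega]

# Structured measurement matrix A = P_Omega Psi and observations b = A x.
x = rng.standard_normal(p)
A = P_Omega @ Psi
b = A @ x                                       # n measurements of a p-dimensional signal
```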

2. Learning-Based Selection of the Index Set

The classical approach to designing $\Omega$ relies on domain knowledge (such as variable-density or non-uniform random sampling tuned to the decay properties of the basis coefficients) or on analytic expressions derived from information-theoretic or coherence arguments. In contrast, the framework of (Baldassarre et al., 2015) adopts a data-driven formulation:

Given a set of $m$ training signals $\{\mathbf{x}_j\}_{j=1}^m$, the aim is to find a fixed, deterministic index set $\Omega$ such that the subsampled measurements retain maximal signal energy (or, more generally, information) from the training set. Two canonical optimization criteria are considered:

  • Average-case energy maximization:

$$\Omega^* = \underset{|\Omega|=n}{\arg\max} \; \frac{1}{m} \sum_{j=1}^m \| \mathbf{P}_\Omega \boldsymbol{\Psi} \mathbf{x}_j \|_2^2$$

  • Worst-case energy maximization:

$$\Omega^* = \underset{|\Omega|=n}{\arg\max} \; \min_{j=1,\dots,m} \| \mathbf{P}_\Omega \boldsymbol{\Psi} \mathbf{x}_j \|_2^2$$

These objectives, or their variants using concave utility functions of per-sample energy, directly translate to combinatorial optimization problems over possible index sets.
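
Both criteria amount to evaluating the retained per-signal energies for a candidate $\Omega$; a minimal sketch (function names are ours, with the $m$ training signals assumed stacked as the columns of a matrix X):

```python
import numpy as np

def retained_energies(Omega, Psi, X):
    """Per-signal retained energies ||P_Omega Psi x_j||_2^2, with the m training
    signals stacked as the columns of X (shape (p, m))."""
    coeffs = Psi @ X                                   # transform coefficients, (p, m)
    return np.sum(np.abs(coeffs[Omega]) ** 2, axis=0)  # one energy per signal, (m,)

def average_case(Omega, Psi, X):
    return retained_energies(Omega, Psi, X).mean()     # first criterion's objective

def worst_case(Omega, Psi, X):
    return retained_energies(Omega, Psi, X).min()      # second criterion's objective
```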

3. Combinatorial Optimization and Modularity

The structure of the energy aggregation functions introduced yields important properties:

  • The average-case objective is modular (additive) in the index set, so it is optimized exactly by sorting the candidate indices $i$ by their total training energy $\sum_{j=1}^m |\langle \psi_i, \mathbf{x}_j \rangle|^2$ and selecting the $n$ largest (see the sketch after this list).

  • More advanced objectives (e.g., those involving concave utilities or group constraints on $\Omega$) exhibit submodularity, which enables efficient greedy algorithms with provable approximation guarantees.

In the worst-case scenario, for generic concave utilities, the "Saturate" algorithm, a greedy set-cover-style approach tailored to robust (min-max) objectives, yields near-optimal solutions.
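
A minimal sketch of the first two selection rules, assuming the full table of per-index transform energies fits in memory (Saturate itself is omitted for brevity):

```python
import numpy as np

def learn_omega_average(Psi, X, n):
    """Modular (average-case) objective: score each index by its total training
    energy and keep the n highest-scoring indices -- this is exactly optimal."""
    scores = np.sum(np.abs(Psi @ X) ** 2, axis=1)     # sum_j |<psi_i, x_j>|^2 per index i
    return np.sort(np.argsort(scores)[::-1][:n])

def learn_omega_greedy(Psi, X, n, utility=np.sqrt):
    """Greedy selection for a monotone submodular objective, here a concave
    utility (sqrt by default) of each signal's retained energy."""
    energies = np.abs(Psi @ X) ** 2                   # (p, m): energy of index i in signal j
    captured = np.zeros(energies.shape[1])            # energy captured so far, per signal
    Omega = []
    for _ in range(n):
        # Marginal gain of adding each candidate index to the current set.
        gains = utility(captured + energies).sum(axis=1) - utility(captured).sum()
        gains[Omega] = -np.inf                        # exclude already-selected indices
        best = int(np.argmax(gains))
        Omega.append(best)
        captured += energies[best]
    return np.array(sorted(Omega))
```

The greedy routine inherits the standard $(1 - 1/e)$ approximation guarantee for maximizing a monotone submodular function under a cardinality constraint.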

4. Theoretical Guarantees

Both deterministic and statistical generalization guarantees for the learned subsampling pattern are established:

  • Deterministic generalization: If the selected $\Omega$ ensures that every training energy satisfies $\| \mathbf{P}_\Omega \boldsymbol{\Psi} \mathbf{x}_j \|_2^2 \geq 1-\delta$ (for normalized signals), then for any signal $\mathbf{x}$ that deviates from the training set only slightly on the unselected indices, the captured energy is lower-bounded as

$$\| \mathbf{P}_\Omega \boldsymbol{\Psi} \mathbf{x} \|_2^2 \geq 1 - (\sqrt{\delta} + \sqrt{\epsilon})^2,$$

where $\epsilon$ quantifies the deviation of $\mathbf{x}$ on the unobserved indices.

  • Statistical generalization: When signals are sampled i.i.d. from a fixed distribution, the empirical optimization problem (using $m = O(n\log(p/n))$ training examples) outputs an $\Omega$ for which the expected retained energy on unseen signals is within an arbitrarily small gap of the best possible fixed index set.
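
As a quick numeric check of the deterministic bound, with illustrative values of $\delta$ and $\epsilon$:

```python
import math

# Deterministic bound: with delta = 0.01 (every training energy >= 0.99) and
# epsilon = 0.04 (deviation on the unobserved indices), a new signal retains
# at least 1 - (sqrt(0.01) + sqrt(0.04))^2 of its energy.
delta, epsilon = 0.01, 0.04          # illustrative values only
bound = 1 - (math.sqrt(delta) + math.sqrt(epsilon)) ** 2
print(bound)                         # 1 - (0.1 + 0.2)^2 = 0.91
```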

5. Empirical Demonstrations

Extensive numerical experiments validate the theoretical claims:

  • Natural image datasets (Kenya, ImageNet): Learned subsampling patterns in bases such as Hadamard, DCT, and wavelet transforms achieve reconstruction errors and PSNR values that match or surpass those of popular variable-density random sampling.
  • iEEG (intracranial EEG) signals: Deterministically chosen Ω\Omega (from the learning-based method) combined with simple linear decoders significantly outperform random subsampling followed by full nonlinear dictionary-based recovery.
  • MRI: Learned k-space sampling masks constructed by the framework, when used with both linear and basis pursuit reconstructions, yield improved quality over traditional variable-density designs. Notably, in many cases, linear (adjoint-based) decoders are nearly as effective as nonlinear decoders, reducing the computational burden.

6. Practical and Domain Applications

The learning-based subsampling methodology is broadly applicable:

  • Medical Imaging (MRI, CT, etc.): Enables efficient hardware implementations and robust image reconstructions from fewer measurements.
  • Spectroscopy and Fourier Optics: Supports precise, adaptable acquisition strategies for physically constrained experimentation.
  • Sensor Networks (e.g., iEEG, implantable sensors): Adapts subsampling patterns to maximize power/resource efficiency while maintaining recovery accuracy.
  • General image compression: Extends classical transform-domain algorithms (e.g., JPEG, which uses the DCT) by learning which basis coefficients are most informative for compression and reconstruction, greatly reducing resource requirements.

7. Summary and Impact

The learnable subsampling framework fundamentally shifts subsampling design from parameter tuning and expert-driven heuristics to algorithmic, data-dependent optimization. Key technical features include modular/submodular optimization objectives, deterministic and statistical guarantees, and empirical superiority over traditional random sampling. The framework's flexibility, efficiency, and firm mathematical footing enable its adoption in diverse domains requiring dimensionality reduction under strict resource, statistical, or physical constraints.

References

  • Baldassarre, L., Li, Y.-H., Scarlett, J., Gözcü, B., Bogunovic, I., & Cevher, V. (2015). Learning-Based Compressive Subsampling. arXiv:1510.06188.
