Linear Compressed Sensing

Updated 4 July 2026

Linear compressed sensing is a technique for recovering structured signals from underdetermined linear measurements by exploiting low-complexity properties such as sparsity and compressibility.
It employs methods like ℓ1-minimization, weighted convex programs, and probabilistic inference to solve inverse problems even when the number of measurements is far less than the ambient signal dimension.
The approach is applicable to various scenarios including imaging, quantized and one-bit sensing, and deterministic constructions, offering robust recovery under diverse noise and structural assumptions.

Linear compressed sensing concerns the recovery of a structured signal from an underdetermined collection of linear measurements, typically written as $y=Ax$ or $y=Ax+n$ with $A\in\mathbb{R}^{m\times n}$ and $m<n$. In the literature, the defining linearity lies in the sensing map itself: measurements are linear projections of the unknown, while reconstruction may proceed by $\ell_1$-minimization, weighted convex programs, row-action methods, probabilistic inference, compression-code projections, or learned decoders. Across these formulations, the central problem is to exploit low-complexity structure (sparsity, compressibility, total variation, support priors, nonnegativity, or source distributions) to invert a linear system from fewer observations than the ambient dimension would ordinarily require (Arildsen et al., 2013, Díaz et al., 2016, Beygi et al., 2017).

1. Measurement model and structural assumptions

The standard finite-dimensional model is

$y=Ax+n,$

with $x\in\mathbb{R}^n$, $y\in\mathbb{R}^m$, $A\in\mathbb{R}^{m\times n}$, and $m\ll n$. A common refinement writes $x=\Psi\theta$, where $x$ is sparse or compressible in a basis or dictionary $\Psi$, giving $y=A\Psi\theta+n$. Several papers specialize to i.i.d. Gaussian sensing matrices, for example with zero-mean normal entries under different variance normalizations, while others assume more general isotropic subgaussian rows or even deterministic zero-one matrices viewed as parity-check matrices (Arildsen et al., 2013, Coluccia et al., 2013, Jung et al., 2019, Dimakis et al., 2010).
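The measurement model above can be simulated in a few lines. The following is a minimal sketch, assuming coordinate sparsity and a Gaussian matrix with entries of variance $1/m$; the function name `make_cs_problem`, the problem sizes, and the noise level are illustrative choices, not values from the cited papers.

```python
import numpy as np

def make_cs_problem(n=200, m=60, k=8, noise_std=0.01, seed=0):
    """Draw a k-sparse x, a Gaussian A (entries N(0, 1/m), an assumption), and y = A x + noise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)   # random support of size k
    x[support] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)     # underdetermined: m < n
    noise = noise_std * rng.standard_normal(m)
    y = A @ x + noise
    return A, x, y

A, x, y = make_cs_problem()
```

Setting `noise_std=0` gives the noiseless model $y=Ax$.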

The structural model need not be classical coordinate sparsity. Nonnegative compressed sensing assumes $x_i\ge 0$ and exploits maximally-skewed stable random projections; quantized compressed sensing places $x$ in a convex low-complexity set $\mathcal{T}$; compression-based formulations regard $x$ as belonging to a class $\mathcal{Q}$ equipped with a lossy compression code; and Bayesian asymptotic analyses model $x$ as an i.i.d. process or as a stationary stochastic process with a rate-distortion function and information dimension (Li et al., 2013, Jung et al., 2019, Rezagah et al., 2016, Wu et al., 2011).

Linear compressed sensing also extends naturally to structured acquisition geometries. In progressive imaging, each row of an image can be measured as

$y_i=A_i x_i,$

with independent Gaussian row-wise sensing matrices $A_i$ applied to the image rows $x_i$, and in block-based image acquisition each vectorized block $x_j$ is measured by

$y_j=A_B x_j.$

Line-based acquisition further reuses the same measurement operator on every line, preserving linearity while changing memory and implementation characteristics (Coluccia et al., 2013, Adler et al., 2016, Ebrahim et al., 2019).
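Block-based acquisition can be sketched as follows, assuming a grayscale image whose sides are multiples of the block size and a single matrix shared by all blocks. The names `blockwise_measure` and `A_B`, the 8×8 block size, and the 16 measurements per block are hypothetical.

```python
import numpy as np

def blockwise_measure(image, A_B, block=8):
    """Apply the same matrix A_B to every vectorized block-by-block patch."""
    h, w = image.shape
    assert h % block == 0 and w % block == 0
    patches = (image.reshape(h // block, block, w // block, block)
                    .transpose(0, 2, 1, 3)            # (block row, block col, r, c)
                    .reshape(-1, block * block))      # one row per block
    return patches @ A_B.T                            # shape: (num_blocks, m_B)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
A_B = rng.standard_normal((16, 64)) / np.sqrt(16)     # 16 measurements per 64-pixel block
Y = blockwise_measure(image, A_B)
```

Because every block shares `A_B`, only one small matrix has to be stored, which is the memory advantage the text attributes to structured acquisition.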

This suggests that “linear compressed sensing” is best understood not as a single sparsity model, but as a family of linear inverse problems whose differences are encoded in the admissible signal class and the decoder.

2. Convex recovery, geometry, and phase transitions

The canonical decoder is basis pursuit,

$\min_{x}\ \|x\|_1\quad\text{subject to}\quad Ax=y,$

or its noisy analogue, Basis Pursuit De-Noising,

$\min_{x}\ \|x\|_1\quad\text{subject to}\quad \|Ax-y\|_2\le\epsilon.$

Weighted variants replace $\|x\|_1$ by $\sum_i w_i|x_i|$, and in some settings a quadratic term is added, as in

$\min_{x}\ \lambda\|x\|_1+\tfrac12\|x\|_2^2\quad\text{subject to}\quad Ax=y.$

These formulations underlie a large portion of the linear-CS literature because they combine convexity with strong sparsity-promoting behavior (Arildsen et al., 2013, Díaz et al., 2016, Lorenz et al., 2014).
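As an illustration of $\ell_1$-based recovery, the sketch below solves the penalized (Lagrangian) form $\tfrac12\|Ax-y\|_2^2+\lambda\|x\|_1$ with the iterative soft-thresholding algorithm (ISTA). This is a stand-in for the constrained programs above, not the solver used in the cited papers; the value of `lam`, the iteration count, and the problem sizes are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise shrinkage toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=3000):
    """Minimize 0.5*||A x - y||_2^2 + lam*||x||_1 by proximal gradient steps."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - step * (A.T @ (A @ x - y)), step * lam)
    return x

rng = np.random.default_rng(1)
n, m, k = 100, 50, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.choice([-1.0, 1.0], k) * rng.uniform(1.0, 2.0, k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true                                # noiseless measurements
x_hat = ista(A, y, lam=0.02)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

A small `lam` keeps the shrinkage bias low in the noiseless case; with noisy data it would normally be scaled with the noise level.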

A geometric description of exact recovery is given in terms of descent cones. For a convex function $f$ and signal $x_0$, recovery succeeds precisely when the descent cone $\mathcal{D}(f,x_0)$ intersects $\operatorname{null}(A)$ only at the origin. With Gaussian $A$, the corresponding phase transition is controlled by the statistical dimension $\delta(\mathcal{D}(f,x_0))$. In the weighted $\ell_1$ setting, this leads to a principled design rule: choose weights to reduce the expected statistical dimension of the random descent cone induced by the support distribution. The resulting weighted decoder lowers the phase-transition threshold when support probabilities are nonuniform and sufficiently concentrated (Díaz et al., 2016).
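For the unweighted $\ell_1$ norm, the statistical dimension can be estimated numerically. The sketch below evaluates a standard upper bound from the conic-geometry literature, $\inf_{\tau\ge 0}\{s(1+\tau^2)+(n-s)\,\mathbb{E}(|g|-\tau)_+^2\}$ with $g\sim N(0,1)$, by grid search. The formula is not stated in the text above and is an assumption of this example, as are the function name and grid settings.

```python
import math

def l1_statistical_dimension_bound(n, s, grid=2000, tau_max=6.0):
    """Upper bound on the statistical dimension of the l1 descent cone at an s-sparse point in R^n."""
    best = float(n)                                               # value of the objective at tau = 0
    for j in range(grid + 1):
        tau = tau_max * j / grid
        q = 0.5 * math.erfc(tau / math.sqrt(2.0))                 # P(g > tau), g ~ N(0, 1)
        phi = math.exp(-0.5 * tau * tau) / math.sqrt(2.0 * math.pi)
        tail = 2.0 * ((1.0 + tau * tau) * q - tau * phi)          # E[(|g| - tau)_+^2]
        best = min(best, s * (1.0 + tau * tau) + (n - s) * tail)
    return best

bound = l1_statistical_dimension_bound(n=100, s=10)
```

For $n=100$ and $s=10$ the bound is roughly a third of the ambient dimension, which is where Gaussian measurements would be expected to pass from failure to success under basis pursuit.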

A complementary asymptotic perspective appears in the Bayesian theory of i.i.d. discrete-continuous mixtures,

$P_X=(1-\gamma)P_{\mathrm d}+\gamma P_{\mathrm c},$

with $P_{\mathrm d}$ discrete and $P_{\mathrm c}$ absolutely continuous. For this model, the information dimension is $d(X)=\gamma$, and the optimal measurement threshold for linear compressed sensing is likewise $\gamma$ measurements per signal dimension. In the noisy setting, the same quantity separates stable recovery from instability: above the threshold, normalized MSE remains proportional to the noise variance $\sigma^2$; below it, the noise sensitivity diverges. A central conclusion is that Gaussian random linear encoders incur no phase-transition penalty relative to optimal nonlinear encoding for this source class (Wu et al., 2011).

The nullspace-property formulation gives another exact criterion. For zero-one measurement matrices $A$, the strict nullspace property of order $k$ guarantees that basis pursuit returns the same estimate as the $\ell_0$ program for all $k$-sparse vectors. This perspective becomes especially important in deterministic constructions derived from coding theory (Dimakis et al., 2010).

3. Distribution-aware, nonnegative, and compression-based models

Several strands of linear compressed sensing move beyond generic sparsity by incorporating explicit prior information. Weighted $\ell_1$-minimization for signals drawn from a known distribution $\mu$ uses marginal support probabilities

$p_i=\mathbb{P}_{x\sim\mu}(x_i\neq 0)$

to select coordinate weights. The design objective is the expected statistical dimension of the descent cone of the weighted norm $\|x\|_w=\sum_i w_i|x_i|$,

$\mathbb{E}_{x\sim\mu}\,\delta\big(\mathcal{D}(\|\cdot\|_{w},x)\big),$

and the analysis is backed by intrinsic volumes and discrete-geometric Monte Carlo estimators (Díaz et al., 2016).

For nonnegative signals, compressed counting replaces Gaussian or Bernoulli matrices by maximally-skewed stable projections,

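$y_j=\sum_{i=1}^{N}x_i s_{ij},\qquad s_{ij}\sim S(\alpha,1,1)\ \text{i.i.d.},\qquad j=1,\dots,M,$

and uses the coordinatewise minimum estimator

$\hat x_{i,\min}=\min_{1\le j\le M}\frac{y_j}{s_{ij}}.$

When $0<\alpha\le 0.5$, the paper gives the measurement scaling

$M=O\!\left(\epsilon^{-\alpha}\Big(\sum_{i=1}^{N}x_i^{\alpha}\Big)\log N\right),$

more precisely $M=(C_\alpha+o(1))\,\epsilon^{-\alpha}\big(\sum_{i}x_i^{\alpha}\big)\log(N/\delta)$, with $C_\alpha\to 1$ as $\alpha\to 0$ and $C_\alpha=\pi/2$ when $\alpha=0.5$. Here $N$ is the signal length, $M$ the number of measurements, $\epsilon$ the additive precision, and $1-\delta$ the success probability. In the exact sparse limit $\alpha\to 0$, this becomes essentially $M=K\log(N/\delta)$ for a signal with $K$ nonzero coordinates, together with one-pass coordinatewise recovery (Li et al., 2013).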
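The minimum estimator can be tried out for the special case $\alpha=0.5$, where a maximally-skewed stable variable is Lévy distributed and can be sampled as $1/Z^2$ with $Z\sim N(0,1)$. The following is a minimal sketch under that assumption; the scale convention may differ from the paper's, which does not affect the estimator because it only uses ratios. The sizes `N`, `M`, `K` and the function name are illustrative.

```python
import numpy as np

def min_estimator(y, S):
    """Coordinatewise minimum estimator: x_hat[i] = min_j y[j] / S[i, j]."""
    return np.min(y[None, :] / S, axis=1)

rng = np.random.default_rng(2)
N, M, K = 50, 400, 5
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = 1.0         # nonnegative K-sparse signal
# alpha = 0.5: maximally-skewed stable entries sampled as 1 / Z^2 (Levy distribution).
S = 1.0 / rng.standard_normal((N, M)) ** 2       # S[i, j] = s_ij > 0
y = x @ S                                        # y[j] = sum_i x[i] * s_ij
x_hat = min_estimator(y, S)
max_err = np.max(np.abs(x_hat - x))
```

Since all terms are nonnegative, each ratio $y_j/s_{ij}$ is at least $x_i$, so the estimator can only overestimate; taking the minimum over many measurements drives the excess toward zero.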

Compression-based compressed sensing treats the reconstruction set as the codebook of a lossy compressor. In the stochastic formulation, Compressible Signal Pursuit searches over a codebook $\mathcal{C}_n$ and, in the low-distortion regime, recovers $n$-length source blocks from slightly more than $n$ times the rate-distortion dimension of the source. Under regularity conditions, that rate-distortion dimension equals the information dimension, linking compression-based recovery to information-theoretic limits (Rezagah et al., 2016). The efficient descendant of this idea is compression-based gradient descent, which projects each gradient step with step size $\eta$ onto the codebook,

$x^{t+1}=\mathcal{P}_{\mathcal{C}_n}\!\big(x^{t}+\eta A^{\top}(y-Ax^{t})\big),$

implemented in practice by compressing and decompressing the gradient step. For Gaussian and subgaussian sensing, the method converges linearly to a neighborhood whose size is controlled by compression distortion and noise, and experiments using JPEG2000 yield state-of-the-art imaging performance (Beygi et al., 2017).
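The compress-then-decompress iteration can be sketched with a toy codec. Here the codec keeps the $k$ largest entries and quantizes them to a uniform grid, which makes the loop a quantized variant of iterative hard thresholding; it is a hypothetical stand-in, not the JPEG2000 codec used in the cited experiments. The step size `eta`, the grid step, and the problem sizes are untuned assumptions.

```python
import numpy as np

def toy_codec(v, k=3, step=0.01):
    """Hypothetical lossy 'compress then decompress': keep the k largest entries, quantize them."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]             # indices of the k largest magnitudes
    out[idx] = step * np.round(v[idx] / step)    # uniform scalar quantization
    return out

def compression_based_gd(A, y, codec, eta=0.8, n_iter=200):
    """Iterate x <- codec(x + eta * A^T (y - A x))."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = codec(x + eta * (A.T @ (y - A @ x)))
    return x

rng = np.random.default_rng(3)
n, m, k = 100, 90, 3
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.choice([-1.0, 1.0], k) * rng.uniform(1.0, 2.0, k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_hat = compression_based_gd(A, A @ x_true, toy_codec)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The final error is limited by the quantization step of the codec, which mirrors the statement that convergence is to a neighborhood controlled by compression distortion.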

A different form of prior adaptation appears in learned sensing matrices. The $\ell_1$-AE framework keeps the encoder linear,

$y=Ax,$

but learns $A$ by unrolling projected subgradient steps for the $\ell_1$ decoder during training. The learned matrix can then be deployed with the original convex decoder

$\min_{x'}\ \|x'\|_1\quad\text{subject to}\quad Ax'=y.$

On structured sparse datasets, this reduces the number of measurements needed for high-quality recovery by a factor of $1.1$ to $3$ compared to previous state-of-the-art methods, while the no-extra-structure synthetic case shows little difference from Gaussian sensing (Wu et al., 2018).

4. Correlated noise, quantization, one-bit sensing, and linearization

A recurrent theme is that departures from ideal measurement assumptions can often be absorbed into modified linear models. When additive measurement noise is linearly correlated with the noiseless measurements, the observation model becomes

$y=Ax+n,\qquad n=(\alpha-1)Ax+w,$

equivalently

$y=\alpha Ax+w,$

with $w$ uncorrelated with $Ax$. Ordinary BPDN then incurs an amplitude bias. A simple correction is to solve standard BPDN and rescale the recovered coefficients by $1/\alpha$. In low-rate quantization experiments, this reduced reconstruction error by up to approximately $7$ dB for $1$ bit/sample Lloyd–Max quantization, and it outperformed BIHT when the number of nonzeros exceeded approximately one tenth of the number of measurements (Arildsen et al., 2013).
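The amplitude bias and its correction can be illustrated numerically. This is a minimal sketch assuming the correlated-noise model reduces to a known gain `alpha` on the noiseless measurements plus an uncorrelated term `w`. Least squares on the true support is used as an oracle stand-in for BPDN, so the example isolates the rescaling step only; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, k, alpha = 120, 60, 6, 0.6
support = rng.choice(n, k, replace=False)
x = np.zeros(n)
x[support] = rng.standard_normal(k) + 3.0
A = rng.standard_normal((m, n)) / np.sqrt(m)
w = 0.01 * rng.standard_normal(m)
y = alpha * (A @ x) + w                      # assumed gain-plus-noise model

# Stand-in for the sparse solver: least squares restricted to the true support (an oracle).
coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
x_biased = np.zeros(n)
x_biased[support] = coef
x_rescaled = x_biased / alpha                # amplitude correction

err_biased = np.linalg.norm(x_biased - x)
err_rescaled = np.linalg.norm(x_rescaled - x)
```

Without rescaling, the estimate is shrunk toward zero by roughly the factor `alpha`; dividing by `alpha` removes that bias at negligible cost.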

One-bit compressed sensing takes the extreme case

$y=\operatorname{sign}(Ax),$

where magnitude is irrecoverable and only direction can be estimated. A central result shows that an $s$-sparse vector can be accurately recovered from the signs of $O(s\log^2(n/s))$ Gaussian linear measurements by a linear program, uniformly over all effectively sparse signals satisfying $\|x\|_1/\|x\|_2\le\sqrt{s}$. The geometry is governed by random hyperplane tessellations of the set of such effectively sparse vectors (Plan et al., 2011).
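Both the loss of magnitude and the recoverability of direction can be checked directly. In the sketch below, the direction estimate back-projects the sign vector and normalizes it; this simple linear estimator is a stand-in chosen for brevity and is not the linear program referred to above. Problem sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, s = 50, 2000, 5
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.choice([-1.0, 1.0], s) * rng.uniform(1.0, 2.0, s)
A = rng.standard_normal((m, n))

y = np.sign(A @ x)                                      # one-bit measurements
same_bits = np.array_equal(y, np.sign(A @ (3.0 * x)))   # positive scaling leaves the signs unchanged

# Simple stand-in estimator (not the linear program): back-project the signs and normalize.
x_dir = A.T @ y
x_dir /= np.linalg.norm(x_dir)
cosine = float(x_dir @ x / np.linalg.norm(x))           # alignment with the true direction
```

For Gaussian rows, the expectation of the back-projection is proportional to $x/\|x\|_2$, which is why the cosine approaches one as the number of measurements grows.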

For general quantized compressed sensing, the ReLU-based convex programs

$A\in\mathbb{R}^{m\times n}$ 1

combine dithered quantization with a convex low-complexity set $\mathcal{T}$. In the one-bit case, the method is robust to adversarial bit corruptions and additive pre-quantization noise; in the multi-bit case it exhibits a clear transition between an above-resolution regime, where recovery behaves like ordinary noisy linear CS, and a below-resolution regime, where the rate-distortion law becomes one-bit-like (Jung et al., 2019).

Phase-only compressed sensing goes further by observing only

$z=\operatorname{sign}(Ax),\qquad \operatorname{sign}(c)=c/|c|\ \text{applied entrywise},$

for a complex Gaussian matrix $A$. Two recent works show that the problem can be reformulated exactly as a linear compressed sensing program

$A\in\mathbb{R}^{m\times n}$ 5

and then solved by basis pursuit-type methods. One establishes uniform instance optimality and robustness to dense disturbances and sparse corruptions after linearization; the other derives asymptotically precise phase transitions for sparse vectors and low-rank matrices, proving that phase-only measurements can require fewer samples than traditional linear compressed sensing. For a $1$-sparse signal in sufficiently large dimension, the phase-only threshold is approximately $68\%$ of the linear-CS threshold, disproving the earlier conjecture that the two transitions coincide (Chen et al., 2024, Chen et al., 21 Jan 2025).

5. Algorithms, online solvers, and scalable acquisition

Linear compressed sensing has also evolved through algorithmic frameworks designed for sequential data, large dimensions, or hardware constraints. Sparse Kaczmarz and linearized Bregman methods solve

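$\min_{x}\ \lambda\|x\|_1+\tfrac12\|x\|_2^2\quad\text{subject to}\quad Ax=y$

by row or block actions. In the single-row sparse Kaczmarz iteration, with $a_i$ the selected row of $A$ and $S_\lambda$ the soft-thresholding operator,

$z^{k+1}=z^{k}-\frac{\langle a_{i},x^{k}\rangle-y_{i}}{\|a_{i}\|_2^2}\,a_{i},\qquad x^{k+1}=S_{\lambda}(z^{k+1}),$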

soft-thresholding is built directly into the projection geometry. The block limit recovers linearized Bregman, and the online setting allows one to update the estimate as new measurements arrive, using residual jumps as a stopping criterion for acquisition (Lorenz et al., 2014).
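A cyclic version of the single-row iteration can be sketched as follows. The cyclic row order, the value of `lam`, the number of sweeps, and the problem sizes are assumptions of this example; the cited work also covers randomized, block, and online variants that are not shown.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise shrinkage toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_kaczmarz(A, y, lam, sweeps=1000):
    """Cyclic single-row sparse Kaczmarz for min lam*||x||_1 + 0.5*||x||_2^2 s.t. A x = y."""
    m, n = A.shape
    row_norms_sq = np.sum(A * A, axis=1)
    z = np.zeros(n)                              # auxiliary (dual-like) variable
    x = np.zeros(n)                              # primal iterate, x = S_lam(z)
    for _ in range(sweeps):
        for i in range(m):
            z -= (A[i] @ x - y[i]) / row_norms_sq[i] * A[i]
            x = soft_threshold(z, lam)
    return x

rng = np.random.default_rng(6)
n, m, k = 60, 30, 3
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.choice([-1.0, 1.0], k) * rng.uniform(1.0, 2.0, k)
A = rng.standard_normal((m, n))
y = A @ x_true
x_hat = sparse_kaczmarz(A, y, lam=2.0)
rel_residual = np.linalg.norm(A @ x_hat - y) / np.linalg.norm(y)
```

Each update touches one row of `A`, so new measurements can be appended and processed as they arrive, and the relative residual can be monitored as a stopping signal.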

Relaxed belief propagation offers a probabilistic alternative. For the linear mixing model

$z=Ax,\qquad y_i\sim p_{Y\mid Z}(\cdot\mid z_i),$

it replaces full message distributions by means and variances and tracks asymptotic performance through state evolution. The method extends beyond AWGN output channels and applies to bounded-noise compressed sensing while preserving the same asymptotic large sparse limit behavior as standard BP (Rangan, 2010).

At the heuristic end of the spectrum, the alternating $\ell_1$ method starts from basis pursuit and alternates between threshold-based support updates and constrained $\ell_1$-minimization. In the reported noiseless experiments, it achieved higher exact support recovery rates than reweighted $\ell_1$ and IRLS (Chretien, 2008).

Scalable acquisition architectures adapt the linear measurement operator to image geometry. Progressive row-wise compressed sensing with linear prediction reconstructs each row from its own measurements and a predictor formed from adjacent rows; because the measurement residual remains linear,

$y_i-A_i x_i^{\mathrm p}=A_i\,(x_i-x_i^{\mathrm p}),$

with $x_i^{\mathrm p}$ the predicted row, the residual itself can be recovered by standard $\ell_1$-based inversion. The paper reports lower computational complexity for the iterative row-wise method than for full-image CS (Coluccia et al., 2013).

For low-power imaging, line-based compressed sensing senses each image line with the same operator,

$y_i=Ax_i,$

and reconstructs the image with TV-AL3. The reported results show PSNR gains over a conventional CS baseline across the tested sampling rates, together with reduced encoder complexity (Ebrahim et al., 2019). In block-based sensing, a fully connected network can implement both the blockwise linear encoder and a nonlinear decoder; at a $25\%$ sensing rate the reported average PSNR gain is $0.77$ dB and computation time is over $200$ times faster than prior block-CS methods (Adler et al., 2016).

6. Deterministic constructions, information-theoretic limits, and recurring debates

One major line of work asks whether random Gaussian matrices are essential. A rigorous bridge between channel-coding LP decoding and basis pursuit shows that if a zero-one measurement matrix is viewed simultaneously as a real sensing matrix and as a binary parity-check matrix, then pseudo-codeword guarantees for the code imply nullspace-property guarantees for compressed sensing. High-girth LDPC parity-check matrices constructed by Gallager thereby become deterministic compressed sensing matrices with an order-optimal number of rows in the linear sparsity regime (Dimakis et al., 2010).

Another debate concerns how much prior information can reduce sampling. For arbitrary sparse vectors in the universal setting, where the sensing matrix is independent of the sparse basis, the sampling rate-distortion function satisfies

$y=Ax+n$ 08

so allowing approximate support recovery does not reduce the asymptotic measurement rate. For random sparse vectors with known distribution, reductions are possible in some cases—most strikingly for discrete-valued nonzeros, where one sample suffices in the noiseless asymptotic model—but in many continuous-distribution cases no reduction is possible (Reeves et al., 2010).

By contrast, the Bayesian phase-transition theory of discrete-continuous mixtures shows that the fundamental linear measurement threshold is the information dimension $d(X)$, and that random Gaussian linear sensing is already optimal at the threshold level relative to optimal nonlinear encoding (Wu et al., 2011). This sharply separates two questions that are sometimes conflated: whether more prior information helps, and whether randomness in the sensing matrix is itself wasteful.

A plausible implication is that “optimality” in linear compressed sensing depends on what is being held fixed: universality of the matrix, source model, distortion criterion, and computational class of the decoder. Deterministic constructions, compression-based decoders, learned measurement matrices, and linearized nonlinear measurement models all modify one of these axes without changing the central object of study—a structured signal observed through linear projections.