AMP: Iterative Algorithms for High-Dimensional Inference
- Approximate Message Passing (AMP) is an iterative algorithm whose updates include an Onsager correction term and whose performance on high-dimensional sparse recovery tasks is precisely predicted by a scalar state evolution recursion.
- Rooted in statistical physics and belief propagation, AMP provides scalable solutions for compressed sensing, linear regression, and matrix inference by tracking empirical errors through a scalar recursion.
- Extensions such as OAMP, VAMP, and SwAMP enhance AMP's robustness and convergence for ill-conditioned, structured, or non-i.i.d. matrices, broadening its applicability in advanced inference.
Approximate Message Passing (AMP) is a class of iterative algorithms for high-dimensional inference, prominently used in linear regression, compressed sensing, statistical physics, and related areas. AMP emerges as an analytically tractable and computationally efficient scheme that, in the large-system limit and under certain model conditions, exhibits dynamics precisely tracked by so-called "state evolution" recursions. These properties make AMP valuable for both theoretical understanding and practical signal recovery in high dimensions (0911.4219, Ma et al., 2017).
1. Foundations and Derivation
AMP is rooted in the analysis of dense graphical models, especially sum-product belief propagation (BP) on high-dimensional linear systems
$$y = A x + w,$$
where $y \in \mathbb{R}^m$, $A \in \mathbb{R}^{m \times n}$ (typically with i.i.d. $\mathcal{N}(0, 1/m)$ entries), $x \in \mathbb{R}^n$ is the signal, and $w$ is noise. AMP's iteration, discussed initially for compressed sensing (0911.4219), is:
$$x^{t+1} = \eta_t\!\left(x^t + A^{\top} z^t\right), \qquad z^t = y - A x^t + \frac{1}{\delta}\, z^{t-1} \left\langle \eta_{t-1}'\!\left(x^{t-1} + A^{\top} z^{t-1}\right) \right\rangle,$$
where $\delta = m/n$, $\eta_t$ is a (possibly nonlinear) "denoiser" tailored to the prior on $x$, and $\langle \cdot \rangle$ indicates the empirical mean. The Onsager (memory) term $\tfrac{1}{\delta}\, z^{t-1} \langle \eta_{t-1}'(\cdot) \rangle$ in the residual update is essential: it corrects for correlations that arise due to the dense factor graph structure, ensuring that the effective observations at each AMP step remain asymptotically Gaussian and uncorrelated (0911.4219).
The derivation can be viewed as a TAP (Thouless–Anderson–Palmer) expansion in the statistical mechanics literature, with the Onsager reaction term providing self-consistency (0911.4219).
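For concreteness, the following is a minimal NumPy sketch of the iteration above with a soft-thresholding denoiser; the threshold rule $\theta_t = \alpha \hat\tau_t$ and the empirical noise estimate $\hat\tau_t = \|z^t\|/\sqrt{m}$ are common but illustrative choices, not prescriptions of the cited works.

```python
import numpy as np

def soft_threshold(u, theta):
    """Coordinate-wise soft-thresholding denoiser eta(u; theta)."""
    return np.sign(u) * np.maximum(np.abs(u) - theta, 0.0)

def amp(y, A, n_iter=30, alpha=1.5):
    """Minimal AMP for y = A x + w with a soft-thresholding denoiser.

    The threshold alpha * tau_t, with tau_t estimated as ||z^t|| / sqrt(m),
    is a common but illustrative tuning rule (an assumption of this sketch).
    """
    m, n = A.shape
    delta = m / n
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iter):
        tau = np.linalg.norm(z) / np.sqrt(m)        # empirical effective-noise level
        u = x + A.T @ z                              # pseudo-data, approx. x + tau * N(0, I)
        x_new = soft_threshold(u, alpha * tau)
        # Onsager correction: empirical mean of the denoiser derivative at u
        onsager = np.mean(np.abs(x_new) > 0.0) / delta
        z = y - A @ x_new + z * onsager              # residual update with memory term
        x = x_new
    return x
```

Dropping the `z * onsager` memory term reduces this to plain iterative soft-thresholding, which empirically converges more slowly and is not tracked by the state evolution discussed next.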
2. State Evolution and Performance Prediction
A defining property of AMP is that, for large i.i.d. sub-Gaussian $A$, the evolution of any pseudo-Lipschitz functional of the iterates, including the empirical mean squared error (MSE), is exactly tracked by a scalar recursion dubbed "state evolution" (SE) (0911.4219, Ma et al., 2017, Rush et al., 2016). For a coordinate-separable denoiser ($\eta_t$ acting componentwise), SE takes the form:
$$\tau_{t+1}^2 = \sigma^2 + \frac{1}{\delta}\, \mathbb{E}\!\left[\left(\eta_t(X + \tau_t Z) - X\right)^2\right],$$
with $X \sim p_X$ (the prior) and $Z \sim \mathcal{N}(0,1)$ independent of $X$. This recursion predicts the per-iteration MSE and allows exact performance characterization of AMP in the high-dimensional limit (0911.4219, Rush et al., 2016).
Rigorous finite-sample analyses provide exponential concentration of the empirical error about the SE prediction for up to an order of $\log n / \log\log n$ iterations (Rush et al., 2016).
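The SE recursion can be evaluated numerically by Monte Carlo integration over the prior; the sketch below assumes a Bernoulli–Gaussian prior and a soft-thresholding denoiser with threshold $\alpha\tau_t$, both illustrative choices.

```python
import numpy as np

def state_evolution(delta, sigma2, eps=0.1, alpha=1.5, n_iter=30, n_mc=200_000, seed=0):
    """Scalar state-evolution recursion
        tau_{t+1}^2 = sigma^2 + (1/delta) E[(eta_t(X + tau_t Z) - X)^2],
    evaluated by Monte Carlo for X ~ Bernoulli(eps) x N(0, 1), Z ~ N(0, 1),
    and eta_t = soft thresholding at alpha * tau_t (illustrative choices)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal(n_mc) * (rng.random(n_mc) < eps)   # Bernoulli-Gaussian draws
    Z = rng.standard_normal(n_mc)
    tau2 = sigma2 + np.mean(X**2) / delta                      # initialization from x^0 = 0
    history = [tau2]
    for _ in range(n_iter):
        u = X + np.sqrt(tau2) * Z
        theta = alpha * np.sqrt(tau2)
        eta = np.sign(u) * np.maximum(np.abs(u) - theta, 0.0)
        tau2 = sigma2 + np.mean((eta - X) ** 2) / delta
        history.append(tau2)
    return np.array(history)
```

The returned $\tau_t^2$ trajectory can be compared directly with the empirical effective-noise estimate $\|z^t\|^2/m$ produced by the AMP sketch in Section 1.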
3. Denoiser Structure: Separable and Non-Separable Variants
AMP classically employs separable denoisers, appropriate when the signal prior factorizes over coordinates. However, real signals often exhibit dependencies (e.g., Markov, MRF, or hierarchical structures). Extensions to non-separable and sliding-window denoisers have been introduced, allowing AMP to exploit local dependencies and structured priors:
- Sliding-window denoisers: At each coordinate, the denoiser output may depend on a window of $2k+1$ neighboring entries. For signals with local statistical dependencies, e.g., images or binary Markov chains, sliding-window denoisers enable substantial error reduction, and the SE recursion is accordingly adapted by integrating over local block distributions (Ma et al., 2017, Ma et al., 2019); a minimal sketch of such a denoiser is given after this list.
- Hierarchical/deep denoisers: The denoiser may incorporate latent-variable models, such as Restricted Boltzmann Machines, enabling AMP to capture complex support correlations in compressed sensing (Tramel et al., 2015).
- Block-separable and matrix/tensor AMP: For structured estimation problems (e.g., matrix and tensor decomposition), AMP updates, denoisers, and state evolution can be formulated in block or matrix terms, still closely paralleling the scalar case (Tan et al., 2023, Rossetti et al., 2023).
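As a concrete (deliberately small) example of a sliding-window denoiser, the sketch below computes the windowed posterior mean for a $\pm 1$ Markov-chain signal observed in Gaussian noise; the flip probability, window half-width $k$, and the separable boundary fallback are assumptions of this sketch rather than choices fixed by the cited works.

```python
import numpy as np
from itertools import product

def markov_window_denoiser(u, tau, k=1, p_flip=0.1):
    """Sliding-window posterior-mean denoiser for a +/-1 Markov-chain signal
    observed as u = x + tau * noise.  Each interior coordinate i is estimated
    by E[x_i | u_{i-k}, ..., u_{i+k}], summing over all 2^(2k+1) window
    patterns.  The flip probability and window half-width are illustrative."""
    n, w = len(u), 2 * k + 1
    patterns = np.array(list(product([-1.0, 1.0], repeat=w)))     # (2^w, w)
    # log prior of each window pattern under the symmetric Markov chain
    log_prior = np.zeros(len(patterns))
    for j in range(1, w):
        same = patterns[:, j] == patterns[:, j - 1]
        log_prior += np.where(same, np.log(1.0 - p_flip), np.log(p_flip))
    xhat = np.tanh(u / tau**2)          # separable posterior mean, used at the boundaries
    for i in range(k, n - k):
        window = u[i - k:i + k + 1]
        log_post = log_prior - 0.5 * np.sum((window - patterns) ** 2, axis=1) / tau**2
        wts = np.exp(log_post - log_post.max())
        xhat[i] = (wts @ patterns[:, k]) / wts.sum()              # posterior mean of the center entry
    return xhat
```

Within AMP, this function would play the role of $\eta_t$, with $\tau$ set to the current effective-noise estimate.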
AMP continues to admit rigorous SE analysis for many of these cases, provided the model satisfies appropriate conditions (e.g., local dependence, Dobrushin uniqueness, Lipschitz denoisers) (Ma et al., 2017, Ma et al., 2019).
4. Robustness, Generalizations, and Algorithmic Extensions
While SE is exact for dense i.i.d. $A$, AMP is known to diverge or become inaccurate for "difficult" matrices: those with ill-conditioned spectra, non-zero mean, or correlated columns (Guo et al., 2015, Rangan et al., 2016). Multiple algorithmic innovations have been developed to address this:
- Orthogonal (OAMP), Vector (VAMP), and Unitarily Invariant AMP: These variants enforce explicit orthogonality or use Gram–Schmidt (GS) principles to decorrelate errors, restoring state evolution and convergence for $A$ drawn from right-orthogonally invariant ensembles. VAMP and OAMP match the Bayes-optimal performance (predicted by the replica method) whenever the fixed point is unique (Rangan et al., 2016, Liu et al., 2022); a minimal VAMP-style sketch is given after this list.
- UT-AMP: Applies a unitary transformation (e.g., SVD, DFT) to the linear system, maintaining convergence guarantees for any $A$ under Gaussian priors (Guo et al., 2015).
- Memory AMP (MAMP): Enforces full orthogonality for estimation errors by using long-memory matched filters and vector damping, lowering complexity compared to OAMP/VAMP, and recovering Bayes-optimality across all right-unitarily-invariant matrices (Liu et al., 2020, Liu et al., 2021).
- Swept AMP (SwAMP): Uses sequential ("swept") coefficient-wise updates instead of parallel updates, improving robustness and convergence for non-i.i.d. or structured sensing matrices (Manoel et al., 2014).
- Generalized and Non-Symmetric AMP: Recent advances extend AMP analysis to non-symmetric random matrices, sparse ensembles, and matrices with general variance/covariance structures. The Onsager term and SE are generalized to accommodate local variance/correlation profiles (Gueddari et al., 26 Mar 2025).
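The VAMP-style sketch referenced above couples a denoising stage and an LMMSE stage through extrinsic means and precisions. The Bernoulli–Gaussian MMSE denoiser, the crude initialization, and the precision clipping below are illustrative assumptions, and practical implementations typically add damping.

```python
import numpy as np

def bg_denoiser(eps=0.1):
    """Posterior-mean (MMSE) denoiser for a Bernoulli-Gaussian prior
    x ~ eps * N(0, 1) + (1 - eps) * delta_0, given pseudo-data r = x + noise
    with noise precision gamma.  Returns the estimate and its average
    divergence (the prior is an illustrative choice)."""
    def denoise(r, gamma):
        s2 = 1.0 / gamma                                   # effective noise variance
        # posterior probability that each coordinate is "active"
        log_act = -0.5 * r**2 / (1.0 + s2) - 0.5 * np.log(1.0 + s2) + np.log(eps)
        log_in = -0.5 * r**2 / s2 - 0.5 * np.log(s2) + np.log(1.0 - eps)
        pi = 1.0 / (1.0 + np.exp(log_in - log_act))
        m, v = r / (1.0 + s2), s2 / (1.0 + s2)             # active-component posterior moments
        xhat = pi * m
        post_var = pi * (v + m**2) - xhat**2
        alpha = gamma * np.mean(post_var)                  # average divergence of the denoiser
        return xhat, alpha
    return denoise

def vamp(y, A, denoise, noise_var, n_iter=20, gamma_min=1e-8):
    """Minimal sketch of a VAMP-style iteration: a denoising stage and an
    LMMSE stage exchange extrinsic means and precisions.  Damping and other
    numerical safeguards used in practice are omitted; this is an
    illustrative sketch, not a reference implementation."""
    _, n = A.shape
    AtA, Aty = A.T @ A, A.T @ y
    gamma_w = 1.0 / noise_var
    r1, gamma1 = A.T @ y, 1.0                              # crude initialization (assumption)
    x1 = np.zeros(n)
    for _ in range(n_iter):
        # Denoising stage
        x1, alpha1 = denoise(r1, gamma1)
        eta1 = gamma1 / max(alpha1, gamma_min)
        gamma2 = max(eta1 - gamma1, gamma_min)
        r2 = (eta1 * x1 - gamma1 * r1) / gamma2
        # LMMSE stage
        cov = np.linalg.inv(gamma_w * AtA + gamma2 * np.eye(n))
        x2 = cov @ (gamma_w * Aty + gamma2 * r2)
        alpha2 = gamma2 * np.trace(cov) / n
        eta2 = gamma2 / max(alpha2, gamma_min)
        gamma1 = max(eta2 - gamma2, gamma_min)
        r1 = (eta2 * x2 - gamma2 * r2) / gamma1
    return x1
```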
AMP is thereby applicable in a broad range of contexts, from standard compressed sensing and statistical regression to inference over graphical, block, or tensor structures.
5. Universality and Theoretical Guarantees
The universality phenomenon—AMP's SE and performance holding beyond the classical i.i.d. regime—has been rigorously established under broader assumptions:
- Semi-random/rotationally invariant ensembles: AMP state evolution applies to matrices whose singular vectors are generic/delocalized (e.g., partial Fourier, signed Sine, random orthogonal), provided sufficient randomness in eigenstructure (Dudeja et al., 2022).
- Non-symmetric matrices: Enhanced combinatorial/probabilistic techniques allow density evolution analyses for AMP with general non-symmetric, possibly sparse, or correlated random matrices (Gueddari et al., 26 Mar 2025).
The primary technical tools include combinatorial expansions (non-backtracking trees), conditioning on Gaussian measures, and moment method arguments for showing Gaussianity and decoupling of errors (0911.4219, Ma et al., 2017, Gueddari et al., 26 Mar 2025).
6. Applications and Algorithmic Ecosystem
AMP and its variants support a wide array of applications:
- Compressed sensing and sparse estimation: AMP achieves state-of-the-art phase transitions and MMSE recovery for sparse signals, often exceeding standard iterative thresholding, $\ell_1$ minimization, and EM approaches (0911.4219, Mondelli et al., 2020, Tramel et al., 2015).
- Statistical physics models: AMP algorithms solve mean-field and TAP equations in spin glasses, inference in stochastic block models, and dynamics of complex random systems (0911.4219, Dudeja et al., 2022).
- Multi-processor and distributed inference: Specialized AMP variants operate in row- and column-partitioned data settings, with state evolution characterizing their behavior under both lossless and lossy message-passing architectures (Zhu et al., 2017).
- Matrix and tensor inference: AMP-based algorithms provide efficient, SE-tracked solutions to matrix/tensor decomposition, mixed regression, and contextual clustering problems under both symmetric and non-symmetric observation models (Tan et al., 2023, Rossetti et al., 2023).
- Noisy linear models (GLMs): GAMP extends AMP to generalized linear models, supporting phase retrieval and logistic regression with precise asymptotic characterizations (Mondelli et al., 2020).
Recent work demonstrates that the entire class of low-degree AMP algorithms can be efficiently simulated by semidefinite programs (SDPs) under principled hierarchies (local statistics, SoS), even in the presence of certain adversarial perturbations (Ivkov et al., 2023).
7. Outlook and Future Directions
AMP's analytical tractability and empirical performance underpin its ongoing relevance in high-dimensional inference and statistical learning. Key directions include:
- Extensions to deeper, learned, or adaptive denoisers: Incorporating neural and deep generative priors for signal models beyond hand-specified distributions (Tramel et al., 2015).
- Dynamical and non-stationary models: Developing state evolution for time-varying or "online" AMP variants (Gueddari et al., 26 Mar 2025).
- Statistical–computational gaps and phase transitions: Precisely delineating where AMP achieves Bayes-optimality and where algorithmic inapproximability arises.
- Universality and robustness: Expanding rigorous theory to further classes and practical randomness/non-randomness in high-dimensional designs (Dudeja et al., 2022, Gueddari et al., 26 Mar 2025).
- Robustness via convex relaxations: Bridging AMP and SDP/information-theoretic algorithms for robust, certifiable reconstruction under adversarial perturbations (Ivkov et al., 2023).
AMP thus forms a central thread through contemporary research in high-dimensional inference, blending insights from probability, statistical physics, convex optimization, and machine learning (0911.4219, Ma et al., 2017, Rush et al., 2016, Rangan et al., 2016, Liu et al., 2022, Gueddari et al., 26 Mar 2025, Ivkov et al., 2023).