Diagonal-Plus-Low-Rank (DPLR) Structure
- DPLR structure is a matrix decomposition that represents a matrix as the sum of a diagonal and a low-rank matrix, enabling efficient computations.
- It underpins various applications such as factor analysis, covariance estimation, and recommendation systems by linking geometric and optimization formulations.
- Fast algorithms like MTFA and sketching methods leverage DPLR properties to dramatically reduce computational complexity in large-scale problems.
A Diagonal-Plus-Low-Rank (DPLR) structure refers to matrix models, algorithms, and computational frameworks in which a target matrix is structured or approximated as the sum of a diagonal matrix and a low-rank matrix. This decomposition arises in a variety of domains, including matrix approximation, large-scale inference, factor analysis, covariance estimation, scientific computing, recommender systems, and fast numerical linear algebra. The DPLR structure underpins both statistical modeling—e.g., representing factor-plus-uniqueness models—and computational efficiency by leveraging fast matrix operations and convex relaxations. Modern research has established rigorous equivalence between DPLR decomposition, the facial structure of the elliptope, ellipsoid fitting, and optimization-based recovery schemes, with broad algorithmic and theoretical implications (Saunderson et al., 2012, Yeon et al., 18 Dec 2025, Fernandez et al., 28 Sep 2025, Wu et al., 2018).
1. Mathematical Definition and Modeling Contexts
Given $M \in \mathbb{R}^{n \times n}$ (typically symmetric), a DPLR decomposition is

$$M = D + L,$$

where $D$ is diagonal and $L$ is (preferably) low-rank and positive semidefinite. For non-symmetric cases, $L$ may be a general low-rank matrix. In variant settings:
- $L = FF^\top$, $F \in \mathbb{R}^{n \times r}$ with $r \ll n$, where $F$ encodes the low-rank factor.
- In field-wise modeling, such as factorization machines, the field interaction matrix is parameterized as $D + UU^\top$ with $D$ diagonal and $UU^\top$ low-rank (Shtoff et al., 2024).
- For operator-theoretic and sketching settings, $A = D + L$ with $A$ accessible only through matrix-vector products, $D$ diagonal, and $L$ low-rank (Fernandez et al., 28 Sep 2025).
DPLR is central in factor analysis (explaining shared variance via $L$ and idiosyncratic variance via $D$), high-dimensional covariance/precision estimation, and modern randomized NLA, where it offers both expressive statistical power and computational tractability (Saunderson et al., 2012, Wu et al., 2018, Yeon et al., 18 Dec 2025).
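As a concrete illustration of the structure and its computational payoff, the following minimal NumPy sketch (all names hypothetical) assembles $M = D + FF^\top$ and applies it to a vector in $O(nr)$ time without ever forming $M$ densely:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 1000, 5

# Low-rank factor F (n x r) and a positive diagonal d.
F = rng.standard_normal((n, r))
d = rng.uniform(1.0, 2.0, size=n)

# Dense reference M = D + F F^T (only for the check below).
M = np.diag(d) + F @ F.T

def dplr_matvec(d, F, x):
    """Structured mat-vec (D + F F^T) x in O(nr) time."""
    return d * x + F @ (F.T @ x)

x = rng.standard_normal(n)
assert np.allclose(M @ x, dplr_matvec(d, F, x))
```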
2. Convex Formulations and Recovery Algorithms
A core approach to DPLR decomposition, especially when $L$ is positive semidefinite, is Minimum Trace Factor Analysis (MTFA) (Saunderson et al., 2012):

$$\min_{D,\,L}\ \operatorname{tr}(L) \quad \text{subject to} \quad \Sigma = D + L,\quad L \succeq 0,\quad D \text{ diagonal}.$$

By relaxing the nonconvex rank function to the trace, this program becomes an SDP. The dual of MTFA optimizes over the elliptope $\mathcal{E}_n = \{Y \succeq 0 : Y_{ii} = 1\ \forall i\}$, linking DPLR recovery to the geometry of correlation matrices.
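The primal MTFA program translates directly into a small SDP; a minimal CVXPY sketch, assuming a given symmetric $\Sigma$ (in the factor-analysis setting a nonnegativity constraint on the diagonal can be added):

```python
import cvxpy as cp
import numpy as np

def mtfa(Sigma):
    """Minimum Trace Factor Analysis:
    min tr(L)  s.t.  Sigma = D + L,  L PSD,  D diagonal."""
    n = Sigma.shape[0]
    L = cp.Variable((n, n), PSD=True)
    d = cp.Variable(n)                      # diagonal entries of D
    prob = cp.Problem(cp.Minimize(cp.trace(L)),
                      [Sigma == cp.diag(d) + L])
    prob.solve()
    return np.diag(d.value), L.value
```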
Further, in the low-rank-plus-sparse setting—of which DPLR is the diagonal-sparse specialization—projected gradient descent with double thresholding yields linear convergence under restricted strong convexity/smoothness and structural Lipschitz gradient conditions (Zhang et al., 2017).
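As a schematic instance of this iteration, specialized to the diagonal-sparse case with the least-squares loss $\tfrac{1}{2}\|Y - D - L\|_F^2$ (step size and iteration count are illustrative, not from the source):

```python
import numpy as np

def dplr_pgd(Y, r, step=0.5, iters=200):
    """Projected gradient descent with double thresholding (schematic):
    min 0.5*||Y - D - L||_F^2 over diagonal D and rank-<=r L, Y symmetric."""
    n = Y.shape[0]
    D = np.zeros((n, n))
    L = np.zeros((n, n))
    for _ in range(iters):
        R = Y - D - L                       # shared negative gradient
        D = np.diag(np.diag(D + step * R))  # gradient step + diagonal projection
        w, V = np.linalg.eigh(L + step * R) # gradient step + rank-r thresholding
        top = np.argsort(np.abs(w))[-r:]
        L = (V[:, top] * w[top]) @ V[:, top].T
    return D, L
```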
For explicit matrix approximation, the “Alt” algorithm alternates between low-rank projection (via truncated eigendecomposition or Nyström sketching) and diagonal update (Yeon et al., 18 Dec 2025). Heuristic or model-driven penalty formulations can select the effective rank in applications such as high-dimensional covariance selection (Wu et al., 2018).
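A dense-matrix sketch of the alternating scheme follows (the update rules here are a plausible instantiation; the randomized variant in (Yeon et al., 18 Dec 2025) replaces the full eigendecomposition with a Nyström sketch):

```python
import numpy as np

def alt_dplr(M, r, iters=50):
    """Alternating DPLR fit of a symmetric M: best PSD rank-r part of
    M - D via truncated eigendecomposition, then exact diagonal update."""
    d = np.zeros(M.shape[0])
    for _ in range(iters):
        w, V = np.linalg.eigh(M - np.diag(d))
        w = np.clip(w, 0.0, None)           # keep the PSD part
        top = np.argsort(w)[-r:]
        L = (V[:, top] * w[top]) @ V[:, top].T
        d = np.diag(M) - np.diag(L)         # minimizes ||M - D - L||_F over D
    return np.diag(d), L
```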
For matrix operators accessible only through matrix-vector products, convex-sketching-based approaches like Sketchlord solve

$$\min_{D,\,L}\ \|L\|_{*} \quad \text{subject to} \quad (D + L)\,\Omega = A\,\Omega,\quad D \text{ diagonal},$$

to jointly recover $D$ and $L$ from the sketch $A\,\Omega$ (Fernandez et al., 28 Sep 2025).
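A schematic CVXPY rendering of this program; the exact Sketchlord formulation, regularization, and solver choices are assumptions here:

```python
import cvxpy as cp
import numpy as np

def sketch_recover(Y, Omega):
    """Jointly recover (D, L) from a sketch Y = A @ Omega of A ~ D + L,
    via nuclear-norm minimization subject to sketch consistency."""
    n = Omega.shape[0]
    L = cp.Variable((n, n))
    d = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(cp.normNuc(L)),
                      [(cp.diag(d) + L) @ Omega == Y])
    prob.solve()
    return np.diag(d.value), L.value
```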
3. Structural Identifiability and Theoretical Characterization
DPLR uniqueness and recovery depend critically on the interaction of the low-rank subspace and the support of the diagonal. For MTFA, the coherence of the column space $\mathcal{U}$ of $L$,

$$\mu(\mathcal{U}) = \max_i \|P_{\mathcal{U}}\, e_i\|_2^2,$$

provides a sharp threshold: if $\mu(\mathcal{U}) < 1/2$, then for any diagonal $D$ and positive semidefinite $L$ with column space $\mathcal{U}$, the pair $(D, L)$ is the unique optimum (Saunderson et al., 2012). This result is tight.
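The coherence is straightforward to compute from any basis of the subspace; a minimal NumPy helper (function name illustrative):

```python
import numpy as np

def coherence(U):
    """Coherence mu(U) = max_i ||P_U e_i||_2^2 of span(U):
    squared row norms of an orthonormal basis Q."""
    Q, _ = np.linalg.qr(U)
    return float(np.max(np.sum(Q**2, axis=1)))
```

Per the threshold above, a value below 1/2 certifies exact MTFA recovery for any DPLR pair whose low-rank column space equals span(U).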
A fundamental equivalence exists between:
- Exact DPLR recovery by MTFA,
- The realizability of a subspace as a face of the elliptope,
- The existence of a centered ellipsoid passing exactly through prescribed points (ellipsoid fitting).
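The third equivalence is itself an SDP feasibility problem: given points $v_1, \dots, v_k \in \mathbb{R}^r$, decide whether some $M \succeq 0$ satisfies $v_i^\top M v_i = 1$ for all $i$. A hedged CVXPY sketch:

```python
import cvxpy as cp
import numpy as np

def ellipsoid_fit(V):
    """Feasibility: is there a centered ellipsoid {x : x^T M x = 1},
    M PSD, passing exactly through the rows of V (shape k x r)?"""
    r = V.shape[1]
    M = cp.Variable((r, r), PSD=True)
    # v^T M v = trace(M v v^T), written as an affine expression in M.
    constraints = [cp.sum(cp.multiply(M, np.outer(v, v))) == 1 for v in V]
    prob = cp.Problem(cp.Minimize(0), constraints)
    prob.solve()
    return prob.status == cp.OPTIMAL
```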
In the sketching context, Sketchlord’s convex relaxation achieves exact recovery under appropriate injectivity and rank conditions on the randomized sketch (Fernandez et al., 28 Sep 2025).
Statistical consistency is established for covariance/precision estimation under DPLR models, with convergence rates depending on the dimension and the sample size; blockwise coordinate descent converges to the global optimum under mild conditions (Wu et al., 2018).
4. Computational Complexity and Fast Algorithms
DPLR structure enables core matrix operations to be performed far more efficiently than with generic dense approaches. The Sherman–Morrison–Woodbury (SMW) formula gives the inverse

$$(D + UV^\top)^{-1} = D^{-1} - D^{-1}U\,(I_r + V^\top D^{-1}U)^{-1}V^\top D^{-1},$$

with determinant (matrix determinant lemma)

$$\det(D + UV^\top) = \det(D)\,\det(I_r + V^\top D^{-1}U).$$

Matrix–vector multiplication is $O(nr)$, and matrix solves and inversion cost $O(nr^2)$, making DPLR practical for large $n$ when $r$ is moderate (Chandrasekaran et al., 2018).
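A direct NumPy rendering of these identities, keeping every intermediate at size $n \times r$ or $r \times r$ (names hypothetical):

```python
import numpy as np

def smw_solve(d, U, V, b):
    """Solve (D + U V^T) x = b with D = diag(d), via Sherman-Morrison-Woodbury."""
    r = U.shape[1]
    Dinv_b = b / d
    Dinv_U = U / d[:, None]
    C = np.eye(r) + V.T @ Dinv_U            # r x r capacitance matrix
    return Dinv_b - Dinv_U @ np.linalg.solve(C, V.T @ Dinv_b)

def dplr_logdet(d, U, V):
    """log|det(D + U V^T)| via the matrix determinant lemma."""
    C = np.eye(U.shape[1]) + V.T @ (U / d[:, None])
    return np.sum(np.log(np.abs(d))) + np.linalg.slogdet(C)[1]
```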
Randomized and sketch-based methods further reduce complexity. For an operator $A$ accessed via matrix-vector products, forming the sketch $A\,\Omega$ with $\Omega \in \mathbb{R}^{n \times s}$ and optimizing over this low-dimensional sketch leads to per-iteration cost polynomial in $s$ with $s \ll n$ (Fernandez et al., 28 Sep 2025). In large-scale covariance estimation, randomized Alt alternates between sketched low-rank recovery and stochastic diagonal estimation, achieving error bounds and convergence within a matrix-vector budget far smaller than $n$ (Yeon et al., 18 Dec 2025).
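The stochastic diagonal estimation step can be instantiated with a Hutchinson-type estimator that touches the operator only through matrix-vector products; a minimal sketch (probe count illustrative; the estimator is unbiased for Rademacher probes):

```python
import numpy as np

def estimate_diagonal(matvec, n, num_probes=64, rng=None):
    """Hutchinson-type estimator: diag(A) ~ mean_k z_k * (A z_k)
    for Rademacher probes z_k, using only mat-vec access to A."""
    rng = rng or np.random.default_rng()
    acc = np.zeros(n)
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        acc += z * matvec(z)
    return acc / num_probes
```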
Blockwise coordinate descent for DPLR precision/covariance estimation has per-iteration cost dominated by one $n \times n$ eigendecomposition plus an inexpensive diagonal SDP (Wu et al., 2018). In Riccati-like ODEs, dynamical low-rank-plus-diagonal approximation yields $O(nr^2)$ per-timestep complexity for dimension $n$ and rank $r$ (Bonnabel et al., 2024).
5. Applications in Statistical Modeling, Machine Learning, and Scientific Computing
DPLR models underlie several major modeling and algorithmic paradigms:
- Factor Analysis and Correlation Models: Decomposition of covariance matrices into shared factor structure plus unique variances is naturally encoded in DPLR (Saunderson et al., 2012).
- High-Dimensional Covariance Estimation: DPLR structure yields improved Kullback-Leibler loss and Sharpe ratios in finance and portfolio optimization (Wu et al., 2018).
- Recommendation Systems: Field-weighted factorization machines with DPLR interaction matrices yield efficient and accurate inference at large scale, outperforming heuristic-pruned models (Shtoff et al., 2024).
- Large-Scale Operator Compression: DPLR sketching algorithms enable high-fidelity surrogates for deep learning Hessians and scientific computing operators, outperforming sequential or pure low-rank/diagonal approaches (Fernandez et al., 28 Sep 2025).
- Matrix Differential Equations: Dynamical DPLR approximations guarantee full rank and tractable inversion in high-dimensional Riccati and Kalman filtering flows, outperforming pure low-rank methods (Bonnabel et al., 2024).
- Polynomial Eigenvalue Problems: Hessenberg reduction for DPLR matrices via quasiseparable technology accelerates eigenvalue computations and structured QR iterations (Bini et al., 2015).
6. Numerical Stability and Practical Trade-offs
DPLR algorithms benefit from numerical stability when the diagonal is well-conditioned and the low-rank update does not dominate the diagonal component. Inversion and determinant computations reduce to dense $r \times r$ matrix manipulations, and orthogonality of $U$/$V$ can be maintained via QR factorizations (Chandrasekaran et al., 2018). Quasiseparable Hessenberg reduction preserves backward stability in eigenvalue computations, and warm-started blockwise coordinate descent accelerates convergence across model ranks (Bini et al., 2015, Wu et al., 2018).
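As a sketch of the QR-based maintenance step (names hypothetical): if the low-rank part is stored as $L = U C U^\top$ and $U$ drifts from orthonormality after an update, one can re-factor and absorb the triangular factor into the core:

```python
import numpy as np

def reorthonormalize(U, C):
    """Restore an orthonormal factor: with U = Q R,
    L = U C U^T = Q (R C R^T) Q^T, so return Q and the updated core."""
    Q, R = np.linalg.qr(U)
    return Q, R @ C @ R.T
```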
The choice of rank $r$ and the corresponding trade-offs (accuracy, parameter count, latency) are application-specific. In recommendation systems, empirical studies indicate that ranks as low as 1–2 suffice for near-optimal predictive performance with substantial efficiency gains (Shtoff et al., 2024). In operator sketching, a modest sketch dimension is empirically effective for high-quality low-rank recovery, with regularization-parameter tuning driving the bias–variance trade-off (Fernandez et al., 28 Sep 2025).
7. Generalizations and Extensions
Extensions include block-diagonal-plus-low-rank models (for structured covariance), banded-plus-low-rank models for time series, and adaptive-rank DPLR approaches in which the rank $r$ is varied dynamically. The DPLR structure also generalizes to matrix polynomials, large-scale kernel learning, and online/streaming settings via randomized sketching and dynamic updates (Yeon et al., 18 Dec 2025, Bini et al., 2015). Theoretical links to sparse-plus-low-rank decompositions (“Robust PCA”) suggest broader applicability whenever a dominant global structure coexists with localized (diagonal or block-diagonal) effects (Zhang et al., 2017).
A plausible implication is that as data modalities and operator sizes continue to grow, the DPLR framework, offering both expressive modeling and scalable computation, will remain central to theoretical and methodological advances across statistics, machine learning, numerical linear algebra, and scientific computing.