Diagonal-Plus-Low-Rank (DPLR) Updates
- DPLR updates are a matrix structure defined as $D + UV^\top$, uniting diagonal and low-rank components to facilitate rapid inversion using identities like Woodbury.
- They are applied in high-dimensional covariance estimation, optimization, and online learning, using efficient algorithms such as block coordinate descent and Krylov methods.
- These updates support scalable matrix factorizations and filtering (e.g., Kalman filtering) by preserving structure and reducing computational cost in dynamic systems.
A diagonal-plus-low-rank (DPLR) structure refers to matrices or updates of the form $A = D + UV^\top$, where $D \in \mathbb{R}^{n \times n}$ is a diagonal matrix and $U, V \in \mathbb{R}^{n \times k}$ are rectangular matrices with a small number of columns ($k \ll n$), resulting in a low-rank modification. This decomposition exploits the complementary properties of diagonal and low-rank components, unifying computational efficiency with model flexibility. DPLR updates appear across covariance estimation, matrix function evaluation, differential equations, efficient matrix factorizations, optimization, and large-scale learning settings. Their algorithmic utility is enabled by fast closed-form manipulation, efficient low-storage updates, and structural preservation through key mathematical identities.
1. Structural Formulation and Theoretical Properties
The canonical DPLR form is $A = D + UV^\top$, with $D \in \mathbb{R}^{n \times n}$ diagonal and $U, V \in \mathbb{R}^{n \times k}$, typically with $k \ll n$ ensuring low rank. For symmetric or positive semidefinite settings, $V = U$ yields $A = D + UU^\top$. The structure is preserved under addition of rank-one modifications, through identities such as $D + UU^\top + ww^\top = D + [U \;\, w][U \;\, w]^\top$, or by appropriate downdating formulas (March et al., 2020). Inverse computations leverage the Woodbury identity:

$$(D + UV^\top)^{-1} = D^{-1} - D^{-1}U\,(I_k + V^\top D^{-1}U)^{-1}V^\top D^{-1},$$

enabling $O(nk^2)$ cost for inversion and matrix-vector products (Bonnabel et al., 2024). In high-dimensional positive definite cases, DPLR matrices are always full-rank whenever $D \succ 0$, and support efficient updates of the LDL factorization and related decompositions (March et al., 2020).
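To make the Woodbury mechanics concrete, here is a minimal numpy sketch; the function name dplr_solve and all shapes are illustrative rather than drawn from any cited implementation.

```python
import numpy as np

def dplr_solve(d, U, V, b):
    """Solve (diag(d) + U V^T) x = b via the Woodbury identity.

    Cost is O(n k^2 + k^3) rather than the O(n^3) of a dense solve.
    """
    Dinv_b = b / d                      # D^{-1} b, O(n)
    Dinv_U = U / d[:, None]             # D^{-1} U, O(nk)
    k = U.shape[1]
    S = np.eye(k) + V.T @ Dinv_U        # small k x k capacitance matrix
    y = np.linalg.solve(S, V.T @ Dinv_b)
    return Dinv_b - Dinv_U @ y

# Quick check against a dense solve.
rng = np.random.default_rng(0)
n, k = 500, 5
d = 1.0 + rng.random(n)                 # positive diagonal
U, V = rng.standard_normal((n, k)), rng.standard_normal((n, k))
b = rng.standard_normal(n)
x = dplr_solve(d, U, V, b)
assert np.allclose((np.diag(d) + U @ V.T) @ x, b)
```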
2. Algorithms for DPLR Updates and Optimization
DPLR structures admit specialized algorithms in both static and streaming contexts:
- Covariance/Precision Matrix Estimation: The DPLR estimator for the precision matrix $\Omega = \Sigma^{-1}$ constrains $\Omega = D + LL^\top$ with $L \in \mathbb{R}^{p \times r}$ and $D$ diagonal, leading to the estimator via

$$\hat{\Omega} = \arg\max_{\Omega = D + LL^\top} \; \log\det \Omega - \operatorname{tr}(S\Omega),$$

where $S$ is the sample covariance, solved by blockwise coordinate-descent updates: fixing $D$, update $L$ via eigen-decomposition; fixing $L$, update $D$ by solving a log-det SDP. This achieves statistical consistency at minimax rates and supports both fixed-rank and penalized-rank selection strategies (Wu et al., 2018). A sketch of the eigen-decomposition $L$-step appears after this list.
- Streaming Covariance Updates and LDL Factorization: Sequential rank-one updates or downdates to a covariance matrix $\Sigma$ employ identities of the form

$$\Sigma' = \Sigma \pm \alpha\, xx^\top, \qquad LDL^\top \pm \alpha\, xx^\top = L'D'L'^\top.$$

Together with direct in-place LDL factor modification algorithms (a rank-one update sketch follows this list), these enable $O(n^2)$ update cost, a substantial reduction over the naïve $O(n^3)$ full recomputation (March et al., 2020). This facilitates efficient management of covariance matrices in Bayesian online and sliding-window contexts.
- Dynamics on DPLR Manifolds: To constrain a matrix ODE $\dot{P} = F(P)$ to the DPLR manifold $\mathcal{M} = \{\operatorname{diag}(d) + UU^\top\}$, the time derivative is projected onto the tangent space:

$$\dot{P} = \Pi_{T_P \mathcal{M}}\bigl(F(P)\bigr).$$

An efficient orthogonal projection yields vector field updates, enabling fast Riccati-type ODEs, Wasserstein flow, and Kalman filtering for high-dimensional systems, all at linear or near-linear storage and computation (Bonnabel et al., 2024).
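Because the exact iteration of (Wu et al., 2018) is not reproduced in this summary, the following is a minimal sketch of the eigen-decomposition $L$-step for the displayed log-likelihood objective, using the whitened reformulation $\Omega = D^{1/2}(I + MM^\top)D^{1/2}$ with $M = D^{-1/2}L$; the name update_L and all shapes are illustrative.

```python
import numpy as np

def update_L(S, d, r):
    """L-step with the diagonal d of D held fixed: maximize
    logdet(D + L L^T) - tr(S (D + L L^T)) over L of rank <= r.

    After whitening, the problem reduces to the eigen-decomposition of
    S_tilde = D^{1/2} S D^{1/2}: each eigendirection with eigenvalue mu < 1
    receives weight lambda = 1/mu - 1; directions with mu >= 1 are dropped.
    """
    sqrt_d = np.sqrt(d)
    S_t = sqrt_d[:, None] * S * sqrt_d[None, :]   # D^{1/2} S D^{1/2}
    mu, Q = np.linalg.eigh(S_t)                   # ascending eigenvalues
    mu_r, Q_r = mu[:r], Q[:, :r]                  # r smallest eigenpairs
    lam = np.maximum(1.0 / np.maximum(mu_r, 1e-12) - 1.0, 0.0)
    M = Q_r * np.sqrt(lam)[None, :]               # whitened low-rank factor
    return sqrt_d[:, None] * M                    # L = D^{1/2} M

# Illustrative usage on a sample covariance.
rng = np.random.default_rng(0)
p, n_samples, r = 30, 200, 3
X = rng.standard_normal((n_samples, p))
S = X.T @ X / n_samples
L = update_L(S, d=np.ones(p), r=r)
```

Alternating this step with the $D$-step (a small log-det program over the diagonal entries) gives the blockwise scheme described above.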
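For the streaming LDL updates, the sketch below implements the classical in-place rank-one modification of an $LDL^\top$ factorization (the C1 method of Gill, Golub, Murray and Saunders; the same family of formulas underlies March et al., 2020) at $O(n^2)$ cost. The verification harness is illustrative.

```python
import numpy as np

def ldl_rank_one_update(L, d, v, alpha=1.0):
    """In-place update of L D L^T to L D L^T + alpha * v v^T.

    L is unit lower-triangular, d holds the diagonal of D; alpha < 0 gives
    a downdate, in which case the caller must ensure definiteness survives.
    """
    n = len(d)
    w = v.astype(float).copy()
    a = float(alpha)
    for j in range(n):
        p = w[j]
        d_new = d[j] + a * p * p
        gamma = a * p / d_new
        a = a * d[j] / d_new               # effective weight for later columns
        d[j] = d_new
        w[j + 1:] -= p * L[j + 1:, j]      # eliminate component j from w
        L[j + 1:, j] += gamma * w[j + 1:]  # correct column j of L

# Verify against refactorization from scratch.
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)
v = rng.standard_normal(n)
C = np.linalg.cholesky(A)                  # A = C C^T
d = np.diag(C) ** 2                        # D = diag(C)^2
L = C / np.diag(C)[None, :]                # unit lower-triangular factor
ldl_rank_one_update(L, d, v)
assert np.allclose(L @ np.diag(d) @ L.T, A + np.outer(v, v))
```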
3. Efficient Matrix Function Updates in DPLR Contexts
When a matrix function $f(A)$ must be evaluated after a DPLR modification, specialized Krylov subspace methods yield low computational cost:
- Polynomial and Rational Krylov Updates: For $B = D + UV^\top$, block Krylov subspace projections provide low-rank compressed approximations with superlinear convergence:

$$f(B) - f(D) \approx Q_m X_m W_m^\top.$$

Orthonormal bases $Q_m, W_m$ are built from repeated application of $B$ and $D$ (or shifted inverses such as $(D - \sigma_j I)^{-1}$ in the rational case) to $U$ and $V$, exploiting the diagonal structure for cheap multiplications (Beckermann et al., 2017; Beckermann et al., 2020). For analytic $f$, convergence is dictated by the best polynomial or rational approximation error on the spectral domain, with exponential error decay for exponential and Markov functions. A compressed-update sketch follows this list.
- Matrix Square Roots and Inverse Roots: Updates to $(A + UU^\top)^{1/2}$ or $(A + UU^\top)^{-1/2}$ are formulated via an algebraic Riccati equation for the correction term $X = (A + UU^\top)^{1/2} - A^{1/2}$, solved by low-rank Riccati solvers. Eigenvalue decay of the correction is geometric in the number of retained columns, permitting accurate low-rank representations (Shumeli et al., 2022).
- Sylvester Equations and ParaDiag Matrix Equations: DPLR modifications in the context of discretized PDEs and all-at-once methods are handled through Sherman–Morrison–Woodbury identities and tensor-Krylov corrections, or by interpolation in the update parameter, enabling rapid solution of perturbed matrix equations and preconditioning strategies (Kressner et al., 2022).
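The sketch below is in the spirit of the polynomial Krylov update of (Beckermann et al., 2017): it compresses the standard block-triangular identity, namely that the top-right block of $f$ applied to $\begin{bmatrix} B & B-D \\ 0 & D \end{bmatrix}$ equals $f(B) - f(D)$, onto block Krylov bases. The helper names and the choice $f = \exp$ are illustrative; the cited algorithms differ in details such as basis construction and stopping criteria.

```python
import numpy as np
from scipy.linalg import expm

def block_krylov_basis(matvec, U, m):
    """Orthonormal basis of span{U, A U, ..., A^{m-1} U} via stacked QR."""
    blocks = [U]
    for _ in range(m - 1):
        blocks.append(matvec(blocks[-1]))
    Q, _ = np.linalg.qr(np.hstack(blocks))
    return Q

def expm_dplr_update(d, U, V, m):
    """Approximate expm(D + U V^T) - expm(D) for D = diag(d).

    Projects the block-triangular identity onto bases of K_m(B, U) and
    K_m(D, V); the compressed problem has size 2mk instead of 2n.
    """
    Bmv = lambda X: d[:, None] * X + U @ (V.T @ X)  # B X in O(nk) per column
    Dmv = lambda X: d[:, None] * X                  # D X, diagonal scaling
    Q = block_krylov_basis(Bmv, U, m)
    W = block_krylov_basis(Dmv, V, m)
    Bq = Q.T @ Bmv(Q)                 # compressed B
    Dw = W.T @ Dmv(W)                 # compressed D
    E = (Q.T @ U) @ (V.T @ W)         # compressed update B - D = U V^T
    p = Bq.shape[0]
    M = np.block([[Bq, E], [np.zeros((Dw.shape[0], p)), Dw]])
    X = expm(M)[:p, p:]               # top-right block of f(M)
    return Q @ X @ W.T                # low-rank approximation of the update

rng = np.random.default_rng(2)
n, k, m = 300, 2, 8
d = rng.uniform(-2.0, 0.0, n)
U = rng.standard_normal((n, k)) / np.sqrt(n)
V = rng.standard_normal((n, k)) / np.sqrt(n)
approx = expm_dplr_update(d, U, V, m)
exact = expm(np.diag(d) + U @ V.T) - np.diag(np.exp(d))
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))  # small
```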
4. Structured Matrix Factorization and Spectral Algorithms
DPLR matrices facilitate fast factorization methods by exploiting quasiseparable structure:
- Hessenberg Reduction: For $A = D + UV^\top$ with $U, V \in \mathbb{R}^{n \times k}$, the matrix is $k$-quasiseparable (each strictly lower/upper triangular block has rank $\le k$), allowing a Hessenberg reduction in $O(n^2 k)$ time. At each step, the Givens-vector (GV) representation and generator matrices are updated efficiently, and the structure is preserved through nested block updates (Bini et al., 2015). This yields backward-stable reductions and underpins fast eigenvalue solvers and determinant evaluations in polynomial eigenproblems. A numerical check of the quasiseparable rank bound follows the table below.
| Problem | Classical Cost | DPLR-aware Cost |
|---|---|---|
| Matrix inversion | $O(n^3)$ | $O(nk^2)$ |
| Hessenberg reduction | $O(n^3)$ | $O(n^2k)$ |
| Krylov function eval | $O(n^3)$ | $O(nkm + (km)^3)$ for $m$ Krylov steps |
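As a quick numpy check of the quasiseparable rank bound (sizes illustrative): the diagonal part of $D + UV^\top$ contributes nothing off the diagonal, so every strictly lower (or upper) off-diagonal block has rank at most $k$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 3
A = np.diag(rng.standard_normal(n)) \
    + rng.standard_normal((n, k)) @ rng.standard_normal((k, n))

for i in (50, 100, 150):
    block = A[i:, :i]                       # strictly lower off-diagonal block
    print(i, np.linalg.matrix_rank(block))  # rank k (= 3) at every split
```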
5. Practical Applications and Statistical Performance
DPLR updates are applied across diverse domains:
- High-dimensional Covariance Estimation: DPLR estimators outperform pure sparse or diagonal methods in Kullback-Leibler loss for models with low-rank plus diagonal structure, and yield improved Sharpe ratios in Markowitz portfolio optimization when plugged into empirical finance pipelines (Wu et al., 2018).
- Online and Streaming Learning: DPLR-based LDL-updates allow efficient, scalable computation for dynamic datasets, relevant in streaming PCA and Bayesian learning (March et al., 2020).
- Recommender Systems: DPLR approximations of large parameter matrices in field-weighted factorization machines (FwFMs) drastically reduce inference latency, with cost linear rather than quadratic in the number of fields, without sacrificing predictive accuracy. Empirical studies show DPLR surpasses parameter-pruning strategies at the same complexity budget, validated in both public benchmarks and production ad-serving systems (Shtoff et al., 2024). A sketch of the underlying computation follows this list.
- Kalman Filtering and Gaussian Variational Inference: The DPLR–projected Riccati flows produce covariance updates with linear storage, yielding state-estimation accuracy superior to pure low-rank approaches (Bonnabel et al., 2024).
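To illustrate why a DPLR interaction matrix removes the quadratic dependence on the number of fields, here is a small numpy sketch of the all-pairs field interaction sum with $R = \operatorname{diag}(r) + UV^\top$. It ignores FwFM details such as the exclusion of self-interactions, and all names and shapes are hypothetical rather than taken from (Shtoff et al., 2024).

```python
import numpy as np

def pairwise_score_dense(R, P):
    """All-pairs interaction sum_{a,b} R[a,b] <p_a, p_b>: O(F^2 d)."""
    return np.sum(R * (P @ P.T))

def pairwise_score_dplr(r, U, V, P):
    """Same quantity with R = diag(r) + U V^T: O(F k d), linear in F."""
    diag_part = np.sum(r * np.sum(P * P, axis=1))   # sum_a r_a ||p_a||^2
    low_rank = np.sum((U.T @ P) * (V.T @ P))        # tr(U^T P P^T V)
    return diag_part + low_rank

rng = np.random.default_rng(4)
F, d, k = 40, 8, 3                                   # fields, emb dim, rank
r = rng.standard_normal(F)
U, V = rng.standard_normal((F, k)), rng.standard_normal((F, k))
P = rng.standard_normal((F, d))                      # per-field embeddings
R = np.diag(r) + U @ V.T
assert np.isclose(pairwise_score_dense(R, P), pairwise_score_dplr(r, U, V, P))
```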
6. Error Analysis, Stability, and Numerical Constraints
Rigorous error and convergence theories underlie DPLR update methods:
- Spectral and Frobenius Error Bounds: Backward and forward error in DPLR-updated functions and matrix roots are controlled via residuals of the small projected systems; geometric eigenvalue decay ensures that only a modest rank is needed for accurate approximation (Shumeli et al., 2022; Beckermann et al., 2017; Beckermann et al., 2020). This decay is illustrated numerically after this list.
- Stability Analyses: All Givens-based transformations in DPLR Hessenberg reduction are stable, and small non-orthogonal corrections apply only to low-dimensional subproblems. Empirical studies show that the backward error remains small in practice and can be controlled by periodic re-orthogonalization (Bini et al., 2015). In streaming or downdate scenarios, positive definiteness is maintained as long as removing samples does not render the covariance singular (March et al., 2020).
- Rank Selection and Penalty Tuning: For statistical estimators, cross-validation and AIC-type penalties guide optimal rank choice. The coordinate-descent convergence is monotonic; in practice multiple initializations and warm-starts mitigate the inherent non-convexity of the rank constraint (Wu et al., 2018).
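A small numpy experiment illustrating the geometric singular-value decay of the square-root correction discussed above; sizes are illustrative.

```python
import numpy as np

def spd_sqrt(M):
    """Symmetric PSD square root via eigen-decomposition."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(np.maximum(w, 0.0))) @ Q.T

rng = np.random.default_rng(5)
n, k = 200, 2
A = np.diag(rng.uniform(1.0, 10.0, n))     # SPD diagonal part
U = rng.standard_normal((n, k))
X = spd_sqrt(A + U @ U.T) - spd_sqrt(A)    # exact correction, O(n^3) here
s = np.linalg.svd(X, compute_uv=False)
print(s[:12] / s[0])                       # geometric fall-off: a handful of
                                           # terms captures the correction
```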
7. Limitations and Extensions
DPLR methods rely on the assumption that the dominant structure in the data can be well-approximated by low-rank plus diagonal patterns. Performance degrades if the underlying matrix does not exhibit rapid spectral decay or if true interactions are highly non-separable by field, rank, or variable. In non-symmetric or indefinite cases, additional regularization or constraint mechanisms are required. Ongoing extensions include adaptation to factorization machines with complex interaction graphs, online and parallel numerical linear algebra for time-dependent PDEs, and deeper integration with control, system identification, and probabilistic inference.
References:
- High-dimensional covariance matrix estimation using a low-rank and diagonal decomposition (Wu et al., 2018)
- Quasiseparable Hessenberg reduction of real diagonal plus low rank matrices and applications (Bini et al., 2015)
- Low-rank plus diagonal approximations for Riccati-like matrix differential equations (Bonnabel et al., 2024)
- Efficiently updating a covariance matrix and its LDL decomposition (March et al., 2020)
- Low Rank Field-Weighted Factorization Machines for Low Latency Item Recommendation (Shtoff et al., 2024)
- Improved ParaDiag via low-rank updates and interpolation (Kressner et al., 2022)
- Low-rank updates of matrix functions (Beckermann et al., 2017)
- Low-Rank Updates of Matrix Square Roots (Shumeli et al., 2022)
- Low-rank updates of matrix functions II: Rational Krylov methods (Beckermann et al., 2020)