
ALS for Low-Rank Matrix Reconstruction

Updated 8 March 2026
  • Alternating Least Squares (ALS) is an iterative algorithm used to solve low-rank matrix and tensor reconstruction problems by alternating optimizations of variable subsets.
  • It leverages quadratic, separable subproblems through factorization into L and R, achieving efficient convergence especially under Gaussian noise conditions.
  • ALS adapts to structured constraints like Hankel and PSD projections, demonstrating near-optimal empirical performance relative to Cramér–Rao bounds in undersampled settings.

Alternating Least Squares (ALS) is a fundamental iterative method for solving low-rank matrix and tensor reconstruction problems from linear or structured observations under constraints such as low-rank, subspace membership, or semidefiniteness. Its core utility arises from efficiently alternating between optimizing subsets of variables, leveraging the separability of the least-squares objective in the chosen parametrization. ALS plays a central role in a wide range of applications, including compressed sensing, matrix completion, and multidimensional data analysis.

1. Optimization Framework and Objective Formulation

ALS addresses the recovery of an unknown $n \times p$ matrix $X$ (or higher-dimensional array) of rank $r \ll \min(n, p)$ from underdetermined, noisy linear measurements,

$$y = \mathcal{A}(X) + n, \quad y \in \mathbb{C}^m, \quad m < np,$$

where $\mathcal{A}: \mathbb{C}^{n \times p} \to \mathbb{C}^m$ is a known sensing operator and $n \sim \mathcal{N}(0, C)$. Equivalently, $y = A\,\mathrm{vec}(X) + n$ for some $A \in \mathbb{C}^{m \times np}$.

Under Gaussian noise, the maximum-likelihood estimator is

$$\hat{X} = \arg\min_{X:\ \operatorname{rank}(X) = r} \| C^{-1/2}[y - \mathcal{A}(X)] \|_2^2,$$

and for white noise ($C = \sigma^2 I$) this reduces to the least-squares cost

$$\hat{X} = \arg\min_{X \in \mathcal{X}_r} J(X), \quad J(X) = \| y - A\,\mathrm{vec}(X) \|_2^2.$$

Enforcing the rank constraint via the factorization $X = LR$, $L \in \mathbb{C}^{n \times r}$, $R \in \mathbb{C}^{r \times p}$, yields the objective

$$J(L, R) = \| y - A\,(I_p \otimes L)\,\mathrm{vec}(R) \|_2^2 = \| y - A\,(R^T \otimes I_n)\,\mathrm{vec}(L) \|_2^2.$$

This structure is especially advantageous for low-rank problems, as each subproblem is quadratic in $L$ or $R$ individually.
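The two equivalent forms of the objective rest on the standard Kronecker identities $\mathrm{vec}(LR) = (I_p \otimes L)\,\mathrm{vec}(R) = (R^T \otimes I_n)\,\mathrm{vec}(L)$, with column-major vectorization. A quick numerical check (an illustrative sketch, not from the paper):

```python
import numpy as np

# Verify vec(L R) = (I_p ⊗ L) vec(R) = (R^T ⊗ I_n) vec(L)
# for random factors, using column-major (Fortran-order) vec.
rng = np.random.default_rng(0)
n, p, r = 5, 4, 2
L = rng.standard_normal((n, r))
R = rng.standard_normal((r, p))

vec = lambda M: M.flatten(order="F")   # column-major vectorization

lhs = vec(L @ R)
via_R = np.kron(np.eye(p), L) @ vec(R)
via_L = np.kron(R.T, np.eye(n)) @ vec(L)

assert np.allclose(lhs, via_R) and np.allclose(lhs, via_L)
```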

2. Alternating Minimization Algorithm

ALS exploits the fact that $J(L, R)$ is quadratic and separable in $L$ and $R$:

  • Update $R$ with $L$ fixed:

$$\hat{R} = \operatorname{mat}_{r, p}\{ [A\,(I_p \otimes L)]^{\dagger}\,y \},$$

i.e., $\mathrm{vec}(\hat{R}) = [A\,(I_p \otimes L)]^{\dagger}\,y$.

  • Update $L$ with $R$ fixed:

$$\hat{L} = \operatorname{mat}_{n, r}\{ [A\,(R^T \otimes I_n)]^{\dagger}\,y \},$$

i.e., $\mathrm{vec}(\hat{L}) = [A\,(R^T \otimes I_n)]^{\dagger}\,y$.

Here, ${}^\dagger$ denotes the Moore–Penrose pseudoinverse and $\operatorname{mat}_{a,b}$ reshapes a vector into an $a \times b$ matrix. The linear systems involved are of manageable size when $r$ is small, making ALS computationally efficient for moderate $n$, $p$, and $r$ (Zachariah et al., 2012).

Algorithmic Implementation (Pseudocode)

Initialization:

  • Form $Z = \operatorname{mat}_{n, p}(A^* y)$,
  • Compute the top-$r$ SVD: $[U_0, \Sigma_0, V_0] = \mathtt{svd\_trunc}(Z, r)$,
  • Set $L \gets U_0 \Sigma_0$.

Main loop:

  • Repeat until $J(L, R)$ no longer decreases:

    1. $R \gets \operatorname{mat}_{r, p}\{ [A\,(I_p \otimes L)]^\dagger y \}$
    2. (If structure: $\tilde{X} = L R$, $\bar{X} = \mathcal{P}(\tilde{X})$, $R \gets L^\dagger \bar{X}$)
    3. $L \gets \operatorname{mat}_{n, r}\{ [A\,(R^T \otimes I_n)]^\dagger y \}$
    4. (If structure: $\tilde{X} = L R$, $\bar{X} = \mathcal{P}(\tilde{X})$, $L \gets \bar{X} R^\dagger$)
  • Output $\hat{X} = L R$.
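The pseudocode above can be sketched directly in NumPy. This is an illustrative unstructured version (no projection step), not the authors' reference code; the helper names (`als_lowrank`, `vec`, `mat`) are mine:

```python
import numpy as np

def als_lowrank(y, A, n, p, r, max_iter=200, tol=1e-12):
    """ALS sketch for min_{rank(X)=r} ||y - A vec(X)||^2, column-major vec."""
    vec = lambda M: M.flatten(order="F")
    mat = lambda v, a, b: v.reshape((a, b), order="F")

    # Spectral initialization: top-r SVD of mat_{n,p}(A^* y).
    U, s, Vt = np.linalg.svd(mat(A.conj().T @ y, n, p), full_matrices=False)
    L = U[:, :r] * s[:r]

    J_prev = np.inf
    for _ in range(max_iter):
        # R-update: linear least squares with L fixed.
        R = mat(np.linalg.pinv(A @ np.kron(np.eye(p), L)) @ y, r, p)
        # L-update: linear least squares with R fixed.
        L = mat(np.linalg.pinv(A @ np.kron(R.T, np.eye(n))) @ y, n, r)
        J = np.linalg.norm(y - A @ vec(L @ R)) ** 2
        if J_prev - J < tol:        # stop when J no longer decreases
            break
        J_prev = J
    return L @ R

# Noiseless sanity check: m < n p Gaussian measurements of a rank-r matrix.
rng = np.random.default_rng(0)
n, p, r, m = 8, 6, 2, 44
X_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))
A = rng.standard_normal((m, n * p))
y = A @ X_true.flatten(order="F")
X_hat = als_lowrank(y, A, n, p, r)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
```

In this noiseless, well-oversampled run ($m = 44$ against $r(n + p - r) = 24$ degrees of freedom) the relative error should become negligible; with noise, one iterates until the objective stops decreasing, as in the pseudocode.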

3. Incorporation of Structural Constraints

ALS is adaptable to situations where $X$ is known a priori to belong to a linear subspace $\mathcal{X}_S$ (e.g., Hankel, Toeplitz) or to the positive semidefinite cone $\mathcal{X}_+$.

  • Linear Structure: After forming $\tilde{X} = L R$, project onto the subspace parameterized by $S \in \mathbb{C}^{np \times q}$:

$$\hat{\theta} = S^\dagger\,\mathrm{vec}(\tilde{X}), \quad \mathrm{vec}(\bar{X}) = S \hat{\theta}.$$

Then re-solve for $R$ (or $L$) to minimize $\| L R - \bar{X} \|_F^2$ via the closed-form update $R \gets L^\dagger \bar{X}$ (or $L \gets \bar{X} R^\dagger$).
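As a concrete instance, an $n \times p$ Hankel constraint has $q = n + p - 1$ parameters, one per anti-diagonal. A sketch of the projection and refit steps under that choice of $S$ (the helper names are mine):

```python
import numpy as np

def hankel_basis(n, p):
    """Columns of S span vec's of n x p Hankel matrices, where entry
    (i, j) depends only on i + j; q = n + p - 1 parameters."""
    S = np.zeros((n * p, n + p - 1))
    for j in range(p):
        for i in range(n):
            S[i + n * j, i + j] = 1.0   # column-major vec index
    return S

def project_linear_structure(L, R, S):
    """theta = S^+ vec(X~), vec(X_bar) = S theta, then R <- L^+ X_bar."""
    n, p = L.shape[0], R.shape[1]
    theta = np.linalg.pinv(S) @ (L @ R).flatten(order="F")
    X_bar = (S @ theta).reshape(n, p, order="F")
    R_new = np.linalg.pinv(L) @ X_bar   # argmin_R ||L R - X_bar||_F
    return X_bar, R_new

rng = np.random.default_rng(0)
n, p, r = 4, 3, 2
L, R = rng.standard_normal((n, r)), rng.standard_normal((r, p))
X_bar, R_new = project_linear_structure(L, R, hankel_basis(n, p))
# X_bar has constant anti-diagonals: the Frobenius-nearest Hankel matrix.
```

Because this $S$ has orthogonal columns, $S^\dagger \mathrm{vec}(\tilde{X})$ simply averages each anti-diagonal of $\tilde{X}$.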

  • Positive Semidefiniteness: Symmetrize and project using the spectral decomposition,

$$(\tilde{X} + \tilde{X}^*)/2 = V \Lambda V^*, \quad \bar{X} = V_r \Lambda_r V_r^*,$$

with $V_r$, $\Lambda_r$ the eigenvectors and top $r$ eigenvalues. Update $R$ and $L$ (as above) against $\bar{X}$ (Zachariah et al., 2012).
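The PSD step can be sketched as follows (illustrative; as an extra safeguard beyond the formula above, negative eigenvalues among the top $r$ are clipped to zero so that $\bar{X}$ is guaranteed PSD):

```python
import numpy as np

def psd_project(X_tilde, r):
    """Symmetrize X~, then keep its top-r (nonnegative) eigenpairs."""
    sym = (X_tilde + X_tilde.conj().T) / 2
    w, V = np.linalg.eigh(sym)            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:r]         # indices of the top-r eigenvalues
    w_top = np.clip(w[idx], 0.0, None)    # enforce nonnegativity
    return (V[:, idx] * w_top) @ V[:, idx].conj().T

rng = np.random.default_rng(0)
n, r = 6, 2
X_bar = psd_project(rng.standard_normal((n, n)), r)
# X_bar is symmetric, PSD, and of rank at most r.
```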

Structural steps add projection costs, $O(npq)$ for linear subspaces and $O(n^3)$ for PSD projections, while preserving computational feasibility at moderate scale.

4. Convergence Properties and Computational Complexity

Each ALS iteration monotonically nonincreases the objective $J(L, R)$, ensuring convergence to a stationary point (which may be a local minimum rather than the global optimum). The per-iteration cost is dominated by solving two linear least-squares problems of sizes $m \times rp$ and $m \times nr$:

  • Pseudoinverse computation for $R$: $O(m(rp)^2 + (rp)^3)$,
  • Pseudoinverse computation for $L$: $O(m(nr)^2 + (nr)^3)$,

plus, if needed, $O(npq)$ for subspace projection or $O(n^3)$ for PSD projection (Zachariah et al., 2012).

ALS thus remains tractable for moderate rank $r$, even in the undersampled regime $m \ll np$.

5. Empirical Performance Relative to Cramér–Rao Bounds

Extensive simulations for $n = p = 100$, true rank $r = 3$, and varying sample complexity $m = \lceil \rho np \rceil$ with $\rho \in (0, 1]$ demonstrate:

  • Unstructured ALS performs within $\sim 0$ dB of the unstructured Cramér–Rao bound (CRB) across signal-to-measurement-noise ratios (SMNRs).
  • Structured (Hankel, PSD) ALS yields significant gains, e.g., $\sim 2.75$ dB (Hankel) and $0.77$ dB (PSD) improvements at SMNR $= 10$ dB.
  • Sample complexity: for $\rho \gtrsim 0.1$, ALS approaches the CRB; for $\rho \lesssim 0.1$, the unstructured CRB itself becomes loose.

This confirms that ALS is effective for general undersampled matrix reconstruction, with a priori structure markedly narrowing the gap to the corresponding CRB (Zachariah et al., 2012).

6. Extensions and Practical Considerations

  • Initialization: SVD-based spectral methods provide effective initializations that accelerate convergence.
  • Applicability: The ALS paradigm generalizes directly to tensor decompositions (CP, Tucker, etc.) and can be adapted to handle more general nonconvex regularization or missing-data formulations.
  • Limitation: Convergence is only to a stationary point; global optimality is not guaranteed except in special cases (e.g., certain rank-one tensor approximations or highly overdetermined settings).
  • Structural constraints: The insertion of projection steps allows ALS to exploit domain knowledge (e.g., harmonic or positive semidefinite structure) without compromising the alternating minimization efficiency.

7. Summary Table: ALS Methodology for Low-Rank Matrix Recovery

| Aspect | Description | Complexity |
| --- | --- | --- |
| Objective | $\min_{X:\ \operatorname{rank} X = r} \lVert y - A\,\mathrm{vec}(X) \rVert_2^2$ (white noise, rank constraint) | $O(mnp)$ (matrix size) |
| Parametrization | $X = LR$, $L \in \mathbb{C}^{n \times r}$, $R \in \mathbb{C}^{r \times p}$ | $O(nr + rp)$ (parameter dims) |
| Update for $R$ | $\hat{R} = \operatorname{mat}_{r,p}\{ [A(I_p \otimes L)]^\dagger y \}$ | $O(m(rp)^2 + (rp)^3)$ |
| Update for $L$ | $\hat{L} = \operatorname{mat}_{n,r}\{ [A(R^T \otimes I_n)]^\dagger y \}$ | $O(m(nr)^2 + (nr)^3)$ |
| Structure projection (linear) | Project $\tilde{X}$ onto the range of $S$; recompute $L$, $R$ by closed-form Frobenius-norm least squares | $O(npq)$ |
| Structure projection (PSD) | Symmetrize and project $\tilde{X}$ onto its leading-$r$ eigenspace; update $L$, $R$ by least squares against $\bar{X}$ | $O(n^3)$ |

ALS thus provides a robust, extensible, and computationally efficient framework for low-rank matrix estimation in both unstructured and structured scenarios, with predictable convergence properties and favorable empirical performance against fundamental information-theoretic limits (Zachariah et al., 2012).
