
ALS for Low-Rank Matrix Reconstruction

Updated 8 March 2026
  • Alternating Least Squares (ALS) is an iterative algorithm used to solve low-rank matrix and tensor reconstruction problems by alternating optimizations of variable subsets.
  • It leverages quadratic, separable subproblems through factorization into L and R, achieving efficient convergence especially under Gaussian noise conditions.
  • ALS adapts to structured constraints like Hankel and PSD projections, demonstrating near-optimal empirical performance relative to Cramér–Rao bounds in undersampled settings.

Alternating Least Squares (ALS) is a fundamental iterative method for solving low-rank matrix and tensor reconstruction problems from linear or structured observations under constraints such as low-rank, subspace membership, or semidefiniteness. Its core utility arises from efficiently alternating between optimizing subsets of variables, leveraging the separability of the least-squares objective in the chosen parametrization. ALS plays a central role in a wide range of applications, including compressed sensing, matrix completion, and multidimensional data analysis.

1. Optimization Framework and Objective Formulation

ALS addresses the recovery of an unknown $n \times p$ matrix $X$ (or higher-dimensional array) of rank $r \ll \min(n, p)$ from underdetermined, noisy linear measurements,

$$y = \mathcal{A}(X) + n, \quad y \in \mathbb{C}^m, \quad m < np,$$

where $\mathcal{A}: \mathbb{C}^{n \times p} \to \mathbb{C}^m$ is a known sensing operator and $n \sim \mathcal{N}(0, C)$. Equivalently, $y = A\,\mathrm{vec}(X) + n$ for some $A \in \mathbb{C}^{m \times np}$.

Under Gaussian noise, the maximum-likelihood estimator is

$$\hat{X} = \arg\min_{X:\ \operatorname{rank}(X) = r} \| C^{-1/2}[y - \mathcal{A}(X)] \|_2^2,$$

and for white noise ($C = \sigma^2 I$) this reduces to the least-squares cost

$$\hat{X} = \arg\min_{X \in \mathcal{X}_r} J(X), \quad J(X) = \| y - A\,\mathrm{vec}(X) \|_2^2.$$

Enforcing the rank constraint via the factorization $X = LR$, $L \in \mathbb{C}^{n \times r}$, $R \in \mathbb{C}^{r \times p}$, yields the objective

$$J(L, R) = \| y - A\,(I_p \otimes L)\,\mathrm{vec}(R) \|_2^2 = \| y - A\,(R^T \otimes I_n)\,\mathrm{vec}(L) \|_2^2.$$

This structure is especially advantageous for low-rank problems, as each subproblem is quadratic in $L$ or $R$ individually.
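The two equivalent forms of the objective rest on the standard Kronecker identities $\mathrm{vec}(LR) = (I_p \otimes L)\,\mathrm{vec}(R) = (R^T \otimes I_n)\,\mathrm{vec}(L)$, with column-major vectorization. A quick numerical check (an illustrative sketch, not from the paper):

```python
import numpy as np

# Verify vec(L R) = (I_p ⊗ L) vec(R) = (R^T ⊗ I_n) vec(L)
# for random factors, using column-major (Fortran-order) vec.
rng = np.random.default_rng(0)
n, p, r = 5, 4, 2
L = rng.standard_normal((n, r))
R = rng.standard_normal((r, p))

vec = lambda M: M.flatten(order="F")   # column-major vectorization

lhs = vec(L @ R)
via_R = np.kron(np.eye(p), L) @ vec(R)
via_L = np.kron(R.T, np.eye(n)) @ vec(L)

assert np.allclose(lhs, via_R) and np.allclose(lhs, via_L)
```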

2. Alternating Minimization Algorithm

ALS exploits the fact that $J(L, R)$ is quadratic and separable in $L$ and $R$:

  • Update $R$ with $L$ fixed:

$$\hat{R} = \operatorname{mat}_{r, p}\{ [A\,(I_p \otimes L)]^{\dagger}\,y \},$$

i.e., $\mathrm{vec}(\hat{R}) = [A\,(I_p \otimes L)]^{\dagger}\,y$.

  • Update $L$ with $R$ fixed:

$$\hat{L} = \operatorname{mat}_{n, r}\{ [A\,(R^T \otimes I_n)]^{\dagger}\,y \},$$

i.e., $\mathrm{vec}(\hat{L}) = [A\,(R^T \otimes I_n)]^{\dagger}\,y$.

Here, ${}^\dagger$ denotes the Moore–Penrose pseudoinverse and $\operatorname{mat}_{a,b}$ reshapes a vector into an $a \times b$ matrix. The linear systems involved are of manageable size when $r$ is small, making ALS computationally efficient for moderate $n$, $p$, and $r$ (Zachariah et al., 2012).

Algorithmic Implementation (Pseudocode)

Initialization:

  • Form $Z = \operatorname{mat}_{n, p}(A^* y)$,
  • Compute the top-$r$ SVD: $[U_0, \Sigma_0, V_0] = \mathtt{svd\_trunc}(Z, r)$,
  • Set $L \gets U_0 \Sigma_0$.

Main loop:

  • Repeat until $J(L, R)$ no longer decreases:

    1. $R \gets \operatorname{mat}_{r, p}\{ [A\,(I_p \otimes L)]^\dagger y \}$
    2. (If structure: $\tilde{X} = L R$, $\bar{X} = \mathcal{P}(\tilde{X})$, $R \gets L^\dagger \bar{X}$)
    3. $L \gets \operatorname{mat}_{n, r}\{ [A\,(R^T \otimes I_n)]^\dagger y \}$
    4. (If structure: $\tilde{X} = L R$, $\bar{X} = \mathcal{P}(\tilde{X})$, $L \gets \bar{X} R^\dagger$)
  • Output $\hat{X} = L R$.
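The pseudocode above can be sketched directly in NumPy. This is an illustrative unstructured version (no projection step), not the authors' reference code; the helper names (`als_lowrank`, `vec`, `mat`) are mine:

```python
import numpy as np

def als_lowrank(y, A, n, p, r, max_iter=200, tol=1e-12):
    """ALS sketch for min_{rank(X)=r} ||y - A vec(X)||^2, column-major vec."""
    vec = lambda M: M.flatten(order="F")
    mat = lambda v, a, b: v.reshape((a, b), order="F")

    # Spectral initialization: top-r SVD of mat_{n,p}(A^* y).
    U, s, Vt = np.linalg.svd(mat(A.conj().T @ y, n, p), full_matrices=False)
    L = U[:, :r] * s[:r]

    J_prev = np.inf
    for _ in range(max_iter):
        # R-update: linear least squares with L fixed.
        R = mat(np.linalg.pinv(A @ np.kron(np.eye(p), L)) @ y, r, p)
        # L-update: linear least squares with R fixed.
        L = mat(np.linalg.pinv(A @ np.kron(R.T, np.eye(n))) @ y, n, r)
        J = np.linalg.norm(y - A @ vec(L @ R)) ** 2
        if J_prev - J < tol:        # stop when J no longer decreases
            break
        J_prev = J
    return L @ R

# Noiseless sanity check: m < n p Gaussian measurements of a rank-r matrix.
rng = np.random.default_rng(0)
n, p, r, m = 8, 6, 2, 44
X_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))
A = rng.standard_normal((m, n * p))
y = A @ X_true.flatten(order="F")
X_hat = als_lowrank(y, A, n, p, r)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
```

In this noiseless, well-oversampled run ($m = 44$ against $r(n + p - r) = 24$ degrees of freedom) the relative error should become negligible; with noise, one iterates until the objective stops decreasing, as in the pseudocode.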

3. Incorporation of Structural Constraints

ALS is adaptable to situations where $X$ is known a priori to belong to a linear subspace $\mathcal{X}_S$ (e.g., Hankel, Toeplitz) or to the positive semidefinite cone $\mathcal{X}_+$.

  • Linear Structure: After forming $\tilde{X} = L R$, project onto the subspace parameterized by $S \in \mathbb{C}^{np \times q}$:

$$\hat{\theta} = S^\dagger\,\mathrm{vec}(\tilde{X}), \quad \mathrm{vec}(\bar{X}) = S \hat{\theta}.$$

Then re-solve for $R$ (or $L$) to minimize $\| L R - \bar{X} \|_F^2$ via the closed-form update $R \gets L^\dagger \bar{X}$ (or $L \gets \bar{X} R^\dagger$).
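As a concrete instance, an $n \times p$ Hankel constraint has $q = n + p - 1$ parameters, one per anti-diagonal. A sketch of the projection and refit steps under that choice of $S$ (the helper names are mine):

```python
import numpy as np

def hankel_basis(n, p):
    """Columns of S span vec's of n x p Hankel matrices, where entry
    (i, j) depends only on i + j; q = n + p - 1 parameters."""
    S = np.zeros((n * p, n + p - 1))
    for j in range(p):
        for i in range(n):
            S[i + n * j, i + j] = 1.0   # column-major vec index
    return S

def project_linear_structure(L, R, S):
    """theta = S^+ vec(X~), vec(X_bar) = S theta, then R <- L^+ X_bar."""
    n, p = L.shape[0], R.shape[1]
    theta = np.linalg.pinv(S) @ (L @ R).flatten(order="F")
    X_bar = (S @ theta).reshape(n, p, order="F")
    R_new = np.linalg.pinv(L) @ X_bar   # argmin_R ||L R - X_bar||_F
    return X_bar, R_new

rng = np.random.default_rng(0)
n, p, r = 4, 3, 2
L, R = rng.standard_normal((n, r)), rng.standard_normal((r, p))
X_bar, R_new = project_linear_structure(L, R, hankel_basis(n, p))
# X_bar has constant anti-diagonals: the Frobenius-nearest Hankel matrix.
```

Because this $S$ has orthogonal columns, $S^\dagger \mathrm{vec}(\tilde{X})$ simply averages each anti-diagonal of $\tilde{X}$.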

  • Positive Semidefiniteness: Symmetrize and project using the spectral decomposition,

$$(\tilde{X} + \tilde{X}^*)/2 = V \Lambda V^*, \quad \bar{X} = V_r \Lambda_r V_r^*,$$

with $V_r$, $\Lambda_r$ the eigenvectors and top $r$ eigenvalues. Update $R$ and $L$ (as above) against $\bar{X}$ (Zachariah et al., 2012).
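The PSD step can be sketched as follows (illustrative; as an extra safeguard beyond the formula above, negative eigenvalues among the top $r$ are clipped to zero so that $\bar{X}$ is guaranteed PSD):

```python
import numpy as np

def psd_project(X_tilde, r):
    """Symmetrize X~, then keep its top-r (nonnegative) eigenpairs."""
    sym = (X_tilde + X_tilde.conj().T) / 2
    w, V = np.linalg.eigh(sym)            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:r]         # indices of the top-r eigenvalues
    w_top = np.clip(w[idx], 0.0, None)    # enforce nonnegativity
    return (V[:, idx] * w_top) @ V[:, idx].conj().T

rng = np.random.default_rng(0)
n, r = 6, 2
X_bar = psd_project(rng.standard_normal((n, n)), r)
# X_bar is symmetric, PSD, and of rank at most r.
```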

Structural steps add projection costs, $O(npq)$ for linear subspaces and $O(n^3)$ for PSD projections, while preserving computational feasibility at moderate scale.

4. Convergence Properties and Computational Complexity

Each ALS iteration monotonically nonincreases the objective $J(L, R)$, ensuring convergence to a stationary point (which may be a local minimum rather than the global optimum). The per-iteration cost is dominated by solving two linear least-squares problems of sizes $m \times rp$ and $m \times nr$:

  • Pseudoinverse computation for $R$: $O(m(rp)^2 + (rp)^3)$,
  • Pseudoinverse computation for $L$: $O(m(nr)^2 + (nr)^3)$,

plus, if needed, $O(npq)$ for subspace projection or $O(n^3)$ for PSD projection (Zachariah et al., 2012).

ALS thus remains tractable for moderate rank $r$, even in the undersampled regime $m \ll np$.

5. Empirical Performance Relative to Cramér–Rao Bounds

Extensive simulations for $n = p = 100$, true rank $r = 3$, and varying sample complexity $m = \lceil \rho np \rceil$ with $\rho \in (0, 1]$ demonstrate:

  • Unstructured ALS performs within $\sim 0$ dB of the unstructured Cramér–Rao bound (CRB) across signal-to-measurement-noise ratios (SMNRs).
  • Structured (Hankel, PSD) ALS yields significant gains, e.g., $\sim 2.75$ dB (Hankel) and $0.77$ dB (PSD) improvements at SMNR $= 10$ dB.
  • Sample complexity: for $\rho \gtrsim 0.1$, ALS approaches the CRB; for $\rho \lesssim 0.1$, the unstructured CRB itself becomes loose.

This confirms that ALS is effective for general undersampled matrix reconstruction, with a priori structure markedly narrowing the gap to the corresponding CRB (Zachariah et al., 2012).

6. Extensions and Practical Considerations

  • Initialization: SVD-based spectral methods provide effective initializations that accelerate convergence.
  • Applicability: The ALS paradigm generalizes directly to tensor decompositions (CP, Tucker, etc.) and can be adapted to handle more general nonconvex regularization or missing-data formulations.
  • Limitation: Convergence is only to a stationary point; global optimality is not guaranteed except in special cases (e.g., certain rank-one tensor approximations or highly overdetermined settings).
  • Structural constraints: The insertion of projection steps allows ALS to exploit domain knowledge (e.g., harmonic or positive semidefinite structure) without compromising the alternating minimization efficiency.

7. Summary Table: ALS Methodology for Low-Rank Matrix Recovery

| Aspect | Description | Complexity |
| --- | --- | --- |
| Objective | $\min_{X:\ \operatorname{rank} X = r} \lVert y - A\,\mathrm{vec}(X) \rVert_2^2$ (white noise, rank constraint) | $O(mnp)$ (matrix size) |
| Parametrization | $X = LR$, $L \in \mathbb{C}^{n \times r}$, $R \in \mathbb{C}^{r \times p}$ | $O(nr + rp)$ (parameter dims) |
| Update for $R$ | $\hat{R} = \operatorname{mat}_{r,p}\{ [A(I_p \otimes L)]^\dagger y \}$ | $O(m(rp)^2 + (rp)^3)$ |
| Update for $L$ | $\hat{L} = \operatorname{mat}_{n,r}\{ [A(R^T \otimes I_n)]^\dagger y \}$ | $O(m(nr)^2 + (nr)^3)$ |
| Structure projection (linear) | Project $\tilde{X}$ onto the range of $S$; recompute $L$, $R$ by closed-form Frobenius-norm least squares | $O(npq)$ |
| Structure projection (PSD) | Symmetrize and project $\tilde{X}$ onto its leading-$r$ eigenspace; update $L$, $R$ by least squares against $\bar{X}$ | $O(n^3)$ |

ALS thus provides a robust, extensible, and computationally efficient framework for low-rank matrix estimation in both unstructured and structured scenarios, with predictable convergence properties and favorable empirical performance against fundamental information-theoretic limits (Zachariah et al., 2012).
