Inexact Orthogonalized Update

Updated 28 October 2025

Inexact orthogonalized updates are techniques in optimization that approximate expensive orthogonalization steps via iterative methods.
They employ controlled error criteria and preconditioning to balance computational cost with convergence and numerical stability.
These methods apply to coordinate descent, stochastic optimization, and manifold settings, reducing complexity in high-dimensional problems.

An inexact orthogonalized update refers to an algorithmic step in which a computationally expensive orthogonalization or projection operation (such as the exact solution of a constrained or linear subproblem, often required in classical optimization/learning updates) is approximated by an iterative or otherwise non-exact method, thereby controlling computational cost while maintaining acceptable accuracy and theoretical guarantees. This paradigm has become central in large-scale optimization, particularly where updates are required to (approximately) satisfy orthogonality, block-independence, or subspace-related geometric constraints, and exact operations lead to prohibitive complexity.

1. Principles of Inexact Updates in Optimization Frameworks

In the context of coordinate descent, stochastic optimization, and projected/quadratic methods, a typical algorithm requires solving an update direction either by exactly minimizing a quadratic surrogate or projecting onto a subspace. In the ideal setting, the update is calculated as

$T_0^{(i)}(x) = \arg\min_{t \in \mathbb{R}^{N_i}} V_i(x, t),$

where $V_i(x, t)$ encodes a local quadratic approximation and any possible separable nonsmooth terms (see (Tappenden et al., 2013)). However, exact minimization of this subproblem is infeasible or too expensive in high dimension or with complex structure. The inexact update paradigm relaxes this to

$V_i(x, T^{(i)}_\delta(x)) \leq \min\left\{ V_i(x, 0),\ \delta^{(i)} + \min_t V_i(x, t) \right\},$

with suboptimality parameter $\delta^{(i)}$ . The per-block or per-coordinate inexactness can be controlled globally via a rule such as

$\sum_{i=1}^n p_i \delta^{(i)} \leq \alpha (F(x) - F^*) + \beta,$

with nonnegative $\alpha, \beta$ .

This strategy is direct for block or coordinate descent, but the same logic governs the transition to inexact orthogonalized updates: seek to compute an update with certain geometric (e.g., orthogonality) properties up to an allowable error, measured with respect to the relevant local objective or constraint.

2. Structure and Properties of Inexact Orthogonalized Updates

Orthogonalized updates commonly arise in algorithms that impose or exploit matrix orthogonality constraints (e.g., optimization on the Stiefel manifold, k-PCA, orthogonal deep learning layers), or that aim to maintain conjugacy, independence, or block-structure across iterates. In their exact form, these updates require expensive operations such as QR or SVD factorizations, or the solution of large, dense linear subproblems.

The inexact orthogonalized update modifies this by:

Using an iterative approximate solver (such as early-stopped conjugate gradient, Krylov subspace iteration, or an iterative matrix function), possibly with enforced partial orthogonalization (e.g., by reorthogonalizing against a reduced basis or within a subspace).
Terminating the inner loop according to a criterion that guarantees sufficient proximity to the exact update, often formulated as a primal-dual gap, residual, or quadratic decrease condition (see, e.g., (Tappenden et al., 2013, Eckstein et al., 14 Mar 2025)).
Optionally, incorporating preconditioning or basis transformation to accelerate convergence and stabilize the approximation (preconditioning and orthogonality are strongly linked in Krylov and multigrid methods).

Mathematically, this yields update rules of the abstract form

$\text{update} \approx \operatorname{argmin}_{u \in \mathcal{S}} \{\mathcal{L}(x, u)\} \quad \text{subject to} \quad \|\text{error}\| \leq \eta \|\text{ideal}\|,$

where $\mathcal{S}$ encodes an orthogonality constraint or structure, and $\eta$ is a forcing term governing the inexactness.

3. Error Control Criteria and Complexity Guarantees

A central contribution of the inexact update paradigm is the explicit linking of error control to convergence rates and complexity. Theoretical results ensure that, provided the inexactness is controlled according to specified rules, the method retains provable global convergence (and in some cases, superlinear local convergence).

For example, in randomized block or coordinate methods, expected function decrease per iteration under inexact updates satisfies a recurrence of the form

$\mathbb{E}[\xi_{k+1}\mid x_k] \leq (1+\alpha)\xi_k - \frac{\xi_k^2}{c} + \beta,$

with clear bounds on iteration complexity required to achieve $\epsilon$ -accuracy (Tappenden et al., 2013). Similarly, in global inexact frameworks (e.g., inexact augmented Lagrangian or projected Newton (Eckstein et al., 14 Mar 2025, Pötzl et al., 2022)), sufficient decrease is ensured by enforcing a relative error or subgradient criterion, for example:

$\| \Delta x(\omega) - \Delta s(\omega) \|_X \leq \eta \| \Delta x(\omega) \|_X$

or comparison against a Cauchy/proximal step.

When the update seeks partial or full orthogonality, an additional (approximate) orthogonality condition may be enforced. For instance, randomized sketch-and-project methods in (Loizou et al., 2019) prove that, if the inexact error $\epsilon_k$ is orthogonal (in the relevant $B$ -inner product) to the primary descent direction, then the error does not disrupt geometric convergence:

$\mathbb{E}[ \| x_k - x^* \|_B^2 ] \leq ((\sqrt{\rho} + q)^2)^k \| x_0 - x^* \|_B^2,$

for $q$ controlling the error norm and orthogonality.

4. Preconditioning and Orthogonalization in Inexact Update Algorithms

Preconditioning is often tightly coupled with orthogonalization. In large-scale quadratic or block-structured problems, preconditioning enhances the spectrum of the linearized subproblems solved approximately, thus yielding both faster convergence of the inner (inexact) iterations and improved numerical stability.

For block-angular or block-diagonal systems, explicit preconditioners $P \approx A_i^\top A_i$ (possibly perturbed for invertibility) are used in the inner iterative updates, and the orthogonality of the Krylov basis vectors is indirectly maintained, improving both the conditioning and the practical efficiency (Tappenden et al., 2013).

In settings where the update direction itself is required to be (approximately) orthogonal relative to past directions or a constraint set, as in certain Riemannian or manifold optimization methods, a partial or inexact orthogonalization is performed via randomized subspace projections (Han et al., 18 May 2025) or via approximate iterative methods on the manifold (without full retraction or QR/SVD).

5. Application Examples and Algorithmic Variants

Instances of the inexact orthogonalized update framework include:

Inexact block coordinate descent: Approximate block updates with preconditioning, where the solution direction may be further orthogonalized with respect to previous steps or subspaces (Tappenden et al., 2013).
Inexact stochastic sketch-and-project: Updates performed by projecting onto randomly sketched affine spaces, where the projection (ideally orthogonal) is approximated by an inner solver, and the error may be forcibly orthogonal to the main update direction (Loizou et al., 2019).
Inexact Newton and Proximal Newton methods in Hilbert spaces: Updates approximated under explicit relative or model-based inexactness criteria (Pötzl et al., 2022), easily transferred to block or orthogonality-constrained settings.
Inexact augmented Lagrangian methods: Subproblem solutions (e.g., dual proximal steps in ADMM-like methods) computed via accelerated proximal-gradient loops, with enforced relative (often model-based) error criteria and parameter-dependent multiplier relaxation (Eckstein et al., 14 Mar 2025).
Optimization with orthogonality constraints: Updates restricted to randomly sampled low-dimensional submanifolds; the computational cost is reduced by parameterizing the update as an action by a random orthogonal matrix on the current iterate (Han et al., 18 May 2025).

A spectrum of possible error criteria is used, including but not limited to:

Primal/dual gap conditions,
Model-based quadratic decrease comparisons,
Orthogonality or residual-based inner product constraints.

6. Tradeoffs, Limitations, and Strategy Design

The primary tradeoff in inexact orthogonalized updates is between per-iteration computational cost and the rate (and robustness) of convergence:

Tighter error tolerance or more nearly exact orthogonalization leads to faster ultimate convergence, but higher computational cost per update.
Looser error tolerance reduces complexity per update, but may slow convergence or require more outer iterations; extreme inexactness can, in the worst case, impair convergence guarantees.

Preconditioning and selection of stopping criteria are central to balancing this tradeoff: efficient preconditioners and adaptive error control (for example, proportional to function suboptimality) typically yield the best empirical performance.

Further, the structure of the problem (e.g., block diagonal, low rank, sparsity) profoundly impacts computation. For instance, problems with favorable block structure benefit particularly from preconditioned inexact updates, while high-dimensional orthogonality-constrained problems favor updates restricted to random submanifolds or implicit approximations to traditional retractions.

The extension to "inexact orthogonalized" updates rather than generic inexact updates is motivated when the goal is to ensure, up to error tolerance, that search directions remain decorrelated, orthogonal, or well-conditioned, which helps numerical stability and improves convergence in ill-conditioned or structured optimization landscapes.

7. Theoretical and Practical Significance

The inexact orthogonalized update framework generalizes classical exact-update paradigms, offering a flexible theoretical toolkit for scalable optimization. By quantifying and controlling inexactness—through per-block, global, or orthogonality-aware error criteria—these methods enable practitioners to trade precision for efficiency in algorithm design. Formal convergence and iteration-complexity results confirm that, as long as error tolerances are chosen judiciously, the advantages of exact update schemes are largely retained in practice.

This approach is widely applicable in convex programming, machine learning, signal processing, and large-scale scientific computing, especially where inner solves (projection, subproblem minimization, orthogonalization) would otherwise dominate the computation. The framework naturally informs the design of hybrid algorithms, adaptive solvers, and preconditioned or randomized updates with explicit or implicit geometric constraints.