- The paper introduces the Landing Algorithm, which avoids traditional retraction steps by adding a potential energy term that attracts the iterates back to the orthogonal manifold.
- Each iteration uses only plain matrix multiplications instead of costly linear algebra routines (inverses, square roots, exponentials), reducing per-iteration cost and numerical error.
- Empirical results on high-dimensional problems and deep learning tasks show faster runtimes and tighter orthogonality than retraction-based approaches.
Fast and Accurate Optimization on the Orthogonal Manifold without Retraction
The paper presents a novel optimization algorithm, the Landing Algorithm, designed for solving optimization problems over the manifold of orthogonal matrices. In contrast to traditional methods that rely on computationally expensive retractions, this algorithm uses a potential energy term to drive the iterates toward the orthogonal manifold. The work offers a rigorous analysis of both theoretical and practical aspects of the method, with particular promise for large-scale and deep learning scenarios.
Overview
The optimization problem addressed is the minimization of a differentiable function over the orthogonal manifold, the set of square matrices X satisfying X X^T = I. This class of problems surfaces in many applications, such as principal component analysis, independent component analysis, and deep learning, where orthogonality constraints aid stable and efficient model training. Traditional algorithms apply a retraction step to map each iterate back onto the manifold, which typically involves computationally expensive operations such as matrix inversion, square roots, QR factorizations, or matrix exponentials.
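To make the cost of this step concrete, here is a minimal NumPy sketch of one retraction-based Riemannian gradient step, using a QR retraction as a representative example; the function name and the choice of QR are illustrative, not the specific baseline used in the paper.

```python
import numpy as np

def riemannian_step_qr(X, grad_f, eta):
    """One Riemannian gradient step on the orthogonal manifold,
    followed by a QR-based retraction (illustrative baseline)."""
    G = grad_f(X)
    # Riemannian gradient: the skew-symmetric part of G X^T,
    # mapped back to the tangent space at X.
    A = G @ X.T
    riem_grad = 0.5 * (A - A.T) @ X
    Y = X - eta * riem_grad
    # Retraction: pull Y back onto the manifold with a QR factorization,
    # an O(n^3) routine that parallelizes poorly compared to plain matmuls.
    Q, R = np.linalg.qr(Y)
    d = np.sign(np.diag(R))
    d[d == 0] = 1.0  # fix signs so the factorization is unique
    return Q * d
```

The `np.linalg.qr` call is exactly the kind of bottleneck the landing algorithm removes.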
The Landing Algorithm
The Landing Algorithm is designed to circumvent the computational bottleneck of the retraction step. Instead of retracting, the algorithm allows the iterates to leave the manifold, while a potential energy term gradually attracts them back toward it. One complete iteration consists almost entirely of matrix multiplications, significantly reducing computational overhead compared to retraction-based methods.
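Below is a minimal NumPy sketch of one landing iteration, assuming the commonly presented form of the update, in which the descent direction combines a skew-symmetric relative gradient with the gradient of the potential N(X) = (1/4) ||X X^T - I||^2; the function name `landing_step` and the default weight `lam` are ours.

```python
import numpy as np

def landing_step(X, grad_f, eta, lam=1.0):
    """One landing-style iteration (sketch, not the paper's exact code).

    The descent direction combines a skew-symmetric relative gradient,
    which moves along the manifold, with the gradient of the potential
    N(X) = 1/4 * ||X X^T - I||_F^2, which pulls the iterate back toward
    the manifold. Only matrix multiplications are used: no QR, inverse,
    square root, or matrix exponential.
    """
    G = grad_f(X)
    A = G @ X.T
    psi = 0.5 * (A - A.T)                         # skew-symmetric part
    attract = (X @ X.T - np.eye(X.shape[0])) @ X  # gradient of N(X)
    return X - eta * (psi @ X + lam * attract)
```

Note that every term is a product of n-by-n matrices, which is what makes the update cheap on GPUs and well behaved in low precision.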
Convergence and Efficiency
The key properties of the Landing Algorithm include:
- Orthogonalization: As iterations progress, the distance to the orthogonal manifold shrinks, and the iterates land exactly on it in the limit (see the toy run after this list).
- Simplicity of the update rule: Each update uses only matrix multiplications, with no expensive linear algebra routines.
- Robustness: The approach accumulates smaller numerical errors, which is particularly advantageous in the low-precision arithmetic common in modern deep learning frameworks.
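The landing behavior can be checked numerically. Below is a toy run reusing the hypothetical `landing_step` sketch above, on the nearest-orthogonal-matrix problem f(X) = 0.5 * ||X - M||^2; the step size, weight, and iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))

# Toy objective: find the orthogonal matrix nearest to M,
# f(X) = 0.5 * ||X - M||_F^2, so grad f(X) = X - M.
def grad_f(X):
    return X - M

X = np.eye(n)  # start on the manifold
for k in range(1000):
    X = landing_step(X, grad_f, eta=0.01, lam=1.0)
    if k % 200 == 0:
        # The distance to the manifold stays small and shrinks: "landing".
        print(k, np.linalg.norm(X @ X.T - np.eye(n)))
```

The printed orthogonality error decays toward zero along the run, matching the first property above.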
Numerical Results
Empirical results underscore the advantage of the Landing Algorithm over traditional methods, particularly in high-dimensional settings where retraction computations dominate execution time. Experiments on both small-scale matrix problems and large-scale deep learning tasks (such as training neural networks with orthogonality constraints) show that the Landing Algorithm converges faster and maintains orthogonality more precisely.
Implications and Future Work
The introduction of the Landing Algorithm is poised to impact multiple domains that require optimization over the orthogonal group. Theoretically, it provides an alternative approach to manifold optimization that bypasses the need for retraction mappings, opening new possibilities in algorithm design for constrained optimization problems.
Speculation on Future Directions:
- The principles underlying the Landing Algorithm could be extended to more general Riemannian manifolds, potentially enriching the toolbox available for manifold optimization.
- Further theoretical work might accelerate convergence rates or adapt the algorithm to stochastic settings, which are typical in deep learning.
In summary, this paper presents a compelling case for the Landing Algorithm as a viable and efficient alternative to traditional methods for orthogonality-constrained optimization. It thus lays a foundation for further exploration, both in theory and in practical implementations across diverse research fields.