Rank-One Tensor Updates
- Rank-One Tensor Updates are methods that approximate a tensor by iteratively adding or subtracting an outer product of vectors, providing a lower-rank representation.
- They underpin tensor decomposition algorithms such as ALS, ASVD, and their modified variants, offering efficient strategies for best rank-one approximations.
- These updates play a pivotal role in iterative deflation and tensor learning, where convergence, rank behavior, and theoretical guarantees are essential.
A rank-one tensor update is an operation wherein a tensor is approximated or modified by adding, subtracting, or iteratively updating a rank-one tensor—one expressible as an outer product of vectors, $\lambda\, u^{(1)} \otimes u^{(2)} \otimes \cdots \otimes u^{(d)}$. Such updates are at the core of tensor decomposition methods, best-approximation algorithms, and algorithms that successively extract or deflate components from tensors. Rank-one updates generalize the matrix notion of rank-one corrections (for instance, the role played by singular vector pairs in the SVD) to the multilinear tensor setting, with both algorithmic and theoretical distinctions from the matrix case. The following sections synthesize rigorous developments from the analysis, algorithmics, and applications of rank-one updates in high-order tensor analysis.
1. Problem Formulation and Critical Points
Let $\mathcal{A} \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ be a real, order-$d$ tensor. The best rank-one approximation problem is formulated as
$$\min_{\lambda \in \mathbb{R},\ \|u^{(1)}\| = \cdots = \|u^{(d)}\| = 1} \left\| \mathcal{A} - \lambda\, u^{(1)} \otimes \cdots \otimes u^{(d)} \right\|_F.$$
For fixed, unit-length $u^{(1)}, \dots, u^{(d)}$, the optimal $\lambda$ is the multilinear contraction $\lambda = \mathcal{A}(u^{(1)}, \dots, u^{(d)}) = \sum_{i_1, \dots, i_d} \mathcal{A}_{i_1 \cdots i_d}\, u^{(1)}_{i_1} \cdots u^{(d)}_{i_d}$. Thus, finding the best rank-one approximation reduces to
$$\max_{\|u^{(1)}\| = \cdots = \|u^{(d)}\| = 1} f(u^{(1)}, \dots, u^{(d)}), \qquad f(u^{(1)}, \dots, u^{(d)}) = \mathcal{A}(u^{(1)}, \dots, u^{(d)}).$$
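The closed-form choice of $\lambda$ implies that subtracting the resulting rank-one term removes exactly $\lambda^2$ from the squared Frobenius norm. A minimal NumPy sketch of this reduction for a third-order tensor (variable names are illustrative, not from the cited papers):

```python
import numpy as np

def contract(A, u, v, w):
    """Multilinear contraction A(u, v, w) for a third-order tensor."""
    return np.einsum('ijk,i,j,k->', A, u, v, w)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
u = rng.standard_normal(4); u /= np.linalg.norm(u)
v = rng.standard_normal(5); v /= np.linalg.norm(v)
w = rng.standard_normal(6); w /= np.linalg.norm(w)

# Optimal scalar for fixed unit vectors is the contraction itself.
lam = contract(A, u, v, w)
resid = A - lam * np.einsum('i,j,k->ijk', u, v, w)

# ||A - lam*u⊗v⊗w||^2 = ||A||^2 - lam^2 when u, v, w are unit vectors.
print(np.allclose(np.linalg.norm(resid)**2,
                  np.linalg.norm(A)**2 - lam**2))  # prints True
```

This identity is why maximizing the contraction $f$ over unit vectors is equivalent to minimizing the residual norm.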
Critical points of $f$ satisfy coupled singular-vector equations:
$$\mathcal{A}(u^{(1)}, \dots, u^{(k-1)}, \cdot\,, u^{(k+1)}, \dots, u^{(d)}) = \sigma\, u^{(k)}, \qquad k = 1, \dots, d,$$
with $\sigma \neq 0$ (Friedland et al., 2011).
Such critical tuples define the singular vector tuples of $\mathcal{A}$, which, under normalization, yield the singular value $\sigma$ and the critical rank-one approximation $\sigma\, u^{(1)} \otimes \cdots \otimes u^{(d)}$. These tuples are central in determining the stationary points of the Euclidean distance function to the variety of rank-one tensors (Ribot et al., 2024), and in the algebraic study of the stratification of tensor spaces (Horobet et al., 2023).
2. Algorithms for Best Rank-One Updates
Several alternating and all-at-once algorithms have been developed for finding or updating the best rank-one approximation.
Alternating Least Squares (ALS)
The standard ALS method cyclically optimizes over each factor $u^{(k)}$ while holding the others fixed:
$$u^{(k)} \leftarrow \frac{\mathcal{A}(u^{(1)}, \dots, u^{(k-1)}, \cdot\,, u^{(k+1)}, \dots, u^{(d)})}{\left\|\mathcal{A}(u^{(1)}, \dots, u^{(k-1)}, \cdot\,, u^{(k+1)}, \dots, u^{(d)})\right\|}.$$
ALS generates a monotone, nondecreasing sequence of objective values $f$ and converges to a stationary point, though potentially sublinearly. The rate depends on local curvature: convergence can be sublinear, Q-linear, or even Q-superlinear in certain diagonal-structure cases (Espig et al., 2015).
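The ALS cycle for a third-order tensor can be sketched in a few lines of NumPy. This is a minimal illustration of the one-mode-at-a-time scheme, not the implementation from the cited papers; function and variable names are illustrative:

```python
import numpy as np

def rank1_als(A, iters=200, seed=0):
    """One-mode-at-a-time ALS sketch for best rank-one approximation
    of a third-order tensor A. Returns (lam, u, v, w)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(A.shape[0]); u /= np.linalg.norm(u)
    v = rng.standard_normal(A.shape[1]); v /= np.linalg.norm(v)
    w = rng.standard_normal(A.shape[2]); w /= np.linalg.norm(w)
    for _ in range(iters):
        # each micro-step contracts all other modes, then normalizes
        u = np.einsum('ijk,j,k->i', A, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', A, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', A, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', A, u, v, w)
    return lam, u, v, w
```

Each micro-step is the exact maximizer of $f$ over its own mode, which is what makes the objective sequence monotone.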
Alternating Singular Value Decomposition (ASVD)
ASVD improves on ALS by updating two components at a time through matricization and singular vector extraction. For a third-order tensor with current factors $(x, y, z)$, each micro-step fixes one factor, say $x$, forms the contracted matrix $M = \mathcal{A}(x, \cdot\,, \cdot)$, and replaces the remaining pair by the leading singular vector pair of $M$, choosing at each step the mode pair whose update is most incrementally beneficial. ASVD is particularly advantageous for large tensors due to reduced optimization steps and fast local convergence (Friedland et al., 2011).
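A single ASVD-style micro-step for a third-order tensor can be sketched as follows: with one factor fixed, the jointly optimal pair for the other two modes is the leading singular pair of the contracted matrix. This is a minimal sketch under that reading of the method; names are illustrative:

```python
import numpy as np

def asvd_microstep(A, x):
    """With the mode-1 vector x fixed, M = A(x, ·, ·) is a matrix whose
    leading singular pair (y, z) jointly maximizes y^T M z over unit
    vectors. Returns (y, z, attained_value)."""
    M = np.einsum('ijk,i->jk', A, x)
    U, s, Vt = np.linalg.svd(M)
    return U[:, 0], Vt[0], s[0]
```

Compared with two successive ALS micro-steps, one SVD optimizes both remaining modes simultaneously, which is the source of the faster local convergence claimed for ASVD.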
Modified ALS/ASVD (MALS/MASVD)
These variants select the globally best one-mode or two-mode coordinate update at each step, ensuring monotonicity and convergence to semi-maximal points: points that are maximal with respect to every mode or mode-pair coordinate. The guarantee is that any accumulation point is a 1-semi-maximum (MALS) or 2-semi-maximum (MASVD). These modifications incur roughly double the computational cost per iteration (Friedland et al., 2011).
All-at-Once Schemes and Parallel Updates
Recent schemes such as Levenberg–Marquardt (LM) and rotational updates (RORO) attack the full nonlinear least-squares problem for best rank-one approximation using all-mode gradient and Hessian information. These approaches enable parallel updates of all components, which is critical for block-decomposition strategies and parallel computing platforms. Closed-form solutions or polynomial root finding may be harnessed when the core dimensions are small. Variants also exist for updating multiple (e.g., three) modes simultaneously (Phan et al., 2017).
3. Iterative Rank-One Deflation and Decomposition Properties
A central application of rank-one updates in higher-order tensor decompositions is through iterative deflation:
- Compute best rank-one component of current residual.
- Subtract this component.
- Repeat until prescribed error or term limit.
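The three-step deflation loop above can be sketched directly, using a simple ALS subroutine for the inner best rank-one computation (a minimal sketch with illustrative names; note the caveat discussed next, that for order three and higher this need not decrease tensor rank):

```python
import numpy as np

def best_rank1(A, iters=100, seed=0):
    """Simple ALS subroutine for the inner rank-one step (sketch)."""
    rng = np.random.default_rng(seed)
    u, v, w = (rng.standard_normal(n) for n in A.shape)
    for _ in range(iters):
        u = np.einsum('ijk,j,k->i', A, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', A, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', A, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', A, u, v, w)
    return lam, u, v, w

def deflate(A, max_terms=5, tol=1e-8):
    """Iterative rank-one deflation: subtract the current best rank-one
    component until the residual is small or the term limit is reached."""
    R = A.copy()
    components = []
    for _ in range(max_terms):
        lam, u, v, w = best_rank1(R)
        components.append((lam, u, v, w))
        R = R - lam * np.einsum('i,j,k->ijk', u, v, w)
        if np.linalg.norm(R) < tol * np.linalg.norm(A):
            break
    return components, R
```

Each subtraction at a critical point removes $\lambda^2$ from the squared residual norm, so the Frobenius error decreases monotonically even in cases where the tensor rank does not.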
However, unlike the matrix case (Eckart–Young theorem), for tensors of order three or higher, naive critical rank-one subtraction need not decrease rank and can sometimes increase it (Horobet et al., 2023). The set of tensors for which successive deflation yields order-independent decompositions coincides with those admitting a two-orthogonal decomposition—each pair of components must be orthogonal in at least two modes. This is both necessary and sufficient for deflation order-independence (Ribot et al., 2024).
The structure of the set of tensors for which iterative critical rank-one updates terminate in rank-zero (after as many steps as the rank) has been characterized: in the symmetric case it coincides with the set of weakly-orthogonally decomposable tensors, and more generally relates to the stratification of the hyperdeterminant and conormal varieties (Horobet et al., 2023).
4. Role in High-Level Decomposition and Learning Algorithms
Rank-one update schemes are fundamental inner loops for more complex decomposition methods and tensor learning algorithms.
- In CANDECOMP/PARAFAC (CP) decompositions, rank-one update algorithms are used to sequentially extract components or as subroutines in block-coordinate or ADMM-style algorithms (Sterck, 2011, Phan et al., 2017).
- Higher-Order Matching Pursuit (HoMP) algorithms use rank-one selection in pursuit-style approaches. Each step greedily adds a new rank-one term, possibly reorthogonalizing previous atoms, to minimize cost functions associated with completion, regression, or multitask learning. Closed-form selection, aggressive step-size determination, and efficient approximations for best rank-one update are employed (Yang et al., 2015).
- Adaptive filtering and time-series models: use decomposable (rank-one) Volterra kernels and steepest-descent/LMS/TRUE-LMS updates in the filter parameters to estimate nonlinear system dynamics with vastly reduced parameterization and computational cost compared to full tensor models (Pinheiro et al., 2016).
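As an illustration of the pursuit-style use of rank-one updates described above, the following sketches a HoMP-flavored greedy loop for tensor completion: each step fits a rank-one atom to the zero-filled observed residual and then takes an exact line-search step on the observed squared error. This is a simplified sketch in the spirit of (Yang et al., 2015), not their exact algorithm; all names are illustrative:

```python
import numpy as np

def top_rank1(R, iters=50, seed=0):
    """Power-iteration-style approximate best rank-one term (sketch)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(R.shape[1]); v /= np.linalg.norm(v)
    w = rng.standard_normal(R.shape[2]); w /= np.linalg.norm(w)
    for _ in range(iters):
        u = np.einsum('ijk,j,k->i', R, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', R, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', R, u, v); w /= np.linalg.norm(w)
    return u, v, w

def homp_completion(A, mask, n_terms=4):
    """Greedy pursuit sketch for completion: fit a rank-one atom to the
    zero-filled observed residual, then exactly minimize the observed
    squared error along that atom."""
    X = np.zeros_like(A)
    for _ in range(n_terms):
        R = mask * (A - X)                    # residual on observed entries
        u, v, w = top_rank1(R)
        T = np.einsum('i,j,k->ijk', u, v, w)
        # exact line search: min_alpha ||mask * (A - X - alpha*T)||_F^2
        alpha = np.sum(R * T) / np.sum(mask * T * T)
        X = X + alpha * T
    return X
```

The exact line search guarantees the observed error is nonincreasing at every step, which is the mechanism behind the linear convergence rates cited for HoMP.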
The table below summarizes the complexity per step for leading methods:
| Method | Per-Iteration Cost | Targeted Property |
|---|---|---|
| ALS | One all-but-one-mode contraction per factor | Monotonic decrease |
| ASVD | Leading singular pairs of mode-pair unfoldings | Monotonic, faster local convergence |
| MALS, MASVD | ≈2× ALS/ASVD | Semi-maximal guarantee |
| All-at-once LM/RORO | All-mode gradient/Hessian assembly and solve | Parallel, global step |
| HoMP | Approximate leading SVD (few power iterations) | Linear convergence |
5. Theoretical Guarantees and Limitations
- Convergence: ALS and ASVD are globally convergent to stationary points under mild conditions; MASVD/MALS guarantee coordinate-wise semi-maximality (Friedland et al., 2011, Espig et al., 2015).
- Global optimality: Semidefinite relaxations via sum-of-squares (SOS) constraints provide certificates of global optimality when the relaxed solution is rank-one (Nie et al., 2013).
- Order-dependent failure: In higher-order tensors, deflation can fail to reduce rank or be order-dependent unless specific orthogonality criteria (two-orthogonality) are satisfied (Ribot et al., 2024, Horobet et al., 2023).
- Rate: Superlinear convergence is rare and occurs for special diagonal cases; generically linear or sublinear rates are observed (Espig et al., 2015).
- Noisy and overcomplete settings: For incoherent factors, alternating rank-one updates under suitable global initialization (SVD on random slices) achieve polynomial-time recovery up to moderate overcompleteness (Anandkumar et al., 2014).
6. Empirical Performance and Applications
Empirical results demonstrate that:
- ASVD scales best to large tensors, while ALS/MALS may be faster on small problem instances. All methods attain comparable (Frobenius) approximation error; differences appear mainly in computational efficiency (Friedland et al., 2011).
- HoMP achieves linear convergence with explicit rate bounds; few power-iterations suffice in practice to select an approximately best rank-one direction (Yang et al., 2015).
- In adaptive filtering contexts, decomposable (rank-one) kernel updates offer orders-of-magnitude gain in computational complexity without substantial loss in approximation quality, provided the underlying system is close to decomposable (Pinheiro et al., 2016).
- For parallel or distributed computing, all-at-once and block update schemes (e.g., PARO, LM, RORO) enable simultaneous updates of multiple rank-one terms—an essential property for large CP decompositions (Phan et al., 2017).
7. Implications and Future Directions
Rank-one tensor updates serve as foundational primitives for tensor approximation, decomposition, and learning across multilinear algebra, numerical analysis, signal processing, and machine learning. The variance in rank behavior under subtraction, absence of an exact analog of the Eckart–Young theorem, and the intricate algebraic-geometric structure of the underlying varieties motivate further exploration of efficient, globally optimal algorithms and robust theoretical guarantees for overcomplete, noisy, or order-dependent scenarios. Advances in semidefinite programming, tensor invariant theory, and randomized initialization strategies continue to expand the practical applicability and theoretical understanding of rank-one update processes.
References:
- Friedland, Mehrmann, Pajarola, Suter, "On best rank one approximation of tensors" (Friedland et al., 2011)
- Ribot, Horobet, Seigal, Teixeira Turatti, "Decomposing tensors via rank-one approximations" (Ribot et al., 2024)
- De Sterck, "A Nonlinear GMRES Optimization Algorithm for Canonical Tensor Decomposition" (Sterck, 2011)
- Anandkumar et al., "Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-1 Updates" (Anandkumar et al., 2014)
- Espig, Khachatryan, "Convergence of Alternating Least Squares Optimisation for Rank-One Approximation to High Order Tensors" (Espig et al., 2015)
- Phan, Tichavský, Cichocki, "Best Rank-One Tensor Approximation and Parallel Update Algorithm for CPD" (Phan et al., 2017)
- Yang, Mehrkanoon, Suykens, "Higher order Matching Pursuit for Low Rank Tensor Learning" (Yang et al., 2015)
- Nie, Wang, "Semidefinite Relaxations for Best Rank-1 Tensor Approximations" (Nie et al., 2013)
- Horobeț et al., "When does subtracting a rank-one approximation decrease tensor rank?" (Horobet et al., 2023)
- Pinheiro, Lopes, "Nonlinear Adaptive Algorithms on Rank-One Tensor Models" (Pinheiro et al., 2016)