
UTV-based TT Decomposition

Updated 28 November 2025
  • UTV-based TT Decomposition is a method that replaces traditional SVD with rank-revealing UTV factorizations to construct tensor trains with reduced computational cost.
  • It leverages both deterministic (ULV/URV) and randomized (randUTV) algorithms to maintain approximation quality while enhancing parallelism and facilitating early stopping.
  • Practical applications include color image compression and MRI tensor completion, achieving similar reconstruction quality to TT-SVD with significantly lower runtime.

UTV-based TT (Tensor Train) decomposition denotes a class of algorithms for computing tensor train representations of high-dimensional tensors by systematically replacing the SVD steps of traditional TT-SVD with (possibly randomized) rank-revealing UTV matrix factorizations. The key insight is that the triangular factors of a UTV provide the necessary rank truncation and orthogonality properties at a fraction of the computational cost, benefiting in particular from superior parallelizability and early stopping, while retaining essentially the same approximation quality as TT-SVD. The approach encompasses both deterministic (ULV/URV) and randomized (randUTV) factorizations, supporting both left- and right-orthogonal TT core extraction and flexible error control (Wang et al., 14 Jan 2025, Martinsson et al., 2017).

1. Tensor Train Decomposition and UTV Preliminaries

Let $\mathcal{A}\in\mathbb{R}^{I_1 \times \cdots \times I_d}$ be a $d$-way tensor. The tensor train decomposition writes

$$\mathcal{A}_{i_1\cdots i_d} = \sum_{\alpha_0,\dots,\alpha_d} \mathcal{G}^{(1)}_{\alpha_0,i_1,\alpha_1} \cdots \mathcal{G}^{(d)}_{\alpha_{d-1},i_d,\alpha_d},$$

where $\alpha_0 = \alpha_d = 1$ and $\mathcal{G}^{(k)}\in\mathbb{R}^{r_{k-1}\times I_k\times r_k}$ are the TT cores.

The classical TT-SVD computes these cores via a sequence of SVDs on unfolding matrices: at each step, the tensor is reshaped into a matrix, the dominant singular vectors are computed, a core is extracted, and the sweep proceeds to the next mode.
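
The sweep is easy to state in code. The following is a minimal NumPy sketch of fixed-rank TT-SVD; the function name `tt_svd` and the fixed-rank interface are illustrative rather than taken from the cited papers.

```python
import numpy as np

def tt_svd(A, ranks):
    """Fixed-rank TT-SVD sketch. A: d-way array; ranks: [r_1, ..., r_{d-1}]."""
    d, dims = A.ndim, A.shape
    cores, r_prev, C = [], 1, A
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)                 # k-th unfolding
        U, s, Vt = np.linalg.svd(C, full_matrices=False)    # dominant subspace
        r = ranks[k]
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))  # left-orthogonal core
        C = s[:r, None] * Vt[:r]                            # remainder for next mode
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))            # final core
    return cores
```

Contracting the cores sequentially recovers $\mathcal{A}$ up to the truncation error.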

The UTV decomposition of a matrix $A\in\mathbb{R}^{m\times n}$ is

$$A = UTV^{\top},$$

where $U\in\mathbb{R}^{m\times m}$ and $V\in\mathbb{R}^{n\times n}$ are orthogonal, and $T$ is triangular (upper-triangular gives URV, lower-triangular gives ULV). For truncated or blockwise UTV, especially with randomized variants (e.g., randUTV), the factorization can rapidly reveal approximate ranks and principal subspaces without the overhead of a full SVD (Martinsson et al., 2017).
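
As a concrete instance, QR with column pivoting already yields a simple rank-revealing URV: from $A\Pi = QR$ one has $A = QR\Pi^{\top}$, i.e. $U = Q$, $T = R$, and $V = \Pi$ a permutation matrix. A minimal SciPy sketch (the helper name `utv_qrcp` is ours):

```python
import numpy as np
from scipy.linalg import qr

def utv_qrcp(A, r):
    """Rank-r truncated URV via column-pivoted QR: A ~= U @ T @ V.T,
    with U orthonormal (m x r), T upper trapezoidal (r x n), V a permutation (n x n)."""
    Q, R, piv = qr(A, mode='economic', pivoting=True)  # A[:, piv] = Q @ R
    V = np.eye(A.shape[1])[:, piv]                     # permutation as orthogonal V
    return Q[:, :r], R[:r], V
```

Pivoted QR is the crudest such engine; QLP and randUTV sharpen the rank-revealing quality of the triangular factor.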

2. TT-UTV Algorithms and Procedures

In TT-UTV, each SVD in the TT-SVD sweep is replaced by a rank-$r$ truncated UTV factorization. Two canonical variants are distinguished by sweep direction and core structure:

  • TT-ULV: Left-to-right sweep yielding left-orthogonal TT cores (a code sketch follows this list). At step $k$,

    1. Reshape the unfolding matrix $C$ to size $r_{k-1}I_k \times (I_{k+1}\cdots I_d)$.
    2. Compute the truncated ULV: $C \approx \widehat{U}^{(k)} \widehat{L}^{(k)} \widehat{V}^{(k)\top}$, with $\widehat{L}^{(k)}\in\mathbb{R}^{r_k\times r_k}$.
    3. Set the core $\mathcal{G}^{(k)} \leftarrow \mathrm{reshape}(\widehat{U}^{(k)}, [r_{k-1}, I_k, r_k])$.
    4. Form the next unfolding $C \leftarrow \widehat{L}^{(k)} \widehat{V}^{(k)\top}$.
  • TT-URV: Right-to-left sweep yielding right-orthogonal TT cores via the dual procedure with URV.
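
The following is a minimal NumPy/SciPy sketch of the TT-ULV sweep, using Stewart's pivoted QLP factorization as the ULV engine. The helper names (`ulv_qlp`, `tt_ulv`) and the fixed-rank interface are illustrative, not the exact routines of Wang et al.:

```python
import numpy as np
from scipy.linalg import qr

def ulv_qlp(C, r):
    """Rank-r truncated ULV via pivoted QLP: C ~= U @ L @ V.T, L lower triangular (r x r)."""
    Q1, R, piv = qr(C, mode='economic', pivoting=True)  # C[:, piv] = Q1 @ R
    Q2, Lt = qr(R.T, mode='economic')                   # R = Lt.T @ Q2.T  (LQ of R)
    V = np.empty_like(Q2)
    V[piv] = Q2                                         # undo the column pivoting
    return Q1[:, :r], Lt.T[:r, :r], V[:, :r]

def tt_ulv(A, ranks):
    """TT-ULV sketch: left-to-right sweep producing left-orthogonal cores."""
    d, dims = A.ndim, A.shape
    cores, r_prev, C = [], 1, A
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)                 # step 1: unfolding
        U, L, V = ulv_qlp(C, ranks[k])                      # step 2: truncated ULV
        cores.append(U.reshape(r_prev, dims[k], ranks[k]))  # step 3: extract core
        C = L @ V.T                                         # step 4: next unfolding
        r_prev = ranks[k]
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores
```

Swapping `ulv_qlp` for a randUTV routine changes only the factorization call; the sweep structure is untouched.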

Randomized UTV algorithms, such as randUTV, are incorporated as the UTV engine. At each block step, randomized projections and QR factorizations iteratively isolate leading subspaces, with embedded SVDs for the final block diagonal, supporting early stopping once a global tolerance is reached (Martinsson et al., 2017).
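
The randomized building block is the classical range finder with power iteration, which underlies the expectation bound quoted in Section 3. A minimal sketch (this is only the sampling stage, not the full blocked randUTV with its per-block pivoting and small SVDs):

```python
import numpy as np

def row_range_finder(A, b, q=1, seed=0):
    """Orthonormal basis Q (n x b) for the dominant row space of A, so A ~= (A @ Q) @ Q.T.
    q power iterations sharpen the basis when the singular values decay slowly."""
    rng = np.random.default_rng(seed)
    Y = A.T @ rng.standard_normal((A.shape[0], b))  # random sample of the row space
    for _ in range(q):
        Y = A.T @ (A @ Y)                           # power iteration
    Q, _ = np.linalg.qr(Y)
    return Q

# Residual after projecting onto the sampled subspace:
# err = np.linalg.norm(A - (A @ Q) @ Q.T, 2)
```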

When targeting a prescribed global error $\varepsilon$ (in the Frobenius norm), the local truncation tolerance for each UTV step is set to $\delta = \varepsilon / (\sqrt{d-1}\, \|\mathcal{A}\|_F)$ (Wang et al., 14 Jan 2025). For variable local ranks, the ranks are chosen to match the (approximate) spectral decay thresholds at each step.
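
This choice of $\delta$ is exactly what the per-step error aggregation of Section 3 requires: reading $\delta$ as a relative per-step tolerance, so that each local truncation error obeys $\varepsilon_k \leq \delta\,\|\mathcal{A}\|_F$, gives

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \varepsilon_k^2} \leq \sqrt{d-1}\,\delta\,\|\mathcal{A}\|_F = \varepsilon.$$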

3. Error Bounds and Theoretical Guarantees

Let $\varepsilon_k$ denote the local UTV-truncation error at the $k$th unfolding. The resulting TT-UTV approximation $\widehat{\mathcal{A}}$ satisfies

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \varepsilon_k^2}$$

in the left-orthogonal (ULV) scheme, with an analogous bound for right-orthogonal (URV), matching the corresponding TT-SVD error bound

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \sigma_{r_k+1}(\mathbf{A}_k)^2}$$

when the UTV step is effectively rank-revealing ($\varepsilon_k = O(\sigma_{r_k+1})$).

For randomized UTV (randUTV),

$$\mathbb{E}\,\|A - A QQ^{*}\|_2 \leq \left[(\sigma_{b+1}^2 / \sigma_1^2)^{q}\, \sigma_{b+1}\right] \cdot C$$

with $q$ power iterations and block size $b$, nearly matching randomized SVD bounds.

Using the correct UTV sweep (ULV for left-to-right, URV for right-to-left) ensures decoupling between core orthogonality and the error in the tensor train, allowing for the sharp $L_2$ aggregation of per-step errors. Reversing sweeps degrades the error estimate to a cruder sum-of-norms bound (Wang et al., 14 Jan 2025).

4. Computational Complexity and Parallelism

The memory and storage profile of TT-UTV is identical to that of TT-SVD: TT cores of size $r_{k-1}\times I_k \times r_k$ are stored at each mode.

  • Flop Count: For mode $k$, TT-SVD costs $O(r_{k-1}I_k\,(I_{>k})^2)$ per SVD (where $I_{>k}=I_{k+1}\cdots I_d$), whereas UTV-based methods (e.g., QR/randUTV) achieve the same up to a constant-factor reduction ($\approx 1/2$ for classic UTV, $(5+2q)/2$ for randUTV).
  • Parallelism: UTV-based routines (especially randUTV) concentrate computation in matrix-matrix multiplies (BLAS-3), which make optimal use of multi-core and GPU resources. This reduces wall-clock time relative to SVD-based methods, where a significant portion of the work is spent in BLAS-2 operations or pivoting (Martinsson et al., 2017).

5. Empirical Results and Applications

Numerical experiments demonstrate that TT-UTV (both ULV and URV) yields the same exponential decay of error with respect to rank as TT-SVD for structured tensors (e.g., Hilbert tensor families). In applied settings:

  • Color Image Compression: For 4th-order RGB image tensors ($324\times486\times3$, ranks $\approx (1,15,45,25,1)$), both TT-SVD and TT-UTV (ULV/URV, including randUTV) achieve roughly $5{:}1$ compression with indistinguishable reconstruction quality; relative squared error (RSE) curves nearly overlap.
  • MRI Tensor Completion: In Riemannian gradient-based tensor completion (RGrad), substituting TT-SVD with TT-ULV or TT-URV cores leaves convergence behavior and recovery accuracy unchanged. At 70% missing data, all methods yield RSE $\approx 0.10$ and PSNR $\approx 30$ dB within roughly 20 iterations (Wang et al., 14 Jan 2025); both metrics are sketched below.
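
For reference, a minimal sketch of the two metrics under their conventional definitions (we assume RSE is the relative Frobenius-norm error and that PSNR is computed against a known peak intensity; the cited experiments may normalize slightly differently):

```python
import numpy as np

def rse(A, A_hat):
    """Relative error ||A - A_hat||_F / ||A||_F."""
    return np.linalg.norm(A - A_hat) / np.linalg.norm(A)

def psnr(A, A_hat, peak=1.0):
    """Peak signal-to-noise ratio in dB, for intensities scaled to [0, peak]."""
    mse = np.mean((A - A_hat) ** 2)
    return 10.0 * np.log10(peak**2 / mse)
```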

The empirical evidence indicates that TT-UTV is a computationally efficient, drop-in replacement for TT-SVD with near-identical functional performance across a range of problem domains (Wang et al., 14 Jan 2025, Martinsson et al., 2017).

6. Practical Guidelines and Recommendations

  • Core Orthogonality: Use TT-ULV (left-to-right sweep) for left-orthogonal cores and TT-URV (right-to-left sweep) for right-orthogonal cores; do not swap sweep directions for a given UTV type.
  • Error Targeting: For a prescribed global approximation error $\varepsilon$, set the per-step tolerance to $\delta = \varepsilon / (\sqrt{d-1}\, \|\mathcal{A}\|_F)$.
  • Rank Selection: Choose the stepwise ranks $r_k$ so that $\|\mathbf{E}^{(k)}\| \approx \sigma_{r_k+1}(\mathbf{A}_k)$, aligning with the singular value decay (see the sketch after this list).
  • Algorithm Choice: Any mature rank-revealing UTV can be used: QR with column pivoting, QLP, or randomized approaches such as randUTV.
  • Limitations: The quality of TT-UTV depends on the rank-revealing fidelity of the UTV engine; poor pivoting or block choices can degrade approximation quality.
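
A minimal sketch of tolerance-driven rank selection from the triangular factor, assuming the magnitudes of $\mathrm{diag}(L)$ track the singular values of the unfolding, which is precisely the rank-revealing property the UTV engine must supply (the helper name `choose_rank` is ours):

```python
import numpy as np

def choose_rank(L, delta, norm_A):
    """Smallest rank r whose estimated tail energy fits the per-step budget delta * ||A||_F.
    Uses |diag(L)| as a surrogate for the singular values of the current unfolding."""
    d = np.abs(np.diag(L))
    tail = np.sqrt(np.cumsum(d[::-1] ** 2))[::-1]  # tail[r] ~ error if truncated to rank r
    hits = np.nonzero(tail <= delta * norm_A)[0]
    return max(int(hits[0]), 1) if hits.size else len(d)
```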

7. Summary Table: TT-SVD vs TT-UTV Approaches

| Aspect | TT-SVD | TT-UTV (ULV/URV, randUTV) |
|---|---|---|
| Core Extraction | Full (truncated) SVD at each sweep step | Truncated UTV (rank-revealing triangular) |
| Parallelism | Moderate (BLAS-2 + BLAS-3) | High (BLAS-3 dominant) |
| Error Bounds | $\sqrt{\sum_k \sigma_{r_k+1}^2}$ | $\sqrt{\sum_k \varepsilon_k^2}$, matches SVD |
| Flexibility | SVD only | Any UTV: QR/QLP/randUTV |
| Run Time (wall) | Higher, esp. on multicore/GPU | Lower, especially with blocked/randUTV |

TT-UTV is an efficient, structure-preserving alternative to TT-SVD, offering substantial computational savings and high parallel efficiency, while maintaining rigorous approximation quality through sharp error bounds, particularly when employing randomized UTV engines such as randUTV (Wang et al., 14 Jan 2025, Martinsson et al., 2017).
