
UTV-based TT Decomposition

Updated 28 November 2025
  • UTV-based TT Decomposition is a method that replaces traditional SVD with rank-revealing UTV factorizations to construct tensor trains with reduced computational cost.
  • It leverages both deterministic (ULV/URV) and randomized (randUTV) algorithms to maintain approximation quality while enhancing parallelism and facilitating early stopping.
  • Practical applications include color image compression and MRI tensor completion, achieving similar reconstruction quality to TT-SVD with significantly lower runtime.

UTV-based TT (Tensor Train) decomposition denotes a class of algorithms for computing tensor train representations of high-dimensional tensors by systematically replacing the SVD steps of traditional TT-SVD with (possibly randomized) rank-revealing UTV matrix factorizations. The key insight is that the triangular factors of a UTV provide the necessary rank truncation and orthogonality properties at a fraction of the computational cost, benefiting in particular from superior parallelizability and early stopping, while retaining essentially the same approximation quality as TT-SVD. The approach encompasses both deterministic (ULV/URV) and randomized (randUTV) factorizations, supporting both left- and right-orthogonal TT core extraction and flexible error control (Wang et al., 14 Jan 2025, Martinsson et al., 2017).

1. Tensor Train Decomposition and UTV Preliminaries

Let $\mathcal{A}\in\mathbb{R}^{I_1 \times \cdots \times I_d}$ be a $d$-way tensor. The tensor train decomposition writes

$$\mathcal{A}_{i_1\cdots i_d} = \sum_{\alpha_0,\dots,\alpha_d} \mathcal{G}^{(1)}_{\alpha_0,i_1,\alpha_1} \cdots \mathcal{G}^{(d)}_{\alpha_{d-1},i_d,\alpha_d},$$

where $\alpha_0 = \alpha_d = 1$ and $\mathcal{G}^{(k)}\in\mathbb{R}^{r_{k-1}\times I_k\times r_k}$ are the TT cores.

The classical TT-SVD computes these cores via a sequence of SVDs on unfolding matrices: at each step, the tensor is reshaped into a matrix, the dominant singular vectors are computed, a core is extracted, and the sweep proceeds to the next mode.
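
The sweep is easy to state in code. The following is a minimal NumPy sketch of fixed-rank TT-SVD; the function name `tt_svd` and the fixed-rank interface are illustrative rather than taken from the cited papers.

```python
import numpy as np

def tt_svd(A, ranks):
    """Fixed-rank TT-SVD sketch. A: d-way array; ranks: [r_1, ..., r_{d-1}]."""
    d, dims = A.ndim, A.shape
    cores, r_prev, C = [], 1, A
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)                 # k-th unfolding
        U, s, Vt = np.linalg.svd(C, full_matrices=False)    # dominant subspace
        r = ranks[k]
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))  # left-orthogonal core
        C = s[:r, None] * Vt[:r]                            # remainder for next mode
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))            # final core
    return cores
```

Contracting the cores sequentially recovers $\mathcal{A}$ up to the truncation error.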

The UTV decomposition of a matrix $A\in\mathbb{R}^{m\times n}$ is

$$A = UTV^{\top},$$

where $U\in\mathbb{R}^{m\times m}$ and $V\in\mathbb{R}^{n\times n}$ are orthogonal, and $T$ is triangular (upper-triangular gives URV, lower-triangular gives ULV). For truncated or blockwise UTV, especially with randomized variants (e.g., randUTV), the factorization can rapidly reveal approximate ranks and principal subspaces without the overhead of a full SVD (Martinsson et al., 2017).
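
As a concrete instance, QR with column pivoting already yields a simple rank-revealing URV: from $A\Pi = QR$ one has $A = QR\Pi^{\top}$, i.e. $U = Q$, $T = R$, and $V = \Pi$ a permutation matrix. A minimal SciPy sketch (the helper name `utv_qrcp` is ours):

```python
import numpy as np
from scipy.linalg import qr

def utv_qrcp(A, r):
    """Rank-r truncated URV via column-pivoted QR: A ~= U @ T @ V.T,
    with U orthonormal (m x r), T upper trapezoidal (r x n), V a permutation (n x n)."""
    Q, R, piv = qr(A, mode='economic', pivoting=True)  # A[:, piv] = Q @ R
    V = np.eye(A.shape[1])[:, piv]                     # permutation as orthogonal V
    return Q[:, :r], R[:r], V
```

Pivoted QR is the crudest such engine; QLP and randUTV sharpen the rank-revealing quality of the triangular factor.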

2. TT-UTV Algorithms and Procedures

In TT-UTV, each SVD in the TT-SVD sweep is replaced by a rank-$r$ truncated UTV factorization. Two canonical variants are distinguished by sweep direction and core structure:

  • TT-ULV: Left-to-right sweep yielding left-orthogonal TT cores (a code sketch follows this list). At step $k$,

    1. Reshape the unfolding matrix $C$ to size $r_{k-1}I_k \times (I_{k+1}\cdots I_d)$.
    2. Compute the truncated ULV: $C \approx \widehat{U}^{(k)} \widehat{L}^{(k)} \widehat{V}^{(k)\top}$, with $\widehat{L}^{(k)}\in\mathbb{R}^{r_k\times r_k}$.
    3. Set the core $\mathcal{G}^{(k)} \leftarrow \mathrm{reshape}(\widehat{U}^{(k)}, [r_{k-1}, I_k, r_k])$.
    4. Form the next unfolding $C \leftarrow \widehat{L}^{(k)} \widehat{V}^{(k)\top}$.
  • TT-URV: Right-to-left sweep yielding right-orthogonal TT cores via the dual procedure with URV.
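
The following is a minimal NumPy/SciPy sketch of the TT-ULV sweep, using Stewart's pivoted QLP factorization as the ULV engine. The helper names (`ulv_qlp`, `tt_ulv`) and the fixed-rank interface are illustrative, not the exact routines of Wang et al.:

```python
import numpy as np
from scipy.linalg import qr

def ulv_qlp(C, r):
    """Rank-r truncated ULV via pivoted QLP: C ~= U @ L @ V.T, L lower triangular (r x r)."""
    Q1, R, piv = qr(C, mode='economic', pivoting=True)  # C[:, piv] = Q1 @ R
    Q2, Lt = qr(R.T, mode='economic')                   # R = Lt.T @ Q2.T  (LQ of R)
    V = np.empty_like(Q2)
    V[piv] = Q2                                         # undo the column pivoting
    return Q1[:, :r], Lt.T[:r, :r], V[:, :r]

def tt_ulv(A, ranks):
    """TT-ULV sketch: left-to-right sweep producing left-orthogonal cores."""
    d, dims = A.ndim, A.shape
    cores, r_prev, C = [], 1, A
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)                 # step 1: unfolding
        U, L, V = ulv_qlp(C, ranks[k])                      # step 2: truncated ULV
        cores.append(U.reshape(r_prev, dims[k], ranks[k]))  # step 3: extract core
        C = L @ V.T                                         # step 4: next unfolding
        r_prev = ranks[k]
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores
```

Swapping `ulv_qlp` for a randUTV routine changes only the factorization call; the sweep structure is untouched.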

Randomized UTV algorithms, such as randUTV, are incorporated as the UTV engine. At each block step, randomized projections and QR factorizations iteratively isolate leading subspaces, with embedded SVDs for the final block diagonal, supporting early stopping once a global tolerance is reached (Martinsson et al., 2017).
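
The randomized building block is the classical range finder with power iteration, which underlies the expectation bound quoted in Section 3. A minimal sketch (this is only the sampling stage, not the full blocked randUTV with its per-block pivoting and small SVDs):

```python
import numpy as np

def row_range_finder(A, b, q=1, seed=0):
    """Orthonormal basis Q (n x b) for the dominant row space of A, so A ~= (A @ Q) @ Q.T.
    q power iterations sharpen the basis when the singular values decay slowly."""
    rng = np.random.default_rng(seed)
    Y = A.T @ rng.standard_normal((A.shape[0], b))  # random sample of the row space
    for _ in range(q):
        Y = A.T @ (A @ Y)                           # power iteration
    Q, _ = np.linalg.qr(Y)
    return Q

# Residual after projecting onto the sampled subspace:
# err = np.linalg.norm(A - (A @ Q) @ Q.T, 2)
```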

When targeting a prescribed global error $\varepsilon$ (in the Frobenius norm), the local truncation tolerance for each UTV step is set to $\delta = \varepsilon / (\sqrt{d-1}\, \|\mathcal{A}\|_F)$ (Wang et al., 14 Jan 2025). For variable local ranks, the ranks are chosen to match the (approximate) spectral decay thresholds at each step.
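
This choice of $\delta$ is exactly what the per-step error aggregation of Section 3 requires: reading $\delta$ as a relative per-step tolerance, so that each local truncation error obeys $\varepsilon_k \leq \delta\,\|\mathcal{A}\|_F$, gives

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \varepsilon_k^2} \leq \sqrt{d-1}\,\delta\,\|\mathcal{A}\|_F = \varepsilon.$$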

3. Error Bounds and Theoretical Guarantees

Let $\varepsilon_k$ denote the local UTV-truncation error at the $k$th unfolding. The resulting TT-UTV approximation $\widehat{\mathcal{A}}$ satisfies

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \varepsilon_k^2}$$

in the left-orthogonal (ULV) scheme, with an analogous bound for right-orthogonal (URV), matching the corresponding TT-SVD error bound

$$\|\mathcal{A} - \widehat{\mathcal{A}}\|_F \leq \sqrt{\sum_{k=1}^{d-1} \sigma_{r_k+1}(\mathbf{A}_k)^2}$$

when the UTV step is effectively rank-revealing ($\varepsilon_k = O(\sigma_{r_k+1})$).

For randomized UTV (randUTV),

$$\mathbb{E}\,\|A - A QQ^{*}\|_2 \leq \left[(\sigma_{b+1}^2 / \sigma_1^2)^{q}\, \sigma_{b+1}\right] \cdot C$$

with $q$ power iterations and block size $b$, nearly matching randomized SVD bounds.

Using the correct UTV sweep (ULV for left-to-right, URV for right-to-left) ensures decoupling between core orthogonality and the error in the tensor train, allowing for the sharp $L_2$ aggregation of per-step errors. Reversing sweeps degrades the error estimate to a cruder sum-of-norms bound (Wang et al., 14 Jan 2025).

4. Computational Complexity and Parallelism

The memory and storage profile of TT-UTV is identical to that of TT-SVD: TT cores of size $r_{k-1}\times I_k \times r_k$ are stored at each mode.

  • Flop Count: For mode $k$, TT-SVD costs $O(r_{k-1}I_k\,(I_{>k})^2)$ per SVD (where $I_{>k}=I_{k+1}\cdots I_d$), whereas UTV-based methods (e.g., QR/randUTV) achieve the same up to a constant-factor reduction ($\approx 1/2$ for classic UTV, $(5+2q)/2$ for randUTV).
  • Parallelism: UTV-based routines (especially randUTV) concentrate computation in matrix-matrix multiplies (BLAS-3), which make optimal use of multi-core and GPU resources. This reduces wall-clock time relative to SVD-based methods, where a significant portion of the work is spent in BLAS-2 operations or pivoting (Martinsson et al., 2017).

5. Empirical Results and Applications

Numerical experiments demonstrate that TT-UTV (both ULV and URV) yields the same exponential decay of error with respect to rank as TT-SVD for structured tensors (e.g., Hilbert tensor families). In applied settings:

  • Color Image Compression: For 4th-order RGB image tensors ($324\times486\times3$, ranks $\approx (1,15,45,25,1)$), both TT-SVD and TT-UTV (ULV/URV, including randUTV) achieve roughly $5{:}1$ compression with indistinguishable reconstruction quality; relative squared error (RSE) curves nearly overlap.
  • MRI Tensor Completion: In Riemannian gradient-based tensor completion (RGrad), substituting TT-SVD with TT-ULV or TT-URV cores leaves convergence behavior and recovery accuracy unchanged. At 70% missing data, all methods yield RSE $\approx 0.10$ and PSNR $\approx 30$ dB within roughly 20 iterations (Wang et al., 14 Jan 2025); both metrics are sketched below.
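
For reference, a minimal sketch of the two metrics under their conventional definitions (we assume RSE is the relative Frobenius-norm error and that PSNR is computed against a known peak intensity; the cited experiments may normalize slightly differently):

```python
import numpy as np

def rse(A, A_hat):
    """Relative error ||A - A_hat||_F / ||A||_F."""
    return np.linalg.norm(A - A_hat) / np.linalg.norm(A)

def psnr(A, A_hat, peak=1.0):
    """Peak signal-to-noise ratio in dB, for intensities scaled to [0, peak]."""
    mse = np.mean((A - A_hat) ** 2)
    return 10.0 * np.log10(peak**2 / mse)
```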

The empirical evidence indicates that TT-UTV is a computationally efficient, drop-in replacement for TT-SVD with near-identical functional performance across a range of problem domains (Wang et al., 14 Jan 2025, Martinsson et al., 2017).

6. Practical Guidelines and Recommendations

  • Core Orthogonality: Use TT-ULV (left-to-right sweep) for left-orthogonal cores and TT-URV (right-to-left sweep) for right-orthogonal cores; do not swap sweep directions for a given UTV type.
  • Error Targeting: For a prescribed global approximation error $\varepsilon$, set the per-step tolerance to $\delta = \varepsilon / (\sqrt{d-1}\, \|\mathcal{A}\|_F)$.
  • Rank Selection: Choose the stepwise ranks $r_k$ so that $\|\mathbf{E}^{(k)}\| \approx \sigma_{r_k+1}(\mathbf{A}_k)$, aligning with the singular value decay (see the sketch after this list).
  • Algorithm Choice: Any mature rank-revealing UTV can be used: QR with column pivoting, QLP, or randomized approaches such as randUTV.
  • Limitations: The quality of TT-UTV depends on the rank-revealing fidelity of the UTV engine; poor pivoting or block choices can degrade approximation quality.
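
A minimal sketch of tolerance-driven rank selection from the triangular factor, assuming the magnitudes of $\mathrm{diag}(L)$ track the singular values of the unfolding, which is precisely the rank-revealing property the UTV engine must supply (the helper name `choose_rank` is ours):

```python
import numpy as np

def choose_rank(L, delta, norm_A):
    """Smallest rank r whose estimated tail energy fits the per-step budget delta * ||A||_F.
    Uses |diag(L)| as a surrogate for the singular values of the current unfolding."""
    d = np.abs(np.diag(L))
    tail = np.sqrt(np.cumsum(d[::-1] ** 2))[::-1]  # tail[r] ~ error if truncated to rank r
    hits = np.nonzero(tail <= delta * norm_A)[0]
    return max(int(hits[0]), 1) if hits.size else len(d)
```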

7. Summary Table: TT-SVD vs TT-UTV Approaches

| Aspect | TT-SVD | TT-UTV (ULV/URV, randUTV) |
|---|---|---|
| Core Extraction | Full (truncated) SVD at each sweep step | Truncated UTV (rank-revealing triangular) |
| Parallelism | Moderate (BLAS-2 + BLAS-3) | High (BLAS-3 dominant) |
| Error Bounds | $\sqrt{\sum_k \sigma_{r_k+1}^2}$ | $\sqrt{\sum_k \varepsilon_k^2}$, matches SVD |
| Flexibility | SVD only | Any UTV: QR/QLP/randUTV |
| Run Time (wall) | Higher, esp. on multicore/GPU | Lower, especially with blocked/randUTV |

TT-UTV is an efficient, structure-preserving alternative to TT-SVD, offering substantial computational savings and high parallel efficiency, while maintaining rigorous approximation quality through sharp error bounds, particularly when employing randomized UTV engines such as randUTV (Wang et al., 14 Jan 2025, Martinsson et al., 2017).
