A Practical Mode-parallel Implementation of the (H-)Tucker Decomposition via Randomization

Published 22 Mar 2026 in math.NA | (2603.21379v1)

Abstract: In recent decades, tensors have emerged as the right tool to represent multidimensional data in a compact yet informative manner. Moreover, it is well known that low-rank factorizations of such tensors can often unveil hidden structure in the data, mainly due to unexpected dependencies among the different variables encoded in the given tensor. However, computing these factorizations is extremely energy-consuming and memory-demanding, especially for high-dimensional tensors, namely those with a large number of modes. In this paper we focus on two state-of-the-art tensor decompositions: the Tucker and H-Tucker decompositions. We propose novel numerical strategies able to perform these factorizations in a \emph{mode-parallel} fashion, that is, the operations required by the algorithm along all modes are performed in parallel. This is in contrast to many procedures available in the literature, which parallelize some of the operations along each mode, e.g., tensor-times-matrix steps, while still visiting one mode at a time in a sequential manner. Our strategies make use of cutting-edge randomization techniques comprising fiber sampling and randomized range-finding steps. We provide upper bounds on the expected value of the error incurred by our factorizations, while a set of numerical results showcases the potential of our approach in reducing both the running time and the storage demand of the whole procedure. Moreover, experiments carried out in HPC environments illustrate the good scaling of our mode-parallel approach.
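
To make the mode-parallel idea concrete, the following is a minimal NumPy sketch, not the algorithm proposed in the paper. It assumes a generic randomized range finder (Gaussian sketch, QR, small SVD) applied to each mode unfolding of the original tensor, so that no mode depends on the result of another; the fiber-sampling step mentioned in the abstract is omitted. The function names (`unfold`, `mode_factor`, `mode_parallel_tucker`, `reconstruct`), the oversampling value, and the use of a thread pool to dispatch the modes are all illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def unfold(tensor, mode):
    """Mode unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_factor(tensor, mode, rank, oversample=5, seed=0):
    """Randomized range finder for one mode (hypothetical helper).

    Only the original tensor is read, so calls for different modes
    are independent of each other.
    """
    rng = np.random.default_rng(seed)
    mat = unfold(tensor, mode)
    k = min(rank + oversample, mat.shape[0])
    omega = rng.standard_normal((mat.shape[1], k))   # Gaussian test matrix
    q, _ = np.linalg.qr(mat @ omega)                 # orthonormal basis of the sketch
    u_small, _, _ = np.linalg.svd(q.T @ mat, full_matrices=False)
    return q @ u_small[:, :rank]                     # truncate back to the target rank


def mode_parallel_tucker(tensor, ranks, oversample=5):
    """Compute all factor matrices concurrently, then form the core."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(mode_factor, tensor, m, r, oversample, m)
            for m, r in enumerate(ranks)
        ]
        factors = [f.result() for f in futures]
    core = tensor
    for m, u in enumerate(factors):
        # Tensor-times-matrix with u^T along mode m.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, m)), 0, m)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    out = core
    for m, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=(1, m)), 0, m)
    return out
```

In this sketch the factor computations are the only mode-parallel part; the core is still formed by a sequential loop of tensor-times-matrix products, and a thread pool stands in for whatever distributed setting an HPC implementation would use.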