
cuFasterTucker: A Stochastic Optimization Strategy for Parallel Sparse FastTucker Decomposition on GPU Platform (2210.06014v1)

Published 12 Oct 2022 in cs.DC

Abstract: The size of scientific data is growing at an unprecedented rate. Data in the form of tensors are high-order, high-dimensional, and highly sparse. Although tensor-based analysis methods are very effective, the large increase in data size makes the original tensor impossible to process. Tensor decomposition factors a tensor into multiple low-rank matrices or tensors that tensor-based analysis methods can exploit. Tucker decomposition is one such algorithm: it decomposes an $n$-order tensor into $n$ low-rank factor matrices and a low-rank core tensor. However, most Tucker decomposition methods produce huge intermediate variables and carry a heavy computational load, which makes them unable to process high-order and high-dimensional tensors. In this paper, we propose FasterTucker decomposition, based on FastTucker decomposition, which is a variant of Tucker decomposition. We also propose cuFasterTucker, an efficient parallel FasterTucker decomposition algorithm for GPU platforms. It has very low storage and computational requirements, and it effectively solves the problem of decomposing high-order and high-dimensional sparse tensors. Compared with the state-of-the-art algorithm, it achieves speedups of around $15\times$ and $7\times$ in updating the factor matrices and updating the core matrices, respectively.
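
As a rough illustration of the structure the abstract describes, the sketch below evaluates a single entry of a 3-order tensor in Tucker form, first with a dense core tensor and then directly from "core matrices". It is not the paper's algorithm and contains no GPU code or stochastic updates. It assumes, beyond what the abstract states, that the core matrices represent the core tensor as a sum of $R$ rank-one terms (a Kruskal form); the function names, sizes, and the value of `R` are hypothetical.

```python
import numpy as np

def tucker_entry(core, factors, index):
    """Entry of a Tucker-format tensor: contract the core with one row per factor matrix."""
    out = core
    for A, i in zip(factors, index):
        # Contract the leading remaining mode with row i of this factor matrix.
        out = np.tensordot(A[i], out, axes=(0, 0))
    return float(out)

def kruskal_core(core_mats):
    """Dense core built from core matrices B_n (J_n x R) as a sum of R rank-one tensors.
    Assumption: this is how the core matrices relate to the core tensor."""
    R = core_mats[0].shape[1]
    core = np.zeros(tuple(B.shape[0] for B in core_mats))
    for r in range(R):
        term = core_mats[0][:, r]
        for B in core_mats[1:]:
            term = np.multiply.outer(term, B[:, r])
        core += term
    return core

def fast_entry(core_mats, factors, index):
    """Same entry without forming the core: sum_r prod_n (a_n[i_n] @ B_n[:, r])."""
    prod = np.ones(core_mats[0].shape[1])
    for A, B, i in zip(factors, core_mats, index):
        prod *= A[i] @ B  # length-R vector for this mode
    return float(prod.sum())

rng = np.random.default_rng(0)
dims, ranks, R = (5, 6, 7), (2, 3, 4), 3  # hypothetical sizes
factors = [rng.standard_normal((I, J)) for I, J in zip(dims, ranks)]
core_mats = [rng.standard_normal((J, R)) for J in ranks]
core = kruskal_core(core_mats)

index = (1, 2, 3)
dense_value = tucker_entry(core, factors, index)
fast_value = fast_entry(core_mats, factors, index)
print(np.isclose(dense_value, fast_value))
```

Under this assumption, the dense evaluation touches every element of the core (on the order of $\prod_n J_n$ operations per entry), while the core-matrix form needs on the order of $R \sum_n J_n$, which suggests why avoiding an explicit core helps for high-order tensors.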
