Papers
Topics
Authors
Recent
Search
2000 character limit reached

Hardware-Efficient Mixed-Precision CP Tensor Decomposition

Published 8 Sep 2022 in math.OC | (2209.04003v1)

Abstract: Tensor decomposition has been widely used in machine learning and high-volume data analysis. However, large-scale tensor factorization often consumes huge memory and computing cost. Meanwhile, modernized computing hardware such as tensor processing units (TPU) and Tensor Core GPU has opened a new window of hardware-efficient computing via mixed- or low-precision arithmetic representations. In this paper, we exploit the low-precision representation of tensor factorization, and propose a mixed-precision block stochastic gradient descent (SGD) method to reduce the costs of CP tensor decomposition. Our method achieves robust and fast convergence via a two-stage optimization, i.e., SignSGD followed by mixed-precision SGD. Detailed theoretical analysis is provided to prove the convergence of the proposed mixed-precision algorithm. Numerical experiments on both synthetic and realistic tensor data sets show the superior efficiency of our mixed-precision algorithm compared to full-precision CP decomposition. This work can remarkably reduce the memory, computing and energy cost on resource-constraint edge computing devices. We demonstrate this benefit via an FPGA prototype.

Authors (3)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.