Partial Tensorized Transformers for Natural Language Processing (2310.20077v1)

Published 30 Oct 2023 in cs.CL and cs.LG

Abstract: The transformer architecture has revolutionized NLP and other machine-learning tasks thanks to its unprecedented accuracy. However, transformers' extensive memory and parameter requirements often hinder their practical application. In this work, we study the use of tensor-train decomposition to compress transformer vision and language neural networks, namely BERT and ViT, while improving their accuracy. We focus on both embedding-layer compression and partial tensorization of neural networks (PTNN) through an algorithmic approach. Our novel PTNN approach significantly improves the accuracy of existing models by up to 5%, all without the need for post-training adjustments, breaking new ground in the field of tensor decomposition.
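
As a rough illustration of the embedding-layer compression the abstract describes, the sketch below applies the standard TT-SVD construction to factor a weight matrix into a chain of small three-way cores. The `tt_decompose` helper, the padded 32768 x 768 table shape, the factor dimensions, and `max_rank` are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def tt_decompose(matrix, row_dims, col_dims, max_rank):
    """Factor a (prod(row_dims), prod(col_dims)) matrix into TT cores.

    Each core k has shape (r_{k-1}, row_dims[k] * col_dims[k], r_k),
    with ranks truncated to max_rank by sequential SVDs (TT-SVD).
    """
    d = len(row_dims)
    # View the matrix as a 2d-way tensor and interleave row/column modes
    # so that core k owns the (row_dims[k], col_dims[k]) pair.
    tensor = matrix.reshape(*row_dims, *col_dims)
    perm = [ax for pair in zip(range(d), range(d, 2 * d)) for ax in pair]
    tensor = tensor.transpose(perm).reshape([m * n for m, n in zip(row_dims, col_dims)])

    cores, rank = [], 1
    for k in range(d - 1):
        unfolding = tensor.reshape(rank * row_dims[k] * col_dims[k], -1)
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, row_dims[k] * col_dims[k], r))
        tensor = np.diag(s[:r]) @ vt[:r]   # carry the remainder forward
        rank = r
    cores.append(tensor.reshape(rank, row_dims[-1] * col_dims[-1], 1))
    return cores

# Example: compress a vocabulary embedding padded to 32768 x 768.
emb = np.random.randn(32768, 768).astype(np.float32)
cores = tt_decompose(emb, row_dims=(32, 32, 32), col_dims=(8, 8, 12), max_rank=16)
print(sum(c.size for c in cores), "parameters vs", emb.size)
```

With these assumed settings the three cores store about 76 K values in place of the dense table's roughly 25 M, which is the scale of compression that makes rank selection versus accuracy the central trade-off.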

Authors (2)
  1. Subhadra Vadlamannati (1 paper)
  2. Ryan Solgi (4 papers)

Summary

The paper studies tensor-train (TT) decomposition as a way to compress transformer networks in vision and language, using BERT and ViT as test cases. Instead of tensorizing the whole model, the authors propose partial tensorization of neural networks (PTNN), an algorithmic approach that targets the embedding layer and selected parts of the network. They report accuracy improvements of up to 5% over existing models, achieved without any post-training adjustments.
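
To make the "partial" in PTNN concrete, here is a hedged PyTorch sketch (assumed code, not the paper's): a linear layer that stores its weight as TT cores, which could be swapped in for a chosen subset of a model's dense layers while the rest stay untouched. `TTLinear`, its dimension choices, and the initialization scale are all assumptions.

```python
import math
import torch
import torch.nn as nn

class TTLinear(nn.Module):
    """Linear layer whose weight is stored as a train of small TT cores.

    For clarity the forward pass reconstructs the dense weight; an
    efficient implementation would contract the input with the cores
    directly and never materialize the full matrix.
    """

    def __init__(self, row_dims, col_dims, rank):
        super().__init__()
        self.row_dims, self.col_dims = row_dims, col_dims
        d = len(row_dims)
        ranks = [1] + [rank] * (d - 1) + [1]
        self.cores = nn.ParameterList([
            nn.Parameter(0.02 * torch.randn(ranks[k], row_dims[k] * col_dims[k], ranks[k + 1]))
            for k in range(d)
        ])

    def full_weight(self):
        # Contract cores left to right into shape (1, prod(m_k * n_k), 1).
        w = self.cores[0]
        for k in range(1, len(self.cores)):
            w = torch.einsum("aib,bjc->aijc", w, self.cores[k])
            w = w.reshape(1, -1, w.shape[-1])
        # Un-interleave (m_1, n_1, ..., m_d, n_d) into a dense matrix.
        d = len(self.row_dims)
        interleaved = [x for mn in zip(self.row_dims, self.col_dims) for x in mn]
        w = w.reshape(interleaved)
        perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
        return w.permute(perm).reshape(math.prod(self.row_dims), -1)

    def forward(self, x):
        return x @ self.full_weight().t()

# Partial tensorization: swap a TT layer in for one dense 768 -> 3072
# projection while the rest of the model keeps its dense layers.
tt_layer = TTLinear(row_dims=(8, 8, 48), col_dims=(8, 8, 12), rank=16)
x = torch.randn(4, 768)              # 8 * 8 * 12 = 768 input features
print(tt_layer(x).shape)             # torch.Size([4, 3072])
```

In a PTNN-style setup one would presumably tensorize only the layers where compression costs the least accuracy (the abstract highlights the embedding layer) and initialize the cores from the pretrained dense weights; the abstract reports that no post-training adjustments were needed.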