Deep Tensor Network (2311.11091v2)
Abstract: We introduce the Deep Tensor Network, a novel framework that integrates tensor-based operations into the attention mechanism, thereby enhancing both the expressivity and computational efficiency of deep neural networks. Our approach leverages the algebraic structure of tensor products to generalize the conventional dot-product attention and to formulate new operators, namely, Tensor Attention and Tensor Interaction, which capture higher-order token dependencies. Through rigorous theoretical analysis based on the universal properties of tensor products, we demonstrate that our framework not only improves efficiency by reducing computational complexity but also offers a principled method for modeling complex interactions in sequential data. Empirical evaluations further substantiate that the proposed deep tensor network can serve as a robust building block for advancing state-of-the-art performance in various deep learning tasks.
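To make the phrase "higher-order token dependencies" concrete, the sketch below contrasts conventional scaled dot-product attention, which scores pairs of tokens, with a generic third-order score that rates each query against pairs of keys. This is an illustration only: the abstract does not define Tensor Attention or Tensor Interaction, so `trilinear_attention`, its arguments (`K1`, `K2`, `V1`, `V2`), and the choice of an elementwise product for pair values are all assumptions, not the paper's operators. The naive triple loop is also cubic in sequence length, so it says nothing about the efficiency claims.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(Q, K, V):
    """Conventional scaled dot-product attention (pairwise scores).

    Q, K, V are lists of equal-width float vectors.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        # One score per key: <q, k> / sqrt(d)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wj * v[c] for wj, v in zip(w, V))
                    for c in range(len(V[0]))])
    return out


def trilinear_attention(Q, K1, K2, V1, V2):
    """Hypothetical third-order attention (NOT the paper's definition).

    score(i, j, k) = sum_c Q[i][c] * K1[j][c] * K2[k][c]
    The softmax runs over all (j, k) key pairs, and each pair's value is
    assumed to be the elementwise product V1[j] * V2[k].
    """
    out = []
    for q in Q:
        scores, values = [], []
        for k1, v1 in zip(K1, V1):
            for k2, v2 in zip(K2, V2):
                scores.append(sum(a * b * c for a, b, c in zip(q, k1, k2)))
                values.append([x * y for x, y in zip(v1, v2)])
        w = softmax(scores)
        out.append([sum(wi * v[c] for wi, v in zip(w, values))
                    for c in range(len(values[0]))])
    return out
```

When all keys are zero, every score is equal, so both functions reduce to a plain average of their values; this makes a convenient sanity check for either implementation.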