QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks (2311.17956v1)

Published 29 Nov 2023 in cs.LG, cs.CV, and cs.NE

Abstract: Recent progress in computer vision-oriented neural network designs is mostly driven by capturing high-order neural interactions among inputs and features. And there emerged a variety of approaches to accomplish this, such as Transformers and its variants. However, these interactions generate a large amount of intermediate state and/or strong data dependency, leading to considerable memory consumption and computing cost, and therefore compromising the overall runtime performance. To address this challenge, we rethink the high-order interactive neural network design with a quadratic computing approach. Specifically, we propose QuadraNet -- a comprehensive model design methodology from neuron reconstruction to structural block and eventually to the overall neural network implementation. Leveraging quadratic neurons' intrinsic high-order advantages and dedicated computation optimization schemes, QuadraNet could effectively achieve optimal cognition and computation performance. Incorporating state-of-the-art hardware-aware neural architecture search and system integration techniques, QuadraNet could also be well generalized in different hardware constraint settings and deployment scenarios. The experiment shows that QuadraNet achieves up to 1.5$\times$ throughput, 30% less memory footprint, and similar cognition performance, compared with the state-of-the-art high-order approaches.


Summary

  • The paper introduces quadratic neurons and QuadraBlock to efficiently capture high-order interactions while reducing memory overhead.
  • The paper demonstrates a hardware-aware neural architecture search that tailors the model for various devices, improving throughput and latency.
  • The paper’s ablation study confirms that factorizing quadratic computations maintains competitive accuracy with a reduced parameter space.

Introduction to the Quadratic Neural Network

Computer vision has been reshaped by neural network designs that capture high-order interactions among inputs and features, most prominently Transformers. These interaction mechanisms, however, carry significant computational costs: self-attention in particular generates large intermediate states and strong data dependencies, leading to high memory consumption and reduced runtime performance. This paper introduces QuadraNet, a novel architecture that embeds high-order interactions at the neuron level through quadratic computation, aiming to balance cognition performance with computational efficiency.

A New Approach to High-Order Neuronal Interaction

The core innovation of QuadraNet lies in its quadratic neurons, which are inherently suited to high-order information processing. With simplified computation patterns, these neurons self-reinforce information across multiple feature dimensions and thereby avoid generating the large intermediate states that burden conventional self-attention-based networks.

QuadraNet's design methodology starts from individual quadratic neurons which, through factorization, significantly reduce parameter space and computational complexity. These neurons are then mapped onto convolution operations to achieve efficient high-order interactions without accruing massive intermediate states or introducing recursive data dependencies.
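
To make this concrete, below is a minimal sketch of how a factorized quadratic neuron might be mapped onto depthwise convolutions in PyTorch. The class name, layer choices, and kernel size are illustrative assumptions rather than the paper's exact formulation; the point is that the second-order term comes from two thin linear maps multiplied element-wise, so no large quadratic weight matrix or intermediate state is materialized.

```python
import torch
import torch.nn as nn


class QuadraticConv2d(nn.Module):
    """Sketch of a factorized quadratic neuron mapped onto convolutions.

    Rather than materializing a full quadratic weight matrix per output,
    two thin linear maps are applied and their responses multiplied
    element-wise, capturing the second-order term cheaply. Names and the
    depthwise-convolution choice are illustrative assumptions.
    """

    def __init__(self, channels, kernel_size=7):
        super().__init__()
        padding = kernel_size // 2
        # Depthwise convolutions keep the per-neuron cost linear in channels.
        self.conv_a = nn.Conv2d(channels, channels, kernel_size,
                                padding=padding, groups=channels)
        self.conv_b = nn.Conv2d(channels, channels, kernel_size,
                                padding=padding, groups=channels)
        self.conv_c = nn.Conv2d(channels, channels, kernel_size,
                                padding=padding, groups=channels)

    def forward(self, x):
        # Quadratic (second-order) term plus the usual first-order term.
        return self.conv_a(x) * self.conv_b(x) + self.conv_c(x)


if __name__ == "__main__":
    x = torch.randn(1, 64, 56, 56)
    print(QuadraticConv2d(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```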

QuadraBlock: The Building Block of QuadraNet

These architectural advances are encapsulated in the QuadraBlock. The block integrates quadratic neurons into a cohesive structure that performs spatial neural interaction and handles information transformation across channels. QuadraBlock is designed to retain computational efficiency while maximizing the cognitive capacity of quadratic neurons, for instance by approximating a high-rank quadratic weight matrix within the block.
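
The sketch below shows one plausible way such a block could be organized, assuming the residual layout common to ConvNeXt- and Transformer-style blocks and reusing the QuadraticConv2d sketch above; it is not claimed to match the paper's exact QuadraBlock.

```python
import torch.nn as nn


class QuadraBlock(nn.Module):
    """Illustrative block: quadratic spatial interaction followed by a
    pointwise channel MLP, in a residual layout. A sketch under those
    assumptions, not the paper's exact block design."""

    def __init__(self, channels, kernel_size=7, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.BatchNorm2d(channels)
        # High-order spatial interaction via the quadratic-neuron sketch above.
        self.spatial = QuadraticConv2d(channels, kernel_size)
        self.norm2 = nn.BatchNorm2d(channels)
        hidden = channels * mlp_ratio
        # Pointwise MLP handles information transformation across channels.
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )

    def forward(self, x):
        x = x + self.spatial(self.norm1(x))        # spatial feature mixing
        x = x + self.channel_mlp(self.norm2(x))    # channel mixing
        return x
```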

Furthermore, QuadraNet adopts a pyramid architecture, standard for modern CNNs and Transformer-like models: successive stages increase channel width and decrease spatial resolution, which is essential for capturing a hierarchy of features from simple to complex.
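
As an illustration of this pyramid layout, the hypothetical configuration below stacks QuadraBlocks (from the sketch above) into four stages with increasing channels and decreasing resolution; the stage widths and depths are placeholder values, not the paper's searched architectures.

```python
import torch
import torch.nn as nn

# Hypothetical four-stage configuration: each stage widens the channels and
# halves the spatial resolution, as in typical hierarchical backbones.
STAGE_CHANNELS = (64, 128, 256, 512)
STAGE_DEPTHS = (2, 2, 6, 2)


class TinyQuadraNet(nn.Module):
    """Pyramid backbone assembled from the QuadraBlock sketch above."""

    def __init__(self, in_chans=3, num_classes=1000):
        super().__init__()
        stages, prev = [], in_chans
        for i, (c, d) in enumerate(zip(STAGE_CHANNELS, STAGE_DEPTHS)):
            # Patchify stem for stage 0, then 2x downsampling between stages.
            stride = 4 if i == 0 else 2
            down = nn.Conv2d(prev, c, kernel_size=stride, stride=stride)
            stages.append(nn.Sequential(down, *[QuadraBlock(c) for _ in range(d)]))
            prev = c
        self.stages = nn.Sequential(*stages)
        self.head = nn.Linear(STAGE_CHANNELS[-1], num_classes)

    def forward(self, x):
        x = self.stages(x)
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)


if __name__ == "__main__":
    logits = TinyQuadraNet()(torch.randn(1, 3, 224, 224))
    print(logits.shape)  # torch.Size([1, 1000])
```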

Hardware-Aware Optimization and Generalization

One challenge in deploying AI models on diverse devices is that hardware constraints vary widely and can greatly impact performance. To address this, the researchers extend QuadraNet's design methodology with a hardware-aware neural architecture search (NAS). Through this approach, the model's architecture is tailored to the computational constraints of specific hardware, optimizing not just for accuracy but also for performance metrics such as latency and memory footprint.
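
A minimal sketch of what such a latency-constrained search loop could look like is given below. The search space, the latency predictor, and the accuracy estimator are all stand-ins (the paper relies on established hardware-aware NAS and latency-prediction techniques), so every name and value here is hypothetical.

```python
import random

# Hypothetical search space; the real one would cover the supernet's choices.
SEARCH_SPACE = {
    "kernel_size": [3, 5, 7],
    "stage_depths": [(2, 2, 4, 2), (2, 2, 6, 2), (3, 3, 9, 3)],
    "widths": [(48, 96, 192, 384), (64, 128, 256, 512)],
}


def sample_architecture(space=SEARCH_SPACE):
    return {k: random.choice(v) for k, v in space.items()}


def hardware_aware_search(predict_latency, estimate_accuracy,
                          latency_budget_ms, num_trials=100):
    """Random search constrained by a per-device latency budget."""
    best_arch, best_acc = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture()
        if predict_latency(arch) > latency_budget_ms:  # reject over-budget candidates
            continue
        acc = estimate_accuracy(arch)                  # e.g. supernet validation accuracy
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch, best_acc


if __name__ == "__main__":
    # Toy stand-in cost models for demonstration only.
    fake_latency = lambda a: sum(a["stage_depths"]) * a["kernel_size"] * 0.4
    fake_accuracy = lambda a: sum(a["widths"]) / 1000 - 0.01 * a["kernel_size"]
    print(hardware_aware_search(fake_latency, fake_accuracy, latency_budget_ms=25))
```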

Benchmarked against state-of-the-art high-order models, QuadraNet delivers higher throughput (up to 1.5x) and a smaller memory footprint (about 30% less) while maintaining competitive accuracy. The paper concludes with an ablation study that evaluates aspects of QuadraNet such as the benefits of quadratic neurons and the impact of receptive field size, leading to principles that guide efficient design within high-order interaction spaces.

In summary, QuadraNet paves the way toward more efficient neural network designs that can effectively capture high-order interactions while remaining sensitive to the operational constraints of varying computational environments.
