Ternary Neural Networks for Resource-Efficient AI Applications (1609.00222v2)

Published 1 Sep 2016 in cs.LG, cs.AI, and cs.NE

Abstract: The computation and storage requirements for Deep Neural Networks (DNNs) are usually high. This issue limits their deployability on ubiquitous computing devices such as smart phones, wearables and autonomous drones. In this paper, we propose ternary neural networks (TNNs) in order to make deep learning more resource-efficient. We train these TNNs using a teacher-student approach based on a novel, layer-wise greedy methodology. Thanks to our two-stage training procedure, the teacher network is still able to use state-of-the-art methods such as dropout and batch normalization to increase accuracy and reduce training time. Using only ternary weights and activations, the student ternary network learns to mimic the behavior of its teacher network without using any multiplication. Unlike its -1,1 binary counterparts, a ternary neural network inherently prunes the smaller weights by setting them to zero during training. This makes them sparser and thus more energy-efficient. We design a purpose-built hardware architecture for TNNs and implement it on FPGA and ASIC. We evaluate TNNs on several benchmark datasets and demonstrate up to 3.1x better energy efficiency with respect to the state of the art while also improving accuracy.

Citations (199)

Summary

  • The paper presents a teacher-student training framework with a layer-wise greedy procedure that converts full-precision weights to ternary values, reducing computation.
  • It achieves up to 3.1× better energy efficiency, up to 2.7× higher throughput, and up to 635× better area efficiency on FPGA/ASIC platforms while reaching 87.89% accuracy on CIFAR-10.
  • The study demonstrates the potential of Ternary Neural Networks for low-power, real-time AI deployment on resource-constrained edge devices.

Overview of Ternary Neural Networks for Resource-Efficient AI Applications

This paper presents a novel approach to enhancing resource efficiency in deep learning applications through the introduction of Ternary Neural Networks (TNNs). Given the growing complexity and resource demands of Deep Neural Networks (DNNs), this work offers a viable strategy to reduce computation and storage requirements, facilitating deployment on low-power devices like smartphones and wearables.

The authors employ a two-stage teacher-student training methodology, with an innovative layer-wise greedy procedure for TNNs. Unlike previous models relying on {-1,1} binary weights, TNNs incorporate zero-valued ternary weights, effectively pruning redundant weights and increasing sparseness. The elimination of multiplication operations through the use of ternary weights and activations ({-1,0,1}) significantly enhances energy efficiency and throughput without compromising accuracy.
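
To make the multiplication-free arithmetic concrete, the following is a minimal NumPy sketch (not taken from the paper's code) of how ternary weights behave: magnitudes below an illustrative threshold `delta` are pruned to zero, and a dot product with the remaining ±1 weights needs only additions and subtractions. The function names and threshold value are assumptions for illustration.

```python
import numpy as np

def ternarize(w, delta):
    """Map real-valued weights to {-1, 0, +1}; magnitudes below delta are pruned to 0."""
    t = np.zeros_like(w, dtype=np.int8)
    t[w > delta] = 1
    t[w < -delta] = -1
    return t

def ternary_dot(x, w_t):
    """Dot product with ternary weights: additions and subtractions only, no multiplications."""
    return x[w_t == 1].sum() - x[w_t == -1].sum()

w = np.array([0.8, -0.05, -0.6, 0.02, 0.3])   # full-precision weights
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])       # activations
w_t = ternarize(w, delta=0.1)                  # -> [ 1,  0, -1,  0,  1]
print(ternary_dot(x, w_t))                     # 1.0 - 3.0 + 5.0 = 3.0
```

The zeroed weights never touch the accumulator at all, which is where the extra sparsity and the energy savings described above come from.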

Key contributions of this paper include:

  • A teacher-student framework for training TNNs that leverages state-of-the-art training methods, such as dropout and batch normalization, to preserve model performance (a rough sketch of the layer-wise mimicking step appears after this list).
  • Development of a specialized hardware architecture, enabling notable energy, area, and throughput efficiencies.
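
The summary above does not detail the training procedure, so the following PyTorch snippet is only a hypothetical sketch of the layer-wise mimicking idea: a student layer constrained to ternary weights is fit to reproduce the activations of the corresponding full-precision teacher layer. Layer sizes, the pruning threshold, the optimizer settings, and the straight-through-style weight update are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

def ternarize(w, delta=0.05):
    """Map weights to {-1, 0, +1}; magnitudes below delta (assumed threshold) become 0."""
    return (w > delta).float() - (w < -delta).float()

# Toy teacher/student layers and random data, for illustration only.
teacher_layer = nn.Linear(16, 8)
student_layer = nn.Linear(16, 8)
batches = [torch.randn(32, 16) for _ in range(100)]

opt = torch.optim.Adam(student_layer.parameters(), lr=1e-3)
mse = nn.MSELoss()

for x in batches:
    with torch.no_grad():
        target = teacher_layer(x)                  # full-precision teacher activations
    full_w = student_layer.weight.data.clone()     # keep a real-valued master copy
    student_layer.weight.data = ternarize(full_w)  # forward pass uses ternary weights
    loss = mse(student_layer(x), target)
    opt.zero_grad()
    loss.backward()
    student_layer.weight.data = full_w             # restore the master copy, then apply
    opt.step()                                     # the gradient update to it
```

Keeping a full-precision master copy of the weights while forwarding with the ternarized copy is a common trick for training quantized networks; the paper's actual two-stage procedure, which also ternarizes activations, may differ in its details.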

Experiments conducted on multiple benchmark datasets demonstrate up to a 3.1× improvement in energy efficiency compared to existing methods, alongside competitive accuracy results. The hardware designed for TNNs, implemented on FPGA and ASIC platforms, further underscores these gains, with detailed performance metrics provided.

Numerical Results and Claims

The presented TNN approach delivers up to 3.1× better energy efficiency and up to 2.7× higher throughput than existing solutions such as the TrueNorth system. Area efficiency is reported to be up to 635× higher. Such significant improvements emphasize the potential for TNNs to transform the deployment of neural networks on resource-constrained devices.

The experimental evaluations report a classification accuracy of 87.89% on the CIFAR-10 dataset, higher than that of several contemporary methods, solidifying the capability of TNNs in practical applications. By retaining competitive accuracy while substantially reducing resource demands, TNNs position themselves as a robust solution for edge computing and energy-sensitive applications.

Implications and Future Prospects

The implications of the efficient TNN framework extend to various domains where AI deployment is typically constrained by power and processing capabilities. Enhancements in energy and throughput efficiency may encourage the development and integration of AI models into new market segments, particularly those focused on IoT devices and real-time data processing with limited hardware support.

Future developments could explore further optimization in the training process or adapt the framework to support other neural architectures beyond CNNs and MLPs. Additionally, while the hardware advancements provide significant efficiency gains, further integration into commercial-grade AI solutions might necessitate adjustments to support broader functionality or interoperability with existing systems.

In conclusion, this paper significantly contributes to resource-efficient AI through the introduction of TNNs, offering a scalable and practical solution that maintains high performance while adhering to the constraints of low-power environments.
