Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
129 tokens/sec
GPT-4o
28 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Signed Binary Weight Networks (2211.13838v3)

Published 25 Nov 2022 in cs.CV, cs.DC, and cs.PF

Abstract: Efficient inference of Deep Neural Networks (DNNs) is essential to making AI ubiquitous. Two important algorithmic techniques have shown promise for enabling efficient inference - sparsity and binarization. These techniques translate into weight sparsity and weight repetition at the hardware-software level enabling the deployment of DNNs with critically low power and latency requirements. We propose a new method called signed-binary networks to improve efficiency further (by exploiting both weight sparsity and weight repetition together) while maintaining similar accuracy. Our method achieves comparable accuracy on ImageNet and CIFAR10 datasets with binary and can lead to 69% sparsity. We observe real speedup when deploying these models on general-purpose devices and show that this high percentage of unstructured sparsity can lead to a further reduction in energy consumption on ASICs.

Citations (1)

Summary

We haven't generated a summary for this paper yet.