Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A Computing Kernel for Network Binarization on PyTorch (1911.04477v1)

Published 11 Nov 2019 in cs.LG and cs.NE

Abstract: Deep Neural Networks have now achieved state-of-the-art results in a wide range of tasks including image classification, object detection and so on. However, they are both computation consuming and memory intensive, making them difficult to deploy on low-power devices. Network binarization is one of the existing effective techniques for model compression and acceleration, but there is no computing kernel yet to support it on PyTorch. In this paper we developed a computing kernel supporting 1-bit xnor and bitcount computation on PyTorch. Experimental results show that our kernel could accelerate the inference of the binarized neural network by 3 times in GPU and by 4.5 times in CPU compared with the control group.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Xianda Xu (1 paper)
  2. Marco Pedersoli (81 papers)
Citations (3)

Summary

We haven't generated a summary for this paper yet.