BiBERT: Accurate Fully Binarized BERT (2203.06390v1)

Published 12 Mar 2022 in cs.CL

Abstract: The large pre-trained BERT has achieved remarkable performance on NLP tasks but is also computation- and memory-expensive. As a powerful compression approach, binarization drastically reduces computation and memory consumption by utilizing 1-bit parameters and bitwise operations. Unfortunately, the full binarization of BERT (i.e., 1-bit weight, embedding, and activation) usually suffers a significant performance drop, and few studies address this problem. In this paper, with theoretical justification and empirical analysis, we identify that the severe performance drop can be mainly attributed to information degradation and optimization direction mismatch in the forward and backward propagation, respectively, and we propose BiBERT, an accurate fully binarized BERT, to eliminate these performance bottlenecks. Specifically, BiBERT introduces an efficient Bi-Attention structure for maximizing representation information statistically and a Direction-Matching Distillation (DMD) scheme to optimize the fully binarized BERT accurately. Extensive experiments show that BiBERT outperforms both the straightforward baseline and existing state-of-the-art quantized BERTs with ultra-low-bit activations by convincing margins on NLP benchmarks. As the first fully binarized BERT, our method yields impressive 56.3× and 31.2× savings in FLOPs and model size, demonstrating the vast advantages and potential of fully binarized BERT models in real-world resource-constrained scenarios.
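
For context, below is a minimal PyTorch sketch of the kind of 1-bit quantization the abstract refers to: weights and activations are binarized to {-1, +1} in the forward pass and trained with a straight-through estimator (STE) in the backward pass. This is a generic illustration only, not the Bi-Attention or DMD design proposed in the paper; the BinarizeSTE and BinaryLinear names and the per-layer scaling factor are illustrative assumptions.

```python
# Generic sketch of 1-bit (binarized) weights and activations with a
# straight-through estimator (STE). Illustrative only; not BiBERT's method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Binarize to {-1, +1} in the forward pass; approximate the gradient
    with a straight-through estimator in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # STE: pass gradients through, but block them where |x| > 1.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)


class BinaryLinear(nn.Module):
    """Linear layer with 1-bit weights and 1-bit activations.
    A per-layer scaling factor (mean of |W|) partially recovers the
    dynamic range lost by binarization (an assumed, common choice)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        alpha = self.weight.abs().mean()           # per-layer scaling factor
        w_bin = BinarizeSTE.apply(self.weight)     # 1-bit weights
        x_bin = BinarizeSTE.apply(x)               # 1-bit activations
        return F.linear(x_bin, w_bin) * alpha


# Usage: a binarized layer drops in like a standard nn.Linear.
layer = BinaryLinear(768, 768)
out = layer(torch.randn(2, 16, 768))
print(out.shape)  # torch.Size([2, 16, 768])
```

On hardware, the binarized matrix multiply can be realized with XNOR and popcount bitwise operations, which is the source of the FLOPs and model-size savings the abstract reports.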

Authors (8)
  1. Haotong Qin (60 papers)
  2. Yifu Ding (28 papers)
  3. Mingyuan Zhang (41 papers)
  4. Qinghua Yan (3 papers)
  5. Aishan Liu (72 papers)
  6. Qingqing Dang (15 papers)
  7. Ziwei Liu (368 papers)
  8. Xianglong Liu (128 papers)
Citations (78)
