
Sparse*BERT: Sparse Models Generalize To New tasks and Domains (2205.12452v3)

Published 25 May 2022 in cs.CL and cs.AI

Abstract: LLMs have become the core architecture upon which most modern NLP systems build. These models can consistently deliver impressive accuracy and robustness across tasks and domains, but their high computational overhead can make inference difficult and expensive. To make using these models less costly, recent work has explored leveraging structured and unstructured pruning, quantization, and distillation to improve inference speed and decrease size. This paper studies how models pruned using Gradual Unstructured Magnitude Pruning can transfer between domains and tasks. Our experimentation shows that models that are pruned during pretraining using general-domain masked language modeling can transfer to novel domains and tasks without extensive hyperparameter exploration or specialized approaches. We demonstrate that our general sparse model Sparse*BERT can become SparseBioBERT simply by pretraining the compressed architecture on unstructured biomedical text. Moreover, we show that SparseBioBERT can match the quality of BioBERT with only 10% of the parameters.
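For readers unfamiliar with the pruning method named in the abstract, the sketch below illustrates the general idea behind gradual unstructured magnitude pruning: sparsity is ramped up over training (here with the commonly used cubic schedule of Zhu & Gupta, 2017), and at each pruning step the smallest-magnitude weights are zeroed out. The function names, schedule parameters, and toy layer are illustrative assumptions, not taken from the paper's implementation.

```python
# Minimal sketch of gradual unstructured magnitude pruning (GMP), assuming a
# cubic sparsity schedule. Helper names (`current_sparsity`, `prune_step`) are
# hypothetical and chosen only for illustration.
import torch
import torch.nn as nn


def current_sparsity(step: int, start: int, end: int,
                     final_sparsity: float = 0.9) -> float:
    """Cubic schedule: sparsity ramps from 0 to `final_sparsity` between
    `start` and `end` training steps, then stays constant."""
    if step <= start:
        return 0.0
    if step >= end:
        return final_sparsity
    progress = (step - start) / (end - start)
    return final_sparsity * (1.0 - (1.0 - progress) ** 3)


def prune_step(module: nn.Linear, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude weights so that roughly `sparsity`
    of the entries are exactly zero; returns the binary mask applied."""
    with torch.no_grad():
        flat = module.weight.abs().flatten()
        k = int(sparsity * flat.numel())
        if k == 0:
            return torch.ones_like(module.weight)
        threshold = torch.kthvalue(flat, k).values
        mask = (module.weight.abs() > threshold).float()
        module.weight.mul_(mask)  # in practice the mask is re-applied after every optimizer step
        return mask


# Toy usage: prune a single linear layer over 1000 (pretend) training steps.
layer = nn.Linear(768, 768)
for step in range(0, 1001, 100):
    s = current_sparsity(step, start=100, end=900, final_sparsity=0.9)
    mask = prune_step(layer, s)
    print(f"step {step:4d}  target sparsity {s:.2f}  "
          f"actual {1 - mask.mean().item():.2f}")
```

In the paper's setting this pruning is applied during general-domain masked-language-model pretraining, so the learned sparsity mask is fixed before the model is adapted to new domains such as biomedical text.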

Authors (5)
  1. Daniel Campos (62 papers)
  2. Alexandre Marques (6 papers)
  3. Tuan Nguyen (41 papers)
  4. Mark Kurtz (6 papers)
  5. ChengXiang Zhai (64 papers)
Citations (1)