NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework (2111.04130v2)

Published 7 Nov 2021 in cs.CL and cs.LG

Abstract: Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train. We propose a simple and efficient learning framework, TLM, that does not rely on large-scale pretraining. Given some labeled task data and a large general corpus, TLM uses task data as queries to retrieve a tiny subset of the general corpus and jointly optimizes the task objective and the language modeling objective from scratch. On eight classification datasets in four domains, TLM achieves results better than or similar to pretrained language models (e.g., RoBERTa-Large) while reducing the training FLOPs by two orders of magnitude. With high accuracy and efficiency, we hope TLM will contribute to democratizing NLP and expediting its development.
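The abstract describes a two-step recipe: use the labeled task data as queries to retrieve a small, task-relevant slice of the general corpus, then train a model from scratch on the combined data with a joint task-plus-language-modeling objective. The sketch below illustrates only the shape of that recipe; the `rank_bm25` retriever, the toy corpus, and the loss weight `rho` are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the TLM recipe summarized in the abstract (retrieval + joint
# training from scratch). Assumes the rank_bm25 package for sparse retrieval;
# corpus, task data, and the weight `rho` are illustrative placeholders.
from rank_bm25 import BM25Okapi

# A large general corpus (in practice, millions of documents).
general_corpus = [
    "the stock market fell sharply on tuesday",
    "a new vaccine trial reported promising results",
    "the team won the championship after overtime",
]

# Labeled task data: (text, label) pairs.
task_examples = [
    ("shares dropped after the earnings call", "finance"),
    ("clinical study shows the drug is effective", "health"),
]

# 1) Retrieval: task examples act as queries against the general corpus,
#    keeping only a tiny task-relevant subset.
bm25 = BM25Okapi([doc.split() for doc in general_corpus])
retrieved = set()
for text, _label in task_examples:
    retrieved.update(bm25.get_top_n(text.split(), general_corpus, n=2))

print(f"Retrieved {len(retrieved)} of {len(general_corpus)} general documents")

# 2) Joint training from scratch on task data plus the retrieved subset,
#    combining the supervised task loss with a language-modeling loss, e.g.:
#
#      loss = task_loss(task_batch) + rho * lm_loss(task_batch + retrieved_batch)
#
#    where `rho` balances the two objectives. The exact weighting and data
#    mixing schedules differ in the paper; this line only conveys the idea
#    of optimizing both objectives jointly without large-scale pretraining.
```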

Authors (4)
  1. Xingcheng Yao (10 papers)
  2. Yanan Zheng (13 papers)
  3. Xiaocong Yang (6 papers)
  4. Zhilin Yang (50 papers)
Citations (40)