
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models (2205.15223v3)

Published 30 May 2022 in cs.CL and cs.LG

Abstract: Pre-trained masked language models successfully perform few-shot learning by formulating downstream tasks as text infilling. However, as a strong alternative in full-shot settings, discriminative pre-trained models like ELECTRA do not fit into the paradigm. In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models on a wide range of tasks. ELECTRA is pre-trained to distinguish whether a token is generated or original. We naturally extend this to prompt-based few-shot learning by training the model to score the originality of the target options, without introducing new parameters. Our method can easily be adapted to tasks involving multi-token predictions without extra computational overhead. Analysis shows that ELECTRA learns distributions that align better with downstream tasks.
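The scoring rule the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: `discriminator_probs` stands in for the per-token P(original) values that ELECTRA's discriminator head would produce for a prompt with each candidate option filled in, and the function names here are hypothetical.

```python
import math

def option_score(original_probs):
    """Average log P(original) over an option's tokens. Averaging lets
    multi-token options be scored with no extra parameters, in the spirit
    of the method described in the abstract."""
    return sum(math.log(p) for p in original_probs) / len(original_probs)

def pick_option(discriminator_probs):
    """Return the option whose tokens the discriminator deems most original."""
    return max(discriminator_probs,
               key=lambda name: option_score(discriminator_probs[name]))

# Toy example: assumed per-token P(original) for each label word
# inserted into the prompt (values are illustrative, not from the paper).
scores = {
    "great": [0.92],           # single-token option
    "terrible": [0.30, 0.41],  # multi-token option, averaged over its tokens
}
print(pick_option(scores))  # → great
```

In a real pipeline, the probabilities would come from a discriminator such as Hugging Face's `ElectraForPreTraining`, whose token-level logits indicate whether each token is original or replaced.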

Authors (5)
  1. Mengzhou Xia (34 papers)
  2. Mikel Artetxe (52 papers)
  3. Jingfei Du (16 papers)
  4. Danqi Chen (84 papers)
  5. Ves Stoyanov (15 papers)
Citations (4)