
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order (2004.11579v1)

Published 24 Apr 2020 in cs.CL

Abstract: Masked language models and autoregressive language models are two major types of language models. While pretrained masked language models such as BERT dominate natural language understanding (NLU) tasks, autoregressive language models such as GPT are especially capable in natural language generation (NLG). In this paper, we propose a probabilistic masking scheme for the masked language model, which we call the probabilistically masked language model (PMLM). We implement a specific PMLM with a uniform prior distribution on the masking ratio, named u-PMLM. We prove that u-PMLM is equivalent to an autoregressive permutated language model. One main advantage of the model is that it supports text generation in arbitrary order with surprisingly good quality, which could potentially enable new applications beyond traditional unidirectional generation. Besides, the pretrained u-PMLM also outperforms BERT on a set of downstream NLU tasks.
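
The abstract describes u-PMLM only at a high level. The sketch below illustrates one plausible reading of the scheme: draw a masking ratio from a uniform prior, mask that fraction of positions, and then fill the masked positions one at a time in an arbitrary order. The `MASK_ID` value, the `predict_token` callback, and the helper names are illustrative assumptions, not the authors' released implementation.

```python
import random

MASK_ID = 103  # BERT-style [MASK] token id (illustrative assumption)

def probabilistic_mask(token_ids):
    """Mask tokens using a masking ratio drawn from a Uniform(0, 1) prior."""
    n = len(token_ids)
    ratio = random.uniform(0.0, 1.0)           # uniform prior on the masking ratio
    num_masked = max(1, round(ratio * n))      # always mask at least one position
    positions = random.sample(range(n), num_masked)
    masked = list(token_ids)
    for pos in positions:
        masked[pos] = MASK_ID
    return masked, positions

def generate_arbitrary_order(masked_ids, positions, predict_token):
    """Fill masked positions one at a time in an arbitrary (here random) order.

    `predict_token(sequence, position)` stands in for a trained PMLM that
    returns a token id for `position` given the current partial sequence.
    """
    order = random.sample(positions, len(positions))  # any permutation works
    seq = list(masked_ids)
    for pos in order:
        seq[pos] = predict_token(seq, pos)
    return seq

# Toy usage with a stand-in predictor that always returns a fixed token id.
tokens = [7, 42, 5, 99, 13]
masked, positions = probabilistic_mask(tokens)
filled = generate_arbitrary_order(masked, positions, lambda seq, pos: 1)
```

Because the masking ratio is sampled rather than fixed, training covers contexts ranging from almost fully observed to almost fully masked, which is what allows the model to complete sequences in any generation order at inference time.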

Authors (3)
  1. Yi Liao (87 papers)
  2. Xin Jiang (242 papers)
  3. Qun Liu (230 papers)
Citations (37)
