
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order (2004.11579v1)

Published 24 Apr 2020 in cs.CL

Abstract: Masked language model and autoregressive language model are two types of language models. While pretrained masked language models such as BERT overwhelm the line of natural language understanding (NLU) tasks, autoregressive language models such as GPT are especially capable in natural language generation (NLG). In this paper, we propose a probabilistic masking scheme for the masked language model, which we call probabilistically masked language model (PMLM). We implement a specific PMLM with a uniform prior distribution on the masking ratio named u-PMLM. We prove that u-PMLM is equivalent to an autoregressive permutated language model. One main advantage of the model is that it supports text generation in arbitrary order with surprisingly good quality, which could potentially enable new applications over traditional unidirectional generation. Besides, the pretrained u-PMLM also outperforms BERT on a set of downstream NLU tasks.
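
The masking scheme described in the abstract can be illustrated with a minimal sketch. It assumes that the masking ratio is drawn once per sequence from a uniform distribution on [0, 1) and that each token is then masked independently with that probability; the abstract states only that the prior on the masking ratio is uniform, so the per-token step is an assumption. The names `probabilistic_mask`, `MASK`, and the whitespace tokenization are hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def probabilistic_mask(tokens, rng=None):
    """Mask a token sequence with a uniformly sampled masking ratio.

    Assumption: the ratio is sampled once per sequence, then each
    position is masked independently with that probability.
    """
    rng = rng or random.Random()
    ratio = rng.random()  # uniform prior on the masking ratio, in [0, 1)
    masked = []
    targets = {}  # position -> original token the model would predict
    for i, tok in enumerate(tokens):
        if rng.random() < ratio:
            masked.append(MASK)
            targets[i] = tok
        else:
            masked.append(tok)
    return ratio, masked, targets


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    ratio, masked, targets = probabilistic_mask(sentence, random.Random(0))
    print(f"ratio={ratio:.2f}")
    print(" ".join(masked))
```

Because the ratio varies from sequence to sequence, a model trained this way would see inputs ranging from almost fully visible to almost fully masked, in contrast to a fixed masking ratio.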

Citations (37)


Authors (3)