Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order (2004.11579v1)

Published 24 Apr 2020 in cs.CL

Abstract: Masked language models and autoregressive language models are two types of language models. While pretrained masked language models such as BERT dominate natural language understanding (NLU) tasks, autoregressive language models such as GPT are especially capable in natural language generation (NLG). In this paper, we propose a probabilistic masking scheme for the masked language model, which we call the probabilistically masked language model (PMLM). We implement a specific PMLM, named u-PMLM, with a uniform prior distribution on the masking ratio. We prove that u-PMLM is equivalent to an autoregressive permutated language model. One main advantage of the model is that it supports text generation in arbitrary order with surprisingly good quality, which could enable new applications beyond traditional unidirectional generation. In addition, the pretrained u-PMLM outperforms BERT on a set of downstream NLU tasks.
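The masking scheme described in the abstract can be illustrated with a minimal sketch. It assumes that the masking ratio is drawn once per sequence from a uniform distribution and that each token is then masked independently with that probability; the abstract itself only states that the prior on the masking ratio is uniform. The names `MASK`, `probabilistic_mask`, and `generation_order` are hypothetical and chosen for illustration only. No model is involved here: the sketch only produces masked inputs and a random position order of the kind an arbitrary-order generator would fill in.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, used for illustration only


def probabilistic_mask(tokens, rng):
    """Mask a token sequence with a ratio drawn from a uniform prior.

    Assumption: the ratio is sampled once per sequence and each token
    is masked independently with that probability.
    """
    ratio = rng.random()  # r ~ Uniform[0, 1)
    masked = [MASK if rng.random() < ratio else tok for tok in tokens]
    return masked, ratio


def generation_order(length, rng):
    """Return a random permutation of positions 0..length-1.

    An arbitrary-order generator would predict the token at each
    position in this order, conditioning on the positions filled so far.
    """
    order = list(range(length))
    rng.shuffle(order)
    return order


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tokens = "the quick brown fox jumps".split()
masked, ratio = probabilistic_mask(tokens, rng)
order = generation_order(len(tokens), rng)
print(masked, ratio, order)
```

Because the ratio varies between 0 and 1 across training sequences, the inputs range from almost fully visible to almost fully masked, in contrast to a scheme with one fixed masking ratio.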

Authors (3)

Yi Liao
Xin Jiang
Qun Liu

Citations (37)
