Residual Energy-Based Models for Text (2004.10188v2)

Published 6 Apr 2020 in cs.CL, cs.LG, and stat.ML

Abstract: Current large-scale auto-regressive language models display impressive fluency and can generate convincing text. In this work we start by asking the question: Can the generations of these models be reliably distinguished from real text by statistical discriminators? We find experimentally that the answer is affirmative when we have access to the training data for the model, and guardedly affirmative even if we do not. This suggests that the auto-regressive models can be improved by incorporating the (globally normalized) discriminators into the generative process. We give a formalism for this using the Energy-Based Model framework, and show that it indeed improves the results of the generative models, measured both in terms of perplexity and in terms of human evaluation.
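
The residual formulation described in the abstract defines the joint model as p(x) ∝ p_LM(x) · exp(−E(x)), where p_LM is the pretrained auto-regressive model and E is the learned discriminator's energy. One way to generate from such a model is importance resampling: draw candidate continuations from the base LM, then pick one with probability proportional to exp(−E(x)); the p_LM factors cancel because candidates are drawn from p_LM itself. Below is a minimal sketch of that resampling step, assuming hypothetical `base_lm.sample` and `energy` interfaces (these names are illustrative, not from the paper's released code).

```python
import torch

def residual_ebm_sample(base_lm, energy, prefix, num_candidates=10):
    """Sample from p(x) ∝ p_LM(x) * exp(-E(x)) by importance resampling.

    `base_lm.sample(prefix)` is assumed to return one sampled continuation;
    `energy(x)` is assumed to return a scalar energy E(x), where lower
    energy means the discriminator finds x more "real".
    """
    # Draw candidate continuations from the base autoregressive model.
    candidates = [base_lm.sample(prefix) for _ in range(num_candidates)]

    # Score each candidate with the energy function.
    energies = torch.tensor([float(energy(x)) for x in candidates])

    # Self-normalized importance weights: w_i ∝ exp(-E(x_i)).
    weights = torch.softmax(-energies, dim=0)

    # Resample one candidate according to the weights.
    idx = torch.multinomial(weights, num_samples=1).item()
    return candidates[idx]
```

With a single candidate this reduces to plain sampling from the base LM; as `num_candidates` grows, the resampled output approaches the joint residual distribution, at the cost of more forward passes.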

Authors (6)
  1. Anton Bakhtin (16 papers)
  2. Yuntian Deng (44 papers)
  3. Sam Gross (9 papers)
  4. Myle Ott (33 papers)
  5. Marc'Aurelio Ranzato (53 papers)
  6. Arthur Szlam (86 papers)
Citations (13)