TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings (2402.19097v3)
Abstract: This paper presents the Text Encoding Diffusion Model (TEncDM), a novel approach to diffusion modeling that operates in the space of pre-trained language model encodings. In contrast to traditionally used embeddings, encodings integrate contextual information. In our approach, we also employ a transformer-based decoder, specifically designed to incorporate context in the token prediction process. We conduct a comprehensive examination of the influence of the encoder, decoder, noise scheduler, and self-conditioning on zero-shot generation. Furthermore, we compare TEncDM with previous approaches on three conditional text generation tasks: paraphrase generation (QQP), summarization (XSum), and text simplification (Wiki-Auto). The results show that TEncDM exhibits superior performance compared to existing non-autoregressive diffusion models. Our code is available at https://github.com/M0RJIQUE/tencdm.
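The pipeline the abstract describes (frozen encoder producing contextual latents, a forward noising process governed by a scheduler, and an x0-predicting denoiser with self-conditioning) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the `encode` stub, the cosine schedule, and the trivial `denoiser` mixer are all placeholder assumptions standing in for a real pre-trained LM encoder, TEncDM's actual schedule, and a transformer denoiser.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, DIM = 8, 16

def encode(token_ids):
    """Stand-in for a frozen pre-trained LM encoder: a fixed lookup table.
    A real encoder would return context-dependent vectors per token."""
    table = np.random.default_rng(42).standard_normal((100, DIM))
    return table[token_ids]  # (SEQ_LEN, DIM)

def alpha_bar(t):
    """Cosine-style noise schedule, t in [0, 1] (an assumed choice)."""
    return np.cos(0.5 * np.pi * t) ** 2

def q_sample(x0, t, noise):
    """Forward diffusion: z_t = sqrt(ab) * x0 + sqrt(1 - ab) * eps."""
    ab = alpha_bar(t)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def denoiser(z_t, t, x0_prev):
    """Toy x0-prediction 'network' taking a self-conditioning input x0_prev.
    A real model would be a transformer conditioned on t."""
    return 0.5 * z_t + 0.5 * x0_prev

# One training step (loss only; no gradients in this toy).
x0 = encode(rng.integers(0, 100, size=SEQ_LEN))   # clean encodings
t = rng.uniform()                                  # random timestep
eps = rng.standard_normal(x0.shape)                # Gaussian noise
z_t = q_sample(x0, t, eps)                         # noised latents

# Self-conditioning: first pass receives zeros, second pass reuses the estimate.
x0_hat = denoiser(z_t, t, np.zeros_like(x0))
x0_hat = denoiser(z_t, t, x0_hat)

loss = np.mean((x0_hat - x0) ** 2)  # x0-prediction objective
```

At inference, the estimate from the denoiser would be fed to the trained decoder, which maps latents back to tokens while attending to neighboring positions.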
- Structured denoising diffusion models in discrete state-spaces. In Advances in Neural Information Processing Systems, volume 34, pages 17981–17993. Curran Associates, Inc.
- Improving image generation with better captions.
- Stable video diffusion: Scaling latent video diffusion models to large datasets.
- Analog bits: Generating discrete data using diffusion models with self-conditioning.
- Quora question pairs.
- BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
- Fast timing-conditioned latent audio diffusion.
- Difformer: Empowering diffusion models on the embedding space for text generation.
- Mask-Predict: Parallel decoding of conditional masked language models. arXiv preprint arXiv:1904.09324.
- DiffuSeq: Sequence to sequence text generation with diffusion models. In The Eleventh International Conference on Learning Representations.
- Non-autoregressive neural machine translation. arXiv preprint arXiv:1711.02281.
- Levenshtein transformer. Advances in Neural Information Processing Systems, 32.
- SSD-LM: Semi-autoregressive simplex-based diffusion language model for text generation and modular control. arXiv preprint arXiv:2210.17432.
- Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851.
- Simple diffusion: end-to-end diffusion for high resolution images. In Proceedings of the 40th International Conference on Machine Learning, ICML’23. OpenReview.net.
- Argmax flows and multinomial diffusion: Learning categorical distributions. In Advances in Neural Information Processing Systems, volume 34, pages 12454–12465. Curran Associates, Inc.
- Shankar Kumar and Bill Byrne. 2004. Minimum Bayes-risk decoding for statistical machine translation. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004, pages 169–176.
- Deterministic non-autoregressive neural sequence modeling by iterative refinement. arXiv preprint arXiv:1802.06901.
- Diffusion-LM improves controllable text generation. arXiv preprint arXiv:2205.14217.
- Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81.
- Text generation with diffusion language models: a pre-training approach with continuous paragraph denoise. In Proceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org.
- Latent diffusion for language generation. arXiv preprint arXiv:2212.09462.
- On distillation of guided diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14297–14306.
- A corpus and cloze evaluation for deeper understanding of commonsense stories. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 839–849, San Diego, California. Association for Computational Linguistics.
- Don’t give me the details, just the summary! topic-aware convolutional neural networks for extreme summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1797–1807, Brussels, Belgium. Association for Computational Linguistics.
- OpenAI. 2023. GPT-4 technical report. arXiv preprint arXiv:2303.08774.
- MAUVE: Measuring the gap between neural text and human text using divergence frontiers. In Advances in Neural Information Processing Systems, volume 34, pages 4816–4828. Curran Associates, Inc.
- BANG: Bridging autoregressive and non-autoregressive generation with large scale pretraining. In International Conference on Machine Learning, pages 8630–8639. PMLR.
- Language models are unsupervised multitask learners.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
- High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695.
- Score-based generative modeling through stochastic differential equations. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
- Self-conditioned embedding diffusion for text generation.
- A contrastive framework for neural text generation. In Advances in Neural Information Processing Systems.
- Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
- Attention is all you need. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.
- AR-Diffusion: Auto-regressive diffusion model for text generation. In Thirty-seventh Conference on Neural Information Processing Systems.
- DINOISER: Diffused conditional sequence learning by manipulating noises. arXiv preprint arXiv:2302.10025.
- SeqDiffuSeq: Text diffusion with encoder-decoder transformers. arXiv preprint arXiv:2212.10325.
- BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675.
- PLANNER: Generating diversified paragraph via latent language diffusion model. In Thirty-seventh Conference on Neural Information Processing Systems.