
Improving the Diversity of Unsupervised Paraphrasing with Embedding Outputs (2110.13231v1)

Published 25 Oct 2021 in cs.CL

Abstract: We present a novel technique for zero-shot paraphrase generation. The key contribution is an end-to-end multilingual paraphrasing model that is trained using translated parallel corpora to generate paraphrases into "meaning spaces" -- replacing the final softmax layer with word embeddings. This architectural modification, plus a training procedure that incorporates an autoencoding objective, enables effective parameter sharing across languages for more fluent monolingual rewriting, and facilitates fluency and diversity in generation. Our continuous-output paraphrase generation models outperform zero-shot paraphrasing baselines when evaluated on two languages using a battery of computational metrics as well as in human assessment.
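
To make the "continuous outputs" idea concrete, below is a minimal sketch of a softmax-free decoder head of the kind the abstract describes: the decoder projects its hidden state into a pretrained word-embedding space, is trained to land near the gold word's embedding, and decodes by nearest-neighbour lookup instead of an argmax over softmax logits. This is an illustrative reconstruction, not the authors' implementation; the class and variable names (ContinuousOutputHead, proj, etc.) are assumptions, the cosine-distance loss stands in for whatever distance the paper actually uses (continuous-output models are often trained with a von Mises-Fisher likelihood instead), and the multilingual encoder-decoder and autoencoding objective are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousOutputHead(nn.Module):
    """Hypothetical continuous-output head: predicts a point in a frozen
    pretrained word-embedding space rather than a softmax distribution."""

    def __init__(self, hidden_dim: int, pretrained_embeddings: torch.Tensor):
        super().__init__()
        embed_dim = pretrained_embeddings.size(-1)
        self.proj = nn.Linear(hidden_dim, embed_dim)
        # Frozen, unit-normalized target embeddings (vocab_size x embed_dim).
        self.register_buffer("embeddings",
                             F.normalize(pretrained_embeddings, dim=-1))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Map decoder states to unit vectors in the embedding ("meaning") space.
        return F.normalize(self.proj(hidden), dim=-1)

    def loss(self, hidden: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
        # Cosine distance to the gold word's embedding (an assumed stand-in
        # for the paper's actual training objective).
        pred = self.forward(hidden)
        gold = self.embeddings[target_ids]
        return (1.0 - (pred * gold).sum(dim=-1)).mean()

    @torch.no_grad()
    def decode(self, hidden: torch.Tensor) -> torch.Tensor:
        # Inference: nearest neighbour in the embedding table replaces the
        # usual argmax over vocabulary logits.
        pred = self.forward(hidden)
        return (pred @ self.embeddings.T).argmax(dim=-1)

# Toy usage with random weights, just to show the shapes involved.
vocab_size, embed_dim, hidden_dim = 32000, 300, 512
head = ContinuousOutputHead(hidden_dim, torch.randn(vocab_size, embed_dim))
states = torch.randn(2, 7, hidden_dim)                 # (batch, seq, hidden)
print(head.loss(states, torch.randint(0, vocab_size, (2, 7))))
print(head.decode(states).shape)                       # token ids: (2, 7)
```

One consequence of this design, which the abstract exploits, is that the output layer no longer ties the model to a single vocabulary's softmax, so parameters can be shared across languages and decoding reduces to a nearest-neighbour search in the shared embedding space.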

Authors (4)
  1. Monisha Jegadeesan (1 paper)
  2. Sachin Kumar (68 papers)
  3. John Wieting (40 papers)
  4. Yulia Tsvetkov (142 papers)
Citations (1)
