
CEV-LM: Controlled Edit Vector Language Model for Shaping Natural Language Generations (2402.14290v1)

Published 22 Feb 2024 in cs.CL and cs.LG

Abstract: As large-scale LLMs become the standard for text generation, there is a greater need to tailor generations to be more or less concise, targeted, and informative, depending on the audience and application. Existing control approaches primarily adjust the semantic (e.g., emotion, topics), structural (e.g., syntax tree, parts-of-speech), and lexical (e.g., keyword/phrase inclusion) properties of text, but are insufficient to accomplish complex objectives such as pacing, which controls the complexity and readability of the text. In this paper, we introduce CEV-LM, a lightweight, semi-autoregressive LLM that uses constrained edit vectors to control three complementary metrics (speed, volume, and circuitousness) that quantify the shape of text (e.g., pacing of content). We study an extensive set of state-of-the-art controlled text generation (CTG) models and find that CEV-LM provides significantly more targeted and precise control of these three metrics while preserving semantic content, using less training data, and requiring fewer parameters.
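
The abstract names three trajectory-shape metrics (speed, volume, and circuitousness) but does not define them here. The sketch below is a minimal, hypothetical illustration of how such metrics can be computed over a document's sequence of sentence or window embeddings, assuming embedding-trajectory definitions in the spirit of prior work on quantifying the shape of stories; the exact formulations, windowing, embedding model, and dimensionality reduction used by CEV-LM may differ. All function names and the greedy-tour approximation for circuitousness are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): shape metrics over an embedding trajectory.
# Assumes each row of `points` is one sentence/window embedding, already reduced
# (e.g., via PCA) to a few dimensions so the convex hull is well defined.
import numpy as np
from scipy.spatial import ConvexHull


def _path_length(points: np.ndarray) -> float:
    """Sum of Euclidean distances between consecutive points on the trajectory."""
    return float(np.linalg.norm(np.diff(points, axis=0), axis=1).sum())


def speed(points: np.ndarray) -> float:
    """Average distance travelled per step: how quickly the text moves through embedding space."""
    return _path_length(points) / max(len(points) - 1, 1)


def volume(points: np.ndarray) -> float:
    """Convex-hull volume of the trajectory: how much embedding space the text covers."""
    return float(ConvexHull(points).volume)


def circuitousness(points: np.ndarray) -> float:
    """Actual path length relative to a greedy (nearest-neighbour) tour over the same points.

    A value near 1 means the text moves fairly directly; larger values mean it doubles back.
    The greedy tour is a cheap stand-in for the shortest tour.
    """
    remaining = list(range(1, len(points)))
    tour, current = [0], 0
    while remaining:
        nxt = min(remaining, key=lambda i: np.linalg.norm(points[i] - points[current]))
        remaining.remove(nxt)
        tour.append(nxt)
        current = nxt
    greedy_len = _path_length(points[tour])
    return _path_length(points) / greedy_len if greedy_len > 0 else 1.0


if __name__ == "__main__":
    # Toy usage: a random "document trajectory" of 10 embeddings in 3-D.
    rng = np.random.default_rng(0)
    traj = rng.normal(size=(10, 3))
    print(speed(traj), volume(traj), circuitousness(traj))
```

Under this reading, a controller such as CEV-LM would edit a draft so that the resulting trajectory's speed, volume, and circuitousness move toward user-specified target values while the semantic content is preserved.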

Authors (3)
  1. Samraj Moorjani (3 papers)
  2. Adit Krishnan (11 papers)
  3. Hari Sundaram (46 papers)
Citations (1)