Distilling Text Style Transfer With Self-Explanation From LLMs (2403.01106v2)

Published 2 Mar 2024 in cs.CL and cs.AI

Abstract: Text Style Transfer (TST) seeks to alter the style of text while retaining its core content. Given the constraints of limited parallel datasets for TST, we propose CoTeX, a framework that leverages LLMs alongside chain-of-thought (CoT) prompting to facilitate TST. CoTeX distills the complex rewriting and reasoning capabilities of LLMs into more streamlined models capable of working with both non-parallel and parallel data. Through experimentation across four TST datasets, CoTeX is shown to surpass traditional supervised fine-tuning and knowledge distillation methods, particularly in low-resource settings. We conduct a comprehensive evaluation, comparing CoTeX against current unsupervised, supervised, and in-context learning (ICL) techniques, as well as instruction-tuned LLMs. Furthermore, CoTeX distinguishes itself by offering transparent explanations for its style transfer process.
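The abstract describes a two-stage recipe: a teacher LLM is prompted with CoT instructions to produce a step-by-step explanation plus a rewritten sentence, and a smaller student model is then fine-tuned to generate both from the source text alone. The sketch below shows how such a distillation loop could be wired up with Hugging Face transformers; the teacher call (`query_teacher_llm`), the prompt wording, the output format, and the `t5-small` student are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a CoTeX-style CoT distillation step (illustrative only,
# not the authors' released code).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer


def query_teacher_llm(source: str, target_style: str) -> str:
    """Hypothetical teacher call: prompt an LLM to explain its edits step by
    step and then rewrite the sentence, returning one string such as
    'Rationale: ... Rewritten: ...'."""
    prompt = (
        f"Rewrite the sentence in a {target_style} style. First explain, "
        f"step by step, which words carry the current style and how you "
        f"will change them, then give the rewrite.\nSentence: {source}"
    )
    raise NotImplementedError(f"plug an LLM API in here; prompt: {prompt}")


# Small seq2seq student; t5-small is an arbitrary choice for illustration.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
student = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(student.parameters(), lr=3e-4)


def distill_step(source: str, teacher_output: str, target_style: str) -> float:
    """One gradient step: the student learns to reproduce the teacher's
    rationale + rewrite from the raw source sentence alone."""
    inputs = tokenizer(
        f"transfer to {target_style}: {source}",
        return_tensors="pt",
        truncation=True,
    )
    labels = tokenizer(
        teacher_output, return_tensors="pt", truncation=True
    ).input_ids
    loss = student(**inputs, labels=labels).loss  # token-level cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

If the student is trained to emit the rationale before the rewrite, the transferred text can be parsed off the end of the generated sequence at inference time, which would account for the "transparent explanations" the abstract mentions; whether CoTeX orders its outputs exactly this way is an assumption here.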

Authors (6)
  1. Chiyu Zhang (35 papers)
  2. Honglong Cai (5 papers)
  3. Yuezhang Li
  4. Yuexin Wu (23 papers)
  5. Le Hou (36 papers)
  6. Muhammad Abdul-Mageed (102 papers)
Citations (5)