Replacing Language Model for Style Transfer (2211.07343v2)
Abstract: We introduce the replacing language model (RLM), a sequence-to-sequence language modeling framework for text style transfer (TST). Our method autoregressively replaces each token of the source sentence with a text span that has a similar meaning but is in the target style. The new span is generated by a non-autoregressive masked language model, which better preserves the local-contextual meaning of the replaced token. This generation scheme combines the flexibility of autoregressive models with the accuracy of non-autoregressive models, bridging the gap between sentence-level and word-level style transfer methods. To control the generation style more precisely, we perform token-level style-content disentanglement on the hidden representations of RLM. Empirical results on real-world text datasets demonstrate the effectiveness of RLM compared with other TST baselines. The code is available at https://github.com/Linear95/RLM.
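To make the token-by-token replacement idea concrete, below is a minimal sketch using an off-the-shelf masked language model. It assumes the HuggingFace `transformers` API and `roberta-base`, neither of which is specified by the paper; the actual RLM additionally conditions each replacement on a target-style representation and can emit variable-length spans, which this illustration omits.

```python
# Minimal sketch: left-to-right token replacement with an off-the-shelf masked LM.
# Hypothetical setup (transformers + roberta-base); RLM itself also conditions on
# a target style and may replace a token with a multi-token span.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

def replace_tokens(sentence: str) -> str:
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    out = ids.clone()
    # Autoregressive outer loop: positions to the left already hold rewritten
    # tokens, positions to the right still hold the original source tokens.
    for pos in range(1, len(ids) - 1):          # skip <s> and </s>
        masked = out.clone()
        masked[pos] = tokenizer.mask_token_id   # hide the token being replaced
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, pos]
        out[pos] = logits.argmax()              # non-autoregressive in-fill
    return tokenizer.decode(out[1:-1])

print(replace_tokens("the food was awful and the service slow"))
```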