DiffuDetox: A Mixed Diffusion Model for Text Detoxification (2306.08505v1)
Abstract: Text detoxification is a conditional text generation task that aims to remove offensive content from toxic text. It is highly useful for online forums and social media, where offensive content is frequently encountered. Intuitively, there are diverse ways to detoxify a sentence while preserving its meaning, and we can select among detoxified sentences before displaying text to users. Conditional diffusion models are particularly suitable for this task given their demonstrated higher generative diversity than existing conditional text generation models based on LLMs. Nonetheless, their text fluency declines when they are trained with insufficient data, which is the case for this task. In this work, we propose DiffuDetox, a mixed conditional and unconditional diffusion model for text detoxification. The conditional model takes toxic text as the condition and reduces its toxicity, yielding a diverse set of detoxified sentences. The unconditional model is trained to recover the input text, which allows the introduction of additional fluent text for training and thus ensures text fluency. Extensive experimental results and in-depth analysis demonstrate the effectiveness of our proposed DiffuDetox.
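The abstract describes the mixing mechanism only in prose, so below is a minimal PyTorch sketch of how a single denoiser could serve both roles, in the style of classifier-free guidance (Ho and Salimans, 2021): the conditional branch sees the toxic text as its condition, the unconditional branch is trained to reconstruct text (including extra, unpaired fluent text), and the two predictions are blended at sampling time. All names here (`MixedDenoiser`, `p_uncond`, `guided_prediction`, the toy noise schedule) are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical sketch of a DiffuDetox-style mixed conditional/unconditional
# diffusion model. Shapes, names, and the noise schedule are illustrative.
import torch
import torch.nn as nn

class MixedDenoiser(nn.Module):
    """One network used both conditionally (on toxic text) and unconditionally."""
    def __init__(self, dim: int = 64):
        super().__init__()
        # Learned "no condition" embedding, used for the unconditional branch.
        self.null_cond = nn.Parameter(torch.zeros(dim))
        self.net = nn.Sequential(nn.Linear(2 * dim + 1, 128), nn.ReLU(),
                                 nn.Linear(128, dim))

    def forward(self, x_t, t, cond=None):
        if cond is None:  # unconditional branch
            cond = self.null_cond.expand(x_t.size(0), -1)
        t_feat = t.float().unsqueeze(-1) / 1000.0  # simple timestep feature
        return self.net(torch.cat([x_t, cond, t_feat], dim=-1))

def training_step(model, x0, cond, p_uncond=0.1):
    """Denoising loss on continuous text embeddings x0 (as in embedding-space
    text diffusion). The condition is dropped with probability p_uncond, and
    extra unpaired fluent text can be passed with cond=None to train the
    unconditional (reconstruction) branch on more data."""
    t = torch.randint(0, 1000, (x0.size(0),))
    noise = torch.randn_like(x0)
    alpha = (1.0 - t.float() / 1000.0).unsqueeze(-1)  # toy noise schedule
    x_t = alpha.sqrt() * x0 + (1 - alpha).sqrt() * noise
    if cond is not None and torch.rand(()) < p_uncond:
        cond = None
    pred = model(x_t, t, cond)
    return ((pred - noise) ** 2).mean()

@torch.no_grad()
def guided_prediction(model, x_t, t, cond, w=2.0):
    """Classifier-free-guidance-style blend of the two branches:
    eps = (1 + w) * eps_cond - w * eps_uncond."""
    eps_cond = model(x_t, t, cond)
    eps_uncond = model(x_t, t, None)
    return (1 + w) * eps_cond - w * eps_uncond
```

In classifier-free guidance, the weight `w` trades off the two branches: a larger `w` leans harder on the toxic-text condition (a stronger detoxification signal), while the unconditional term contributes the fluency learned from the additional text.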
Authors: Griffin Floto, Mohammad Mahdi Abdollah Pour, Parsa Farinneya, Zhenwei Tang, Ali Pesaranghader, Manasa Bharadwaj, Scott Sanner