SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators (2502.06394v1)

Published 10 Feb 2025 in cs.CL

Abstract: Existing approaches to multilingual text detoxification are hampered by the scarcity of parallel multilingual datasets. In this work, we introduce a pipeline for the generation of multilingual parallel detoxification data. We also introduce SynthDetoxM, a manually collected and synthetically generated multilingual parallel text detoxification dataset comprising 16,000 high-quality detoxification sentence pairs across German, French, Spanish, and Russian. The data was sourced from different toxicity evaluation datasets and then rewritten with nine modern open-source LLMs in a few-shot setting. Our experiments demonstrate that models trained on the produced synthetic datasets outperform those trained on the human-annotated MultiParaDetox dataset even in a data-limited setting. Models trained on SynthDetoxM outperform all evaluated LLMs in a few-shot setting. We release our dataset and code to help further research in multilingual text detoxification.

Below is a detailed summary of the paper, covering its key contributions, methodology, results, and significance:

Title: SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Core Idea: The paper addresses the scarcity of parallel multilingual datasets for text detoxification. It proposes a pipeline to generate synthetic parallel detoxification data using few-shot prompting of LLMs. This pipeline results in SynthDetoxM, a new multilingual dataset of high-quality detoxification sentence pairs.

Key Contributions:

  1. Framework for Synthetic Data Generation: The paper introduces a methodology for generating synthetic parallel multilingual detoxification data. This framework leverages the few-shot learning capabilities of modern LLMs to rewrite toxic text into non-toxic equivalents while preserving meaning.
  2. SynthDetoxM Dataset: The major contribution is the creation of SynthDetoxM, a large-scale multilingual synthetic parallel dataset. It contains 16,000 detoxification pairs across four languages: German, Spanish, French, and Russian (4,000 pairs per language). The dataset is built by selecting the best generations from multiple open-source LLMs and combining them with a hand-crafted heuristic. The dataset and code are publicly released.
  3. Empirical Evaluation: The paper includes a thorough evaluation of SynthDetoxM. This evaluation involves:
    • Linguistic analysis of the dataset.
    • Benchmarking against the human-annotated MultiParaDetox dataset.
    • Automatic evaluation using metrics such as Style Transfer Accuracy (STA), Content Similarity (SIM), Fluency (FL), and a combined J-score (a sketch of this combination follows this list).
    • Side-by-Side (SBS) evaluation using GPT-4o to judge the quality of detoxified outputs.
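The combined J-score mentioned above follows the ParaDetox-style convention of averaging the per-sample product of the three component scores. A minimal sketch, assuming all three metrics are normalized to [0, 1] (the exact aggregation used in the paper may differ slightly):

```python
from typing import Sequence

def j_score(sta: Sequence[float], sim: Sequence[float], fl: Sequence[float]) -> float:
    """Combined detoxification score: the mean of per-sample STA * SIM * FL.

    sta: style transfer accuracy (probability that the output is non-toxic)
    sim: content similarity between the source and the detoxified output
    fl:  fluency / acceptability of the output
    All scores are assumed to lie in [0, 1].
    """
    assert len(sta) == len(sim) == len(fl), "one score triple per example"
    return sum(s * c * f for s, c, f in zip(sta, sim, fl)) / len(sta)

# Example: three detoxified outputs with their per-sample component scores.
print(j_score(sta=[0.9, 0.8, 1.0], sim=[0.85, 0.9, 0.7], fl=[0.95, 0.9, 0.8]))
```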

Methodology:

  1. Data Collection:
    • Non-parallel toxic texts are gathered from existing toxicity identification datasets in German, French, Spanish, and Russian.
    • Sample-level filtering is applied using STA and SIM metrics to ensure high data quality.
    • Data augmentation techniques are employed, using the Perspective API to identify and isolate toxic spans within sentences (see the Perspective API sketch after this list).
  2. Parallel Data Generation:
    • Various open-source LLMs (Qwen 2.5 32B, Command-R 32B, Gemma 2 27B, Aya Expanse (32B and 8B), Mistral Small 22B, Mistral Nemo 12B, and Llama 3.1 (70B and 8B)) are used in a few-shot generation setup (a prompt-construction sketch follows this list).
    • Few-shot examples (toxic/non-toxic pairs) are mined from a multilingual toxicity detection dataset, ranked based on STA and SIM scores to ensure high-quality detoxification and semantic preservation. For French, human annotators detoxified an initial set of sentences.
    • Generated examples are filtered to remove refusals and ensure sufficient detoxifiability.
    • Outputs from different LLMs are ranked and combined to create the final dataset, prioritizing diversity and quality.
  3. Evaluation:
    • Sequence-to-sequence models (mT0-XL) are trained on SynthDetoxM and compared against models trained on MultiParaDetox.
    • The quality of the generated data is assessed using STA and SIM scores from the Perspective API.
    • Automatic evaluation is performed using the metrics outlined in the original MultiParaDetox paper, including STA, SIM, FL, and J-score.
    • Side-by-Side (SBS) comparisons using GPT-4o as an evaluator are used to judge the quality of detoxified outputs from different models.
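For the filtering and quality-assessment steps above (Methodology, steps 1 and 3), toxicity scores come from the Perspective API. Below is a minimal sketch of scoring and thresholding candidate sentences; the API key, the 0.5 threshold, and the language code are placeholders and assumptions for illustration, not values taken from the paper:

```python
from googleapiclient import discovery  # pip install google-api-python-client

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # placeholder

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

def toxicity_score(text: str, lang: str) -> float:
    """Return the Perspective TOXICITY probability for a single sentence."""
    request = {
        "comment": {"text": text},
        "languages": [lang],
        "requestedAttributes": {"TOXICITY": {}},
        "spanAnnotations": True,  # also return per-span scores, useful for isolating toxic spans
    }
    response = client.comments().analyze(body=request).execute()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Keep only clearly toxic source sentences (threshold is an assumption, not from the paper).
candidates = ["example sentence 1", "example sentence 2"]
toxic_sources = [s for s in candidates if toxicity_score(s, "de") > 0.5]
```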
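Step 2 turns the mined toxic/non-toxic pairs into a few-shot prompt and asks an open-source LLM to rewrite a new toxic sentence. A minimal sketch of that loop with one of the nine listed models; the prompt wording, the example pairs, and the decoding settings are illustrative assumptions rather than the paper's actual prompt:

```python
from transformers import pipeline

# (toxic, detoxified) demonstrations, mined and ranked by STA/SIM as described above.
FEW_SHOT_PAIRS = [
    ("<toxic example 1>", "<neutral rewrite 1>"),
    ("<toxic example 2>", "<neutral rewrite 2>"),
]

def build_prompt(toxic_sentence: str) -> str:
    parts = ["Rewrite the sentence so that it is no longer toxic but keeps its original meaning."]
    for toxic, neutral in FEW_SHOT_PAIRS:
        parts.append(f"Toxic: {toxic}\nNeutral: {neutral}")
    parts.append(f"Toxic: {toxic_sentence}\nNeutral:")
    return "\n\n".join(parts)

# Any of the nine models could be substituted here; a smaller checkpoint also works for testing.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-32B-Instruct")
result = generator(build_prompt("<new toxic sentence>"),
                   max_new_tokens=64, do_sample=False, return_full_text=False)
print(result[0]["generated_text"].strip())
```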
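The side-by-side (SBS) evaluation in step 3 asks GPT-4o to choose between two candidate detoxifications of the same toxic input. A minimal sketch of such a pairwise judgment; the judging instructions and the answer format are assumptions made for illustration:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def judge(toxic: str, detox_a: str, detox_b: str) -> str:
    """Ask GPT-4o which detoxification is better; returns 'A', 'B', or 'T' (tie)."""
    prompt = (
        f"Original toxic sentence:\n{toxic}\n\n"
        f"Detoxification A:\n{detox_a}\n\n"
        f"Detoxification B:\n{detox_b}\n\n"
        "Which detoxification better removes the toxicity while preserving the meaning "
        "and staying fluent? Answer with exactly one letter: A, B, or T for a tie."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(judge("<toxic input>",
            "<output of the SynthDetoxM-trained model>",
            "<output of the MultiParaDetox-trained model>"))
```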

Key Results:

  • Models trained on SynthDetoxM consistently outperform those trained on the human-annotated MultiParaDetox dataset, even in data-limited settings.
  • Training on the full SynthDetoxM dataset results in models that surpass the performance of most LLMs in few-shot generation setups.
  • Side-by-side evaluations show a clear preference for the detoxifications produced by models trained on SynthDetoxM compared to those trained on MultiParaDetox.
  • The French portion of the dataset achieves automatic metric scores comparable to the other languages, suggesting that detoxification models trained on it would perform similarly.
  • Among the evaluated LLMs, Qwen 2.5 32B tended to produce the detoxifications that were preferred most often.

Significance:

  • Addresses Data Scarcity: SynthDetoxM directly addresses the lack of parallel multilingual detoxification data, enabling the development of more effective multilingual detoxification models.
  • Cost-Effective Data Generation: The proposed framework provides a cost-effective alternative to manual data collection, reducing the annotation costs for parallel detoxification datasets.
  • Improved Detoxification Performance: The paper demonstrates that synthetic data generated using LLMs can achieve comparable or superior performance to human-annotated data, opening up new avenues for training detoxification models.
  • Multilingual Applicability: The findings and the SynthDetoxM dataset are valuable for researchers working on multilingual natural language processing, particularly in the areas of text style transfer and toxicity mitigation.
  • Outperforms LLMs in Few-Shot Setting: The model fine-tuned on the generated data outperforms many of the evaluated LLMs used in a few-shot generation setup.

Limitations:

  • The paper focuses only on explicit types of toxicity, not more subtle forms.
  • Definitions of toxicity can vary drastically between languages.
  • Computational resource constraints led to the use of smaller models for data generation.
  • The evaluation would be strengthened by comparison with proprietary models.
  • The approach is constrained by the limited amount of annotated non-parallel toxic data available in some of the languages.

Ethical Considerations:

  • The paper acknowledges the ethical responsibilities involved in working with text detoxification, emphasizing the goal of creating a safer and more inclusive online environment. It clarifies that the goal is not to suppress free speech but to offer non-toxic alternatives, encouraging users to choose better language. The risk of misuse (e.g., generating harmful content) is recognized.
Authors (5)
  1. Daniil Moskovskiy (9 papers)
  2. Nikita Sushko (1 paper)
  3. Sergey Pletenev (5 papers)
  4. Elena Tutubalina (36 papers)
  5. Alexander Panchenko (92 papers)