
Flexible text generation for counterfactual fairness probing (2206.13757v1)

Published 28 Jun 2022 in cs.CL and cs.CY

Abstract: A common approach for testing fairness issues in text-based classifiers is through the use of counterfactuals: does the classifier output change if a sensitive attribute in the input is changed? Existing counterfactual generation methods typically rely on wordlists or templates, producing simple counterfactuals that don't take into account grammar, context, or subtle sensitive attribute references, and could miss issues that the wordlist creators had not considered. In this paper, we introduce a task for generating counterfactuals that overcomes these shortcomings, and demonstrate how LLMs can be leveraged to make progress on this task. We show that this LLM-based method can produce complex counterfactuals that existing methods cannot, comparing the performance of various counterfactual generation methods on the Civil Comments dataset and showing their value in evaluating a toxicity classifier.
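For orientation, the wordlist-style baseline the abstract contrasts against can be sketched as below. This is a minimal illustration and not the paper's method: the `SWAPS` list, the `toy_classifier` stand-in, and the 0.5 threshold are all hypothetical, chosen only to show what "does the classifier output change if a sensitive attribute in the input is changed?" means in practice.

```python
import re

# Hypothetical, deliberately tiny wordlist; real lists are much larger.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def wordlist_counterfactual(text, swaps=SWAPS):
    """Replace every listed term with its counterpart, keeping initial capitalization."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, swaps)) + r")\b", re.IGNORECASE
    )

    def replace(match):
        word = match.group(0)
        new = swaps[word.lower()]
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(replace, text)

def toy_classifier(text):
    """Stand-in for a toxicity classifier (hypothetical): returns a score in [0, 1]."""
    return 0.9 if "woman" in text.lower() else 0.1

def prediction_flips(text, classifier, threshold=0.5):
    """True if the thresholded label differs between the text and its counterfactual."""
    counterfactual = wordlist_counterfactual(text)
    return (classifier(text) >= threshold) != (classifier(counterfactual) >= threshold)

if __name__ == "__main__":
    original = "He thanked the man."
    print(wordlist_counterfactual(original))
    print(prediction_flips(original, toy_classifier))
```

Even this small sketch shows the limitation the abstract describes: the swap `her` to `his` is wrong whenever "her" is an object pronoun ("I saw her"), because a wordlist has no access to grammar or context.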

Authors (6)
  1. Vera Axelrod (9 papers)
  2. Ben Packer (11 papers)
  3. Alex Beutel (52 papers)
  4. Jilin Chen (32 papers)
  5. Kellie Webster (14 papers)
  6. Zee Fryer (3 papers)
Citations (16)
