Evaluating the Fairness Impact of Differentially Private Synthetic Data (2205.04321v2)

Published 9 May 2022 in cs.LG

Abstract: Differentially private (DP) synthetic data is a promising approach to maximizing the utility of data containing sensitive information. Due to the suppression of underrepresented classes that is often required to achieve privacy, however, it may be in conflict with fairness. We evaluate four DP synthesizers and present empirical results indicating that three of these models frequently degrade fairness outcomes on downstream binary classification tasks. We draw a connection between fairness and the proportion of minority groups present in the generated synthetic data, and find that training synthesizers on data that are pre-processed via a multi-label undersampling method can promote more fair outcomes without degrading accuracy.

Authors (6)
  1. Blake Bullwinkel
  2. Kristen Grabarz
  3. Lily Ke
  4. Scarlett Gong
  5. Chris Tanner
  6. Joshua Allen
Citations (6)