Causally Constrained Data Synthesis for Private Data Release (2105.13144v1)

Published 27 May 2021 in cs.LG and cs.CR

Abstract: Making evidence-based decisions requires data. However, for real-world applications, the privacy of data is critical. Using synthetic data which reflects certain statistical properties of the original data preserves the privacy of the original data. To this end, prior works utilize differentially private data release mechanisms to provide formal privacy guarantees. However, such mechanisms have unacceptable privacy vs. utility trade-offs. We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off. We theoretically prove that generative models trained with additional causal knowledge provide stronger differential privacy guarantees. Empirically, we evaluate our solution comparing different models based on variational auto-encoders (VAEs), and show that causal information improves resilience to membership inference, with improvements in downstream utility.

Authors (6)
  1. Varun Chandrasekaran (39 papers)
  2. Darren Edge (6 papers)
  3. Somesh Jha (112 papers)
  4. Amit Sharma (88 papers)
  5. Cheng Zhang (388 papers)
  6. Shruti Tople (28 papers)
Citations (3)
