
Synthetic Data: Methods, Use Cases, and Risks (2303.01230v3)

Published 1 Mar 2023 in cs.CR, cs.AI, and cs.CY

Abstract: Sharing data can often enable compelling applications and analytics. However, more often than not, valuable datasets contain information of a sensitive nature, and thus, sharing them can endanger the privacy of users and organizations. A possible alternative gaining momentum in both the research community and industry is to share synthetic data instead. The idea is to release artificially generated datasets that resemble the actual data -- more precisely, having similar statistical properties. In this article, we provide a gentle introduction to synthetic data and discuss its use cases, the privacy challenges that are still unaddressed, and its inherent limitations as an effective privacy-enhancing technology.
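To make the core idea in the abstract concrete, here is a minimal sketch (not the paper's method) of what "releasing artificially generated data with similar statistical properties" can mean in the simplest case: fit a generative model to the sensitive records and share samples from the model instead of the records themselves. The multivariate Gaussian model, the toy dataset, and all variable names below are illustrative assumptions; real synthetic data pipelines use far richer generators (e.g., GANs or Bayesian networks) and, ideally, differential privacy.

```python
import numpy as np

# Illustrative sketch: "synthetic data" as samples from a model fitted to real data.
# The generative model here is just a multivariate Gaussian; real generators are richer.

rng = np.random.default_rng(seed=0)

# Stand-in for a sensitive dataset: 1,000 records with 3 numeric attributes.
real_data = rng.normal(loc=[50.0, 10.0, 0.0], scale=[5.0, 2.0, 1.0], size=(1000, 3))

# Fit the model: estimate the mean vector and covariance matrix of the real data.
mean = real_data.mean(axis=0)
cov = np.cov(real_data, rowvar=False)

# "Release" synthetic records by sampling from the fitted model
# instead of sharing the real records.
synthetic_data = rng.multivariate_normal(mean, cov, size=1000)

# Check that the synthetic data preserves first- and second-order statistics.
print("real mean:     ", np.round(real_data.mean(axis=0), 2))
print("synthetic mean:", np.round(synthetic_data.mean(axis=0), 2))
print("real std:      ", np.round(real_data.std(axis=0), 2))
print("synthetic std: ", np.round(synthetic_data.std(axis=0), 2))
```

Note that matching aggregate statistics does not by itself protect privacy: as the paper discusses, synthetic records can still leak information about individuals in the training data unless the generator is trained with explicit privacy guarantees.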

Authors (1)
  1. Emiliano De Cristofaro
Citations (8)