Partition-based differentially private synthetic data generation (2310.06371v1)
Abstract: Sharing private synthetic data is often preferred over releasing summary statistics because synthetic data preserves the distribution and nuances of the original data. State-of-the-art methods adopt a select-measure-generate paradigm, but measuring large-domain marginals still introduces substantial error, and allocating the privacy budget across iterations remains difficult. To address these issues, our method employs a partition-based approach that reduces these errors and improves the quality of the synthetic data, even with a limited privacy budget. In our experiments, the method outperforms existing approaches: the synthetic data it produces shows improved quality and utility, making it a preferable choice for private synthetic data sharing.
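To make the claim about large-domain marginals concrete, the following is a minimal sketch of the "measure" step only. It is not the paper's partition-based method. It assumes the Laplace mechanism with add/remove-one-record neighboring datasets (L1 sensitivity 1); the paper and the methods it compares against may use a different mechanism. The names `noisy_marginal`, `exact_marginal`, and `demo`, the toy attributes `a` to `d`, and their domain sizes are all hypothetical.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def exact_marginal(records, attrs, domains):
    """Exact contingency table over `attrs`, including empty cells."""
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    cells = product(*(domains[a] for a in attrs))
    return {cell: counts.get(cell, 0) for cell in cells}


def noisy_marginal(records, attrs, domains, epsilon, rng):
    """Marginal measured with the Laplace mechanism (assumed here).

    Adding or removing one record changes exactly one cell by 1,
    so the L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    exact = exact_marginal(records, attrs, domains)
    return {cell: n + laplace_noise(1.0 / epsilon, rng) for cell, n in exact.items()}


def demo(seed=0, n_records=1000, epsilon=1.0):
    """Return total L1 error for a small-domain and a large-domain marginal."""
    rng = random.Random(seed)
    # Hypothetical toy domains: a, b are binary; c, d have 50 values each.
    domains = {"a": range(2), "b": range(2), "c": range(50), "d": range(50)}
    records = [{k: rng.randrange(len(v)) for k, v in domains.items()}
               for _ in range(n_records)]

    errors = []
    for attrs in (["a", "b"], ["c", "d"]):  # 4 cells vs. 2500 cells
        exact = exact_marginal(records, attrs, domains)
        noisy = noisy_marginal(records, attrs, domains, epsilon, rng)
        errors.append(sum(abs(noisy[c] - exact[c]) for c in exact))
    return tuple(errors)


if __name__ == "__main__":
    small_err, large_err = demo()
    print(f"total L1 error, 4-cell marginal:    {small_err:.1f}")
    print(f"total L1 error, 2500-cell marginal: {large_err:.1f}")
```

Under these assumptions every cell receives noise of the same scale, so the expected total L1 error grows linearly with the number of cells (roughly cells / epsilon), while the 1000 records are spread ever more thinly across them. This is one way to read the abstract's remark that large-domain marginals are costly to measure.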
- Meifan Zhang
- Dihang Deng
- Lihua Yin