Model-based Differentially Private Data Synthesis and Statistical Inference in Multiply Synthetic Differentially Private Data (1606.08052v3)
Abstract: We propose model-based differentially private synthesis (modips), an approach in the Bayesian framework for releasing individual-level surrogate/synthetic datasets with privacy guarantees given the original data. The modips technique integrates the concept of differential privacy into model-based data synthesis. We introduce several variants of the general modips approach and different procedures for obtaining privacy-preserving posterior samples, a key step in modips. The uncertainty from the sanitization and synthesis processes in modips can be accounted for by releasing multiple synthetic datasets and quantified via an inferential combination rule that is proposed in this paper. We run empirical studies to examine the impact of the number of synthetic sets and of the privacy budget allocation scheme on inference based on the synthetic data.
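To make the idea concrete, the following is a minimal sketch of one possible modips-style pipeline, not the paper's own algorithm. It assumes binary data with a Bernoulli model and a Beta(a, b) prior, sanitizes the sufficient statistic (the count of ones, sensitivity 1 when one record is replaced and n is public) with the Laplace mechanism, and splits the total budget epsilon equally across the m synthetic sets. The function names, the prior, and the equal-split allocation are all illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def modips_bernoulli(x, epsilon, m, rng, a=1.0, b=1.0):
    """Hypothetical sketch: release m synthetic binary datasets.

    Assumptions (not taken from the paper): Bernoulli likelihood,
    Beta(a, b) prior, Laplace mechanism on the count of ones,
    and the budget epsilon divided equally over the m sets.
    """
    n = len(x)
    s = float(sum(x))            # sufficient statistic, sensitivity 1
    eps_each = epsilon / m       # equal budget split (an assumption)
    synthetic_sets = []
    for _ in range(m):
        # Sanitize the statistic, then clip it to its valid range [0, n].
        s_noisy = s + laplace_noise(1.0 / eps_each, rng)
        s_noisy = min(max(s_noisy, 0.0), float(n))
        # Draw a parameter from the posterior built on the sanitized statistic.
        theta = rng.betavariate(a + s_noisy, b + n - s_noisy)
        # Generate a synthetic dataset of the same size from the model.
        synthetic_sets.append([1 if rng.random() < theta else 0 for _ in range(n)])
    return synthetic_sets


rng = random.Random(0)
original = [1 if rng.random() < 0.3 else 0 for _ in range(500)]
sets = modips_bernoulli(original, epsilon=1.0, m=5, rng=rng)
print([sum(d) / len(d) for d in sets])  # per-set synthetic proportions
```

Because each synthetic set depends on the original data only through its own sanitized count, the posterior draw and the synthesis step are post-processing, and under this equal split the m releases together consume the total budget epsilon by sequential composition. The spread of the per-set proportions reflects both sanitization noise and synthesis variability, which is the uncertainty that a combination rule over multiple sets is meant to capture.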