Making Differential Privacy Easier to Use for Data Controllers using a Privacy Risk Indicator (2310.13104v3)
Abstract: Differential privacy (DP) enables private data analysis but is difficult to use in practice. In a typical DP deployment, data controllers manage individuals' sensitive data and are responsible for answering data analysts' queries while protecting individuals' privacy; they do so by choosing $\epsilon$, the privacy loss budget, which controls how much noise to add to the query output. However, it is challenging for data controllers to choose $\epsilon$ because of the difficulty of interpreting the privacy implications of such a choice on the individuals they wish to protect. To address this challenge, we first derive a privacy risk indicator (PRI) directly from the definition of ex-post per-instance privacy loss in the DP literature. The PRI indicates the impact of choosing $\epsilon$ on individuals' privacy. We then leverage the PRI to design an algorithm to choose $\epsilon$ and release query output based on data controllers' privacy preferences. We design a modification of the algorithm that allows releasing both the query output and $\epsilon$ while satisfying differential privacy, and we propose a solution that bounds the total privacy loss when using the algorithm to answer multiple queries without requiring controllers to set the total privacy loss budget. We demonstrate our contributions through an IRB-approved user study and experimental evaluations that show the PRI is useful for helping controllers choose $\epsilon$ and our algorithms are efficient. Overall, our work contributes to making DP easier to use for controllers by lowering adoption barriers.
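As a rough illustration of the two notions the abstract relies on, the sketch below shows how $\epsilon$ sets the noise scale of a Laplace mechanism, and how an ex-post per-instance privacy loss can be computed for each individual once an output has been released. It is a minimal sketch under stated assumptions, not the paper's PRI or its $\epsilon$-selection algorithm: the query is a sum over values clipped to $[0, 100]$, neighboring datasets differ by removing one record, and the function names (`laplace_mechanism`, `ex_post_per_instance_loss`, `per_individual_losses`) are hypothetical. The loss follows the standard definition $\left|\log \frac{\Pr[M(D)=o]}{\Pr[M(D_{-z})=o]}\right|$ specialized to Laplace noise.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    # A smaller epsilon gives a larger noise scale (more privacy, less accuracy).
    scale = sensitivity / epsilon
    return true_answer + laplace_noise(scale, rng), scale


def ex_post_per_instance_loss(output, answer_with, answer_without, scale):
    # |log p(output | D) - log p(output | D without the individual)|
    # for two Laplace densities that share the same scale.
    return abs(abs(output - answer_with) - abs(output - answer_without)) / scale


def per_individual_losses(records, sensitivity, epsilon, rng):
    # Assumed query: sum of the records (values already clipped to [0, sensitivity]).
    answer = sum(records)
    output, scale = laplace_mechanism(answer, sensitivity, epsilon, rng)
    losses = []
    for i in range(len(records)):
        # Query answer on the dataset with individual i removed.
        answer_without = sum(records[:i] + records[i + 1:])
        losses.append(
            ex_post_per_instance_loss(output, answer, answer_without, scale)
        )
    return output, losses


if __name__ == "__main__":
    rng = random.Random(0)
    records = [20.0, 95.0, 0.0, 60.0]  # hypothetical clipped values
    output, losses = per_individual_losses(records, 100.0, 1.0, rng)
    print("noisy sum:", output)
    for value, loss in zip(records, losses):
        print(f"value={value:5.1f}  ex-post per-instance loss={loss:.3f}")
```

In this sketch each individual's realized loss never exceeds $\epsilon$, and individuals whose removal barely changes the query answer incur a much smaller loss. A per-individual view of this kind is what makes the effect of a given $\epsilon$ easier to interpret than the worst-case bound alone.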