
An Algorithm for Streaming Differentially Private Data (2401.14577v2)

Published 26 Jan 2024 in cs.DB, cs.IT, cs.LG, math.IT, math.ST, and stat.TH

Abstract: Much of the research in differential privacy has focused on offline applications, under the assumption that all data is available at once. Applying these algorithms in practice to streams, where data is collected over time, either violates the privacy guarantees or results in poor utility. We derive an algorithm for differentially private synthetic streaming data generation, tailored in particular to spatial datasets. Furthermore, we provide a general framework for online selective counting among a collection of queries, which forms a basis for many tasks such as query answering and synthetic data generation. We verify the utility of our algorithm on both real-world and simulated datasets.
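
To make the "poor utility" point concrete, the following is a minimal sketch of the naive baseline the abstract argues against, not the paper's algorithm. It assumes a stream of counts in which one individual can change each time step's count by at most 1, and it splits a total privacy budget evenly across time steps using basic sequential composition with the Laplace mechanism. The function names (`laplace_noise`, `naive_stream_release`, `per_step_noise_scale`) are hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def per_step_noise_scale(total_epsilon, num_steps, sensitivity=1.0):
    """Noise scale per release when the budget is split evenly over num_steps.

    Assumption: basic sequential composition, so each step gets
    total_epsilon / num_steps and the Laplace scale is
    sensitivity / (total_epsilon / num_steps).
    """
    return sensitivity * num_steps / total_epsilon


def naive_stream_release(counts, total_epsilon, rng, sensitivity=1.0):
    """Release every count in the stream with independently drawn Laplace noise."""
    scale = per_step_noise_scale(total_epsilon, len(counts), sensitivity)
    return [c + laplace_noise(scale, rng) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [120, 130, 125, 140, 150]  # made-up counts, for illustration only
    noisy = naive_stream_release(stream, total_epsilon=1.0, rng=rng)
    print(noisy)
    # The noise scale grows linearly with the stream length.
    print(per_step_noise_scale(total_epsilon=1.0, num_steps=1000))
```

Under these assumptions the per-release noise scale grows linearly with the number of time steps, which is one way an offline mechanism applied step by step loses utility on a long stream. Reusing the full budget at every step instead would keep the noise small but would no longer satisfy the stated privacy guarantee.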

