
A systematic comparison of measures for k-anonymity in networks (2407.02290v1)

Published 2 Jul 2024 in cs.SI

Abstract: Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how the privacy of an individual node should be measured in a particular situation. To that end, in this paper we propose a set of aspects that one should consider when choosing a measure for privacy. These aspects include the type of desired privacy and the attacker scenario against which the measure protects, the utility of the data, the type of desired output, and the computational complexity of the chosen measure. Based on these aspects, we provide a systematic overview of existing approaches in the literature. We then focus on a set of measures that ultimately enables our objective: sharing the anonymized full network dataset with limited disclosure risk. The considered measures, each based on the concept of k-anonymity, account for the structure of the surroundings of a certain node and differ in the completeness and reach of the structural information taken into account. We present a comprehensive theoretical characterization as well as comparative empirical experiments on a wide range of real-world network datasets with up to millions of edges. We find that the choice of measure has an enormous effect on the aforementioned aspects. Most interestingly, we find that the most effective measures consider a greater node vicinity, yet utilize minimal structural information and thus use minimal computational resources. This finding has important implications for researchers and practitioners, who may, based on the recommendations given in this paper, make an informed choice on how to safely share large-scale network data in a privacy-aware manner.
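To make the k-anonymity idea in the abstract concrete, here is a minimal Python sketch. It assumes the simplest possible structural signature, a node's degree, which is chosen only for illustration and is not necessarily one of the measures the paper compares. Two nodes fall into the same equivalence class when they have the same degree, and a node counts as k-anonymous when its class holds at least k nodes. The function names `degree_anonymity` and `fraction_below_k` are hypothetical, and the input is assumed to be a simple undirected edge list without self-loops or duplicate edges.

```python
from collections import Counter


def degree_anonymity(edges):
    """Map each node to the size of its degree equivalence class.

    Assumes a simple undirected graph given as (u, v) pairs.
    Nodes that appear in no edge are not represented.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Number of nodes sharing each degree value.
    class_size = Counter(degree.values())
    return {node: class_size[d] for node, d in degree.items()}


def fraction_below_k(edges, k=2):
    """Fraction of nodes whose equivalence class has fewer than k members."""
    anonymity = degree_anonymity(edges)
    if not anonymity:
        return 0.0
    at_risk = sum(1 for size in anonymity.values() if size < k)
    return at_risk / len(anonymity)


# Toy graph: a triangle a-b-c with one extra node d attached to a.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
print(degree_anonymity(edges))
print(fraction_below_k(edges, k=2))
```

In this toy graph, nodes b and c share degree 2 and hide each other, while a (degree 3) and d (degree 1) are unique under this signature. Measures that look further into a node's surroundings would use a richer signature than degree in place of the `degree` counter, typically at a higher computational cost.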
