Seeding with Differentially Private Network Information (2305.16590v4)
Abstract: In public health interventions such as the distribution of pre-exposure prophylaxis (PrEP) for HIV prevention, decision makers rely on seeding algorithms to identify key individuals who can amplify the impact of their interventions. In such settings, building a complete sexual activity network is often infeasible due to privacy concerns. Instead, contact tracing can provide influence samples, that is, sequences of sexual contacts, without requiring complete network information. This presents two challenges: protecting individual privacy in the contact data and adapting seeding algorithms to work effectively with incomplete network information. To address both problems, we study privacy guarantees for influence maximization algorithms when the social network is unknown and the inputs are samples of prior influence cascades that are collected at random and need privacy protection. Building on recent results that address seeding with costly network information, our privacy-preserving algorithms introduce randomization into the collected data or the algorithm output, and can bound the privacy loss of each node (or group of nodes) from deciding to include their data in the algorithm input. We provide theoretical guarantees on seeding performance with a limited sample size, subject to differential privacy budgets in both the central and local privacy regimes. Simulations on synthetic random graphs and on empirically grounded sexual contact networks of men who have sex with men reveal the diminishing value of network information with decreasing privacy budget in both regimes, and a graceful decrease in performance with decreasing privacy budget in the central regime. Achieving good performance with local privacy guarantees requires relatively higher privacy budgets, confirming our theoretical expectations.
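The local privacy regime mentioned above relies on randomizing the collected data itself before it reaches the algorithm. A minimal sketch of this idea is classical randomized response applied to a single contact bit; the function names, the epsilon value, and the debiasing helper below are illustrative assumptions, not the paper's actual mechanism for influence samples.

```python
import math
import random

def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it.

    This simple mechanism satisfies epsilon-local differential privacy for
    a single binary attribute (e.g., whether a contact occurred). It is an
    illustrative sketch, not the paper's seeding algorithm.
    """
    p_true = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_true else 1 - bit

def debias_mean(reports: list[int], epsilon: float) -> float:
    """Unbiased estimate of the true mean recovered from randomized reports."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1] * 3000 + [0] * 7000  # 30% of contacts are "positive"
    reports = [randomized_response(b, 1.0, rng) for b in true_bits]
    print(debias_mean(reports, 1.0))  # close to 0.3, despite per-bit noise
```

The trade-off the abstract describes is visible here: a smaller epsilon flips bits more often, so recovering aggregate structure from locally privatized contact data needs either a larger privacy budget or more samples.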
- M. Amin Rahimian
- Fang-Yi Yu
- Carlos Hurtado
- Yuxin Liu