Making Privacy-preserving Federated Graph Analytics with Strong Guarantees Practical (for Certain Queries) (2404.01619v1)
Abstract: Privacy-preserving federated graph analytics is an emerging area of research. The goal is to run graph analytics queries over a set of devices that are organized as a graph while keeping the raw data on the devices rather than centralizing it. Further, no entity may learn any new information except for the final query result. For instance, a device may not learn a neighbor's data. The state-of-the-art prior work for this problem provides privacy guarantees for a broad set of queries in a strong threat model where the devices can be malicious. However, it imposes an impractical overhead: each device locally requires over 8.79 hours of CPU time and 5.73 GiB of network transfers per query. This paper presents Colo, a new, low-cost system for privacy-preserving federated graph analytics that requires minutes of CPU time and a few MiB of network transfers, for a particular subset of queries. At the heart of Colo is a new secure computation protocol that enables a device to securely and efficiently evaluate a graph query in its local neighborhood while hiding device data, edge data, and topology data. An implementation and evaluation of Colo show that, for running a variety of COVID-19 queries over a population of 1M devices, it requires less than 8.4 minutes of a device's CPU time and 4.93 MiB of network transfers, improvements of up to three orders of magnitude.
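The "up to three orders of magnitude" claim can be sanity-checked from the figures quoted in the abstract alone. The sketch below is only a back-of-the-envelope calculation, not part of the paper: the constant and function names are hypothetical, and it assumes the quoted prior-work and Colo figures are directly comparable per-device, per-query costs. Because the abstract states them as bounds ("over" and "less than"), the computed ratios are approximate.

```python
# Per-device, per-query figures as quoted in the abstract.
PRIOR_CPU_HOURS = 8.79    # prior work: "over 8.79 hours" of CPU time
PRIOR_NET_GIB = 5.73      # prior work: 5.73 GiB of network transfers
COLO_CPU_MINUTES = 8.4    # Colo: "less than 8.4 minutes" of CPU time
COLO_NET_MIB = 4.93       # Colo: 4.93 MiB of network transfers


def improvement_factors():
    """Return (cpu_factor, net_factor) after converting to common units."""
    cpu_factor = PRIOR_CPU_HOURS * 60 / COLO_CPU_MINUTES   # hours -> minutes
    net_factor = PRIOR_NET_GIB * 1024 / COLO_NET_MIB       # GiB -> MiB
    return cpu_factor, net_factor


cpu_factor, net_factor = improvement_factors()
print(f"CPU: ~{cpu_factor:.0f}x, network: ~{net_factor:.0f}x")
```

Under these assumptions the CPU improvement is a bit under two orders of magnitude, while the network improvement is just over three, which is where the "up to" in the abstract comes from.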