
Differentially Private Decentralized Learning with Random Walks (2402.07471v2)

Published 12 Feb 2024 in cs.LG and cs.CR

Abstract: The popularity of federated learning comes from the possibility of better scalability and the ability for participants to keep control of their data, improving data security and sovereignty. Unfortunately, sharing model updates also creates a new privacy attack surface. In this work, we characterize the privacy guarantees of decentralized learning with random walk algorithms, where a model is updated by traveling from one node to another along the edges of a communication graph. Using a recent variant of differential privacy tailored to the study of decentralized algorithms, namely Pairwise Network Differential Privacy, we derive closed-form expressions for the privacy loss between each pair of nodes, where the impact of the communication topology is captured by graph-theoretic quantities. Our results further reveal that random walk algorithms tend to yield better privacy guarantees than gossip algorithms for nodes close to each other. We supplement our theoretical results with empirical evaluation on synthetic and real-world graphs and datasets.

Citations (2)

Summary

  • The paper introduces a private random walk SGD algorithm that leverages PNDP to derive closed-form privacy loss expressions between node pairs in arbitrary graphs.
  • The paper provides theoretical convergence rates for strongly convex loss functions under a Markovian sampling framework, showing competitiveness against gossip algorithms.
  • The paper offers extensive empirical evaluations on synthetic and real-world data, highlighting privacy-utility trade-offs and superior performance for closely connected nodes.

Differential Privacy Guarantees in Decentralized Learning via Random Walks

Introduction to the Study

The paper by Edwige Cyffers, Aurélien Bellet, and Jalaj Upadhyay examines the privacy guarantees of decentralized learning with random walk algorithms. The authors leverage a recent variant of differential privacy tailored to decentralized algorithms, Pairwise Network Differential Privacy (PNDP), to derive closed-form expressions for the privacy loss between each pair of nodes. The random walk approach contrasts with gossip algorithms, and the resulting guarantees hinge on the topology of the communication graph. The paper also establishes convergence rates for a privacy-preserving version of stochastic gradient descent (SGD) and provides empirical evaluations that underscore the practical relevance of the theoretical findings.

Related Work and Background

The research concerns decentralized learning, which replaces the central server with peer-to-peer communication among nodes, thereby improving scalability and data sovereignty. However, this raises privacy concerns, since sharing model parameters can indirectly leak sensitive information. The paper emphasizes the role of differential privacy (DP) in the fully decentralized setting, where both random walk algorithms and gossip algorithms are prominent. Prior work on mitigating privacy risks in decentralized optimization ranges from local DP to privacy models specialized to decentralized settings, such as Network Differential Privacy (NDP) and its pairwise variant (PNDP).

Contributions

The key contributions of this work are fourfold:

  1. Introduction of a private version of random walk SGD for arbitrary graphs (a minimal sketch is given after this list),
  2. Establishment of its convergence rate for strongly convex loss functions,
  3. Derivation of closed-form expressions for privacy loss between node pairs factoring in the communication topology,
  4. Theoretical and empirical comparison of this approach against gossip algorithms, revealing superior privacy-utility trade-offs in certain regimes.
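
To make the first contribution concrete, the following is a minimal Python sketch of what a private random walk SGD loop can look like: the model travels from node to node along the edges of the communication graph, and each visited node performs one clipped, noisy gradient step on its local data before forwarding the model to a random neighbor. The function name private_random_walk_sgd, the Gaussian mechanism with gradient clipping, and the uniform choice of the next neighbor are illustrative assumptions made here, not the authors' reference implementation.

```python
# Illustrative sketch (assumed API, not the paper's code): noisy SGD carried
# by a random walk over a communication graph.
import numpy as np

def private_random_walk_sgd(graph, local_grads, theta0, n_steps,
                            lr=0.1, clip=1.0, sigma=1.0, seed=0):
    """`graph` is a networkx graph; `local_grads[v](theta)` returns node v's
    gradient on its own local data; `theta0` is a numpy parameter vector."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    nodes = list(graph.nodes)
    node = nodes[rng.integers(len(nodes))]                    # start at a random node
    for _ in range(n_steps):
        g = local_grads[node](theta)
        g = g / max(1.0, np.linalg.norm(g) / clip)            # clip the local gradient
        g = g + rng.normal(0.0, sigma * clip, size=g.shape)   # add Gaussian noise
        theta = theta - lr * g                                # local noisy update
        neighbors = list(graph.neighbors(node))
        node = neighbors[rng.integers(len(neighbors))]        # walk to a random neighbor
    return theta
```

The uniform next-neighbor choice is only one possible walk; the essential point, as in the paper's setting, is that the model only ever moves along edges of the communication graph, so each node exchanges the model solely with its direct neighbors.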

Main Theoretical Insights

Among the paper's main theoretical contributions are closed-form expressions for the privacy loss in decentralized learning orchestrated by random walk algorithms. These expressions account for pairwise node interactions and the overall communication topology, captured through graph-theoretic quantities. The convergence rate for the proposed private random walk SGD builds on recent results for SGD under Markovian sampling, and the method is competitive with its gossip SGD counterpart while balancing the injected noise against the privacy guarantees quantified by the PNDP framework.
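
The exact closed-form expressions are stated in the paper; as a rough illustration of the kind of graph-theoretic quantity they involve, the sketch below builds the transition matrix of the simple random walk on a communication graph and computes, for a pair of nodes (u, v), the probability that a walk leaving u sits at v after t steps. The function names, the horizon parameter, and the cycle-graph example are assumptions made here for illustration, not the paper's formula.

```python
# Illustrative computation of graph-theoretic quantities of the kind that enter
# pairwise privacy-loss expressions (not the paper's exact formula).
import numpy as np
import networkx as nx

def transition_matrix(graph):
    """Row-stochastic transition matrix of the simple random walk on `graph`."""
    A = nx.to_numpy_array(graph)
    return A / A.sum(axis=1, keepdims=True)

def visit_probabilities(graph, u, v, horizon):
    """P[walk started at u is at v after t steps], for t = 1..horizon."""
    P = transition_matrix(graph)
    nodes = list(graph.nodes)
    i, j = nodes.index(u), nodes.index(v)
    probs, Pt = [], np.eye(len(nodes))
    for _ in range(horizon):
        Pt = Pt @ P
        probs.append(Pt[i, j])
    return np.array(probs)

# Example: on a cycle, the walk reaches an adjacent node almost immediately,
# while a distant node sees essentially no probability mass within 5 steps.
G = nx.cycle_graph(20)
print(visit_probabilities(G, 0, 1, horizon=5))
print(visit_probabilities(G, 0, 10, horizon=5))
```

The contrast between the two printouts is the qualitative mechanism behind topology-dependent guarantees: information carried by the walk reaches nearby nodes much sooner than distant ones.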

Empirical Evaluations

The empirical evaluation measures privacy loss on synthetic and real-world graphs and datasets, highlighting scenarios where random walks offer better privacy guarantees than gossip algorithms. In particular, experiments with logistic regression on datasets such as the UCI Housing dataset illustrate the privacy-utility trade-offs, showing that the random walk approach can significantly reduce the privacy loss for nearby nodes, with the benefit diminishing as the distance between nodes increases.
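
As a hedged sketch of how such a distance-dependent comparison might be organized (this is an assumed setup, not the authors' experimental pipeline or their privacy-loss formula), one can use the cumulative visit probability of the walk as a crude proxy for the per-pair privacy loss and relate it to shortest-path distance in the graph:

```python
# Hypothetical illustration: relate a crude proxy for pairwise privacy loss
# (cumulative probability that the walk started at u visits v within a horizon)
# to graph distance. The graph, horizon, and proxy are assumptions.
import numpy as np
import networkx as nx

G = nx.random_geometric_graph(100, radius=0.2, seed=0)
G = G.subgraph(max(nx.connected_components(G), key=len)).copy()

P = nx.to_numpy_array(G)
P = P / P.sum(axis=1, keepdims=True)                      # random walk transition matrix
horizon = 50
cum = sum(np.linalg.matrix_power(P, t) for t in range(1, horizon + 1))

nodes = list(G.nodes)
u = nodes[0]
dist = nx.single_source_shortest_path_length(G, u)
for v in sorted(nodes[1:], key=lambda w: dist[w])[:8]:
    i, j = nodes.index(u), nodes.index(v)
    print(f"distance {dist[v]:2d}   cumulative visit probability {cum[i, j]:.3f}")
```

Any trend seen in this toy proxy is only suggestive; the paper's evaluation relies on its actual PNDP privacy-loss expressions and on trained models such as logistic regression rather than on this proxy.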

Future Directions and Speculations

The study of differential privacy in decentralized learning sets the stage for further research on optimizing privacy guarantees with respect to network topology, communication costs, and algorithmic efficiency. Future work might extend the analysis to non-convex settings, explore varying degrees of data heterogeneity across nodes, or integrate secure multi-party computation techniques to strengthen privacy in the presence of adversarial threats within the network.

Conclusion

This research underscores the potential of random walk algorithms in decentralized learning to strengthen privacy protections under the PNDP model. By elucidating the relationship between privacy loss and network topology, the paper contributes to a deeper understanding of the privacy implications of decentralized algorithms and opens avenues for more secure, efficient, and scalable decentralized learning systems.
