Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers (2401.15838v1)
Abstract: Many machine learning applications require operating on a spatially distributed dataset. Despite technological advances, privacy considerations and communication constraints may prevent gathering the entire dataset in a central unit. In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers, which is commonly used in the optimization literature due to its fast convergence. In contrast to distributed optimization, distributed sampling allows for uncertainty quantification in Bayesian inference tasks. We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority over state-of-the-art methods. For our theoretical results, we use convex optimization tools to establish a fundamental inequality on the generated local sample iterates. This inequality enables us to show convergence of the distribution associated with these iterates to the underlying target distribution in Wasserstein distance. In simulations, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
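The abstract states convergence in Wasserstein distance. As a minimal illustration of that metric only (not the paper's algorithm or its proof technique), the sketch below computes the order-p Wasserstein distance between two equal-size one-dimensional empirical distributions, for which the optimal coupling is known to pair sorted samples. The function name `empirical_wasserstein_1d` and the toy Gaussian samples are assumptions made for this example.

```python
import random


def empirical_wasserstein_1d(xs, ys, p=2):
    """W_p between two equal-size 1-D empirical distributions.

    In one dimension the optimal transport plan matches the i-th smallest
    sample of xs with the i-th smallest sample of ys, so sorting suffices.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    if not xs:
        raise ValueError("need at least one sample")
    # Average transport cost under the sorted (monotone) coupling.
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
    return cost ** (1.0 / p)


# Toy data (hypothetical): standard normal samples and a copy shifted by 0.5.
rng = random.Random(0)
target = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [x + 0.5 for x in target]

distance = empirical_wasserstein_1d(target, shifted)
print(distance)  # a pure shift by c gives a distance of about |c|
```

In a sampling experiment, a quantity like this could be tracked per coordinate over iterations to check that the sample distribution approaches a reference; in higher dimensions the sorted coupling no longer applies and a general optimal transport solver would be needed.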