
Parallelizing MCMC with Machine Learning Classifier and Its Criterion Based on Kullback-Leibler Divergence (2406.11246v2)

Published 17 Jun 2024 in stat.CO

Abstract: In the era of Big Data, Markov chain Monte Carlo (MCMC) methods, currently essential for Bayesian estimation, face significant computational challenges owing to their sequential nature. To achieve faster and more effective parallel computation, we emphasize the critical role of the overlapped area of the posterior distributions based on partitioned data, which we term the reconstructable area. We propose a method that uses machine learning classifiers to identify and extract, from this area, MCMC draws obtained by parallel computation on the sub-posteriors of the partitioned sub-datasets, thereby approximating the target posterior distribution based on the full dataset. This study also develops a Kullback-Leibler (KL) divergence-based criterion that does not require calculating the full-posterior density; it can be computed using only the sub-posterior densities, which are generally available after running MCMC, and this simplifies hyperparameter tuning when training the classifiers. Simulation studies validate the efficacy of the proposed method. This approach contributes to ongoing research on parallelizing MCMC and may offer insights for future developments in Bayesian computation for large-scale data analyses.
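The classifier-based selection described in the abstract can be illustrated with a toy sketch: draws from two hypothetical 1-D Gaussian "sub-posteriors" (standing in for MCMC output on two data shards) are labeled by shard, a simple logistic-regression classifier is fit, and draws the classifier cannot attribute to either shard (predicted probability near 0.5) are kept as lying in the overlapped, reconstructable area. The Gaussian sub-posteriors, the gradient-descent classifier, and the probability-threshold rule are all illustrative assumptions; the paper's actual classifiers and its KL-divergence tuning criterion are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for MCMC draws from two sub-posteriors (one per data shard).
# Means and scales are arbitrary illustrative choices.
draws_a = rng.normal(loc=-0.5, scale=1.0, size=5000)  # shard 1 draws
draws_b = rng.normal(loc=0.5, scale=1.0, size=5000)   # shard 2 draws

# Label draws by shard and fit a 1-D logistic-regression classifier
# with plain gradient descent.
x = np.concatenate([draws_a, draws_b])
y = np.concatenate([np.zeros(5000), np.ones(5000)])
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= 0.1 * np.mean((p - y) * x)
    b -= 0.1 * np.mean(p - y)

# Draws the classifier cannot confidently assign to either shard
# (predicted probability near 0.5) are taken to lie in the overlapped
# region; keep those as an approximation of the full-data posterior.
p_all = 1.0 / (1.0 + np.exp(-(w * x + b)))
kept = x[np.abs(p_all - 0.5) < 0.1]
print(len(kept), float(np.mean(kept)))
```

In this symmetric toy setup, the kept draws concentrate where the two sub-posteriors overlap (around zero). In the paper's method, the selection is driven by a KL-divergence criterion computed from the sub-posterior densities alone, rather than the fixed probability threshold assumed here.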
