Bayesian variable selection for latent class analysis using a collapsed Gibbs sampler

Published 27 Feb 2014 in stat.CO | (1402.6928v2)

Abstract: Latent class analysis is used to perform model-based clustering for multivariate categorical responses. Selecting the variables most relevant for clustering is an important task that can considerably affect the quality of the clustering. This work considers a Bayesian approach for selecting the number of clusters and the best clustering variables. The main idea is to reformulate the problem of group and variable selection as a probabilistically driven search over a large discrete space using Markov chain Monte Carlo (MCMC) methods. This approach yields estimates of the degree of relevance of each variable for clustering, along with posterior probabilities for the number of clusters. Bayes factors can then be easily calculated, and a suitable model chosen in a principled manner. Both selection tasks are carried out simultaneously using an MCMC approach based on a collapsed Gibbs sampling method, whereby several model parameters are integrated out of the model, substantially improving computational performance. Approaches for estimating posterior marginal probabilities of class membership, variable inclusion and number of groups are proposed, and post-hoc procedures for parameter and uncertainty estimation are outlined. The approach is tested on simulated and real data.
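
To illustrate what "collapsed" means here, the following is a minimal sketch of a collapsed Gibbs sampler for a latent class model. It is not the paper's algorithm: it assumes a fixed number of classes, treats every variable as a clustering variable, and uses symmetric Dirichlet priors with hypothetical hyperparameters `alpha` (class weights) and `beta` (item probabilities). The function name `collapsed_gibbs_lca` and all argument names are illustrative. With the class weights and item probabilities integrated out, each class label is resampled from a conditional that depends only on counts.

```python
import random


def collapsed_gibbs_lca(X, n_classes, n_cats, n_sweeps=200,
                        alpha=1.0, beta=1.0, seed=0):
    """Sample class labels for categorical data X (list of rows of ints).

    n_cats[m] is the number of categories of variable m; X[i][m] must lie
    in range(n_cats[m]). Returns the final labels and class sizes.
    """
    rng = random.Random(seed)
    N, M = len(X), len(X[0])

    # Random initial labels and the count tables they imply.
    z = [rng.randrange(n_classes) for _ in range(N)]
    n_g = [0] * n_classes
    n_gmc = [[[0] * n_cats[m] for m in range(M)] for _ in range(n_classes)]
    for i in range(N):
        n_g[z[i]] += 1
        for m in range(M):
            n_gmc[z[i]][m][X[i][m]] += 1

    for _ in range(n_sweeps):
        for i in range(N):
            # Remove observation i from its current class.
            g_old = z[i]
            n_g[g_old] -= 1
            for m in range(M):
                n_gmc[g_old][m][X[i][m]] -= 1

            # Unnormalised conditional p(z_i = g | z_-i, X) with the
            # Dirichlet-distributed parameters integrated out.
            weights = []
            for g in range(n_classes):
                w = n_g[g] + alpha
                for m in range(M):
                    w *= (n_gmc[g][m][X[i][m]] + beta) / (n_g[g] + n_cats[m] * beta)
                weights.append(w)

            # Draw the new label and put observation i back.
            g_new = rng.choices(range(n_classes), weights=weights)[0]
            z[i] = g_new
            n_g[g_new] += 1
            for m in range(M):
                n_gmc[g_new][m][X[i][m]] += 1

    return z, n_g


if __name__ == "__main__":
    # Toy data: two obvious groups on six binary variables.
    X = [[0] * 6 for _ in range(20)] + [[1] * 6 for _ in range(20)]
    labels, sizes = collapsed_gibbs_lca(X, n_classes=2, n_cats=[2] * 6)
    print(sizes)
```

Because only the labels and count tables are stored, no class-weight or item-probability draws are needed inside the loop. The approach described in the abstract additionally moves over variable inclusion and the number of groups, which this sketch omits.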

Citations (47)
