Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
124 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

AutoGMM: Automatic and Hierarchical Gaussian Mixture Modeling in Python (1909.02688v5)

Published 6 Sep 2019 in cs.LG and stat.ML

Abstract: Background: Gaussian mixture modeling is a fundamental tool in clustering, as well as discriminant analysis and semiparametric density estimation. However, estimating the optimal model for any given number of components is an NP-hard problem, and estimating the number of components is in some respects an even harder problem. Findings: In R, a popular package called mclust addresses both of these problems. However, Python has lacked such a package. We therefore introduce AutoGMM, a Python algorithm for automatic Gaussian mixture modeling, and its hierarchical version, HGMM. AutoGMM builds upon scikit-learn's AgglomerativeClustering and GaussianMixture classes, with certain modifications to make the results more stable. Empirically, on several different applications, AutoGMM performs approximately as well as mclust, and sometimes better. Conclusions: AutoMM, a freely available Python package, enables efficient Gaussian mixture modeling by automatically selecting the initialization, number of clusters and covariance constraints.

Citations (8)

Summary

We haven't generated a summary for this paper yet.