Simultaneous Identification of Sparse Structures and Communities in Heterogeneous Graphical Models (2405.09841v1)

Published 16 May 2024 in stat.ML and cs.LG

Abstract: Exploring and detecting community structures is of significant importance in genetics, the social sciences, neuroscience, and finance. In graphical models especially, community detection can encourage the exploration of sets of variables with group-like properties. In this paper, within the framework of Gaussian graphical models, we introduce a novel decomposition of the underlying graphical structure into a sparse part and low-rank diagonal blocks (non-overlapping communities). We illustrate the significance of this decomposition through two modeling perspectives and propose a three-stage estimation procedure with a fast and efficient algorithm for identifying the sparse structure and communities. On the theoretical front, we establish conditions for local identifiability and extend the traditional irrepresentability condition to an adaptive form by constructing an effective norm, which ensures model selection consistency for the adaptive $\ell_1$ penalized estimator in the second stage. We also provide a clustering error bound for the K-means procedure in the third stage. Extensive numerical experiments demonstrate the superiority of the proposed method over existing approaches in estimating graph structures. Finally, we apply our method to stock return data, showing its ability to accurately identify non-overlapping community structures.
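The three-stage pipeline described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it substitutes a ridge-regularized inverse sample covariance for the first-stage estimator, a hard threshold for the adaptive $\ell_1$ penalized step, and a plain spectral K-means step for the third stage. All sizes, thresholds, and variable names are illustrative assumptions; in this toy example the sparse part is taken to be diagonal, so every significant off-diagonal entry belongs to the community block component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground truth: precision matrix = sparse part + block-diagonal community part.
p, k = 12, 3                           # 12 variables, 3 equal-size communities
blocks = np.repeat(np.arange(k), p // k)
L = 0.5 * (blocks[:, None] == blocks[None, :]).astype(float)  # community blocks
S = 2.0 * np.eye(p)                    # sparse part (diagonal, for simplicity)
Theta = S + L                          # true precision matrix

# Stage 1 (stand-in): estimate the precision matrix from Gaussian samples.
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta), size=5000)
Sigma_hat = np.cov(X, rowvar=False)
Theta_hat = np.linalg.inv(Sigma_hat + 1e-3 * np.eye(p))   # ridge-regularized inverse

# Stage 2 (stand-in for the adaptive l1 step): hard-threshold the off-diagonals.
# Since the sparse part here is diagonal, the surviving off-diagonal entries
# approximate the block (community) component.
tau = 0.25
off = Theta_hat - np.diag(np.diag(Theta_hat))
L_hat = np.where(np.abs(off) > tau, off, 0.0)

# Stage 3: K-means on the leading eigenvectors of the recovered block component.
w, V = np.linalg.eigh(L_hat)
U = V[:, -k:]                          # rows of U embed variables by community

# Simple Lloyd's K-means with farthest-point initialization.
centers = [U[0]]
for _ in range(k - 1):
    d = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
    centers.append(U[np.argmax(d)])
centers = np.array(centers)
for _ in range(20):
    labels = np.argmin(((U[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    for j in range(k):
        if np.any(labels == j):
            centers[j] = U[labels == j].mean(axis=0)

print(labels)   # community labels, recovered up to permutation
```

Because the top-k eigenspace of the block component is spanned by the community indicator vectors, rows of `U` belonging to the same community coincide (up to estimation noise) regardless of how the nearly-degenerate eigenvectors rotate within that subspace, which is why the K-means step recovers the partition up to a relabeling.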
