Scaling up Dynamic Edge Partition Models via Stochastic Gradient MCMC (2403.00044v1)
Abstract: The edge partition model (EPM) is a generative model for extracting an overlapping community structure from static graph-structured data. In the EPM, a gamma process (GaP) prior is adopted to infer the appropriate number of latent communities, and each vertex is endowed with a gamma-distributed positive membership vector. Despite its many attractive properties, inference in the EPM is typically performed using Markov chain Monte Carlo (MCMC) methods, which prevents the model from being applied to massive network data. In this paper, we generalize the EPM to account for dynamic environments by representing each vertex with a positive membership vector constructed using a Dirichlet prior specification, and by capturing the time-evolving behaviour of vertices via a Dirichlet Markov chain construction. A simple-to-implement Gibbs sampler is proposed to perform posterior computation using the negative-binomial augmentation technique. For large network data, we propose a stochastic gradient Markov chain Monte Carlo (SG-MCMC) algorithm for scalable inference in the proposed model. The experimental results show that the novel methods achieve competitive performance in terms of link prediction, while being much faster.
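To make the generative idea in the abstract concrete, the following is a minimal sketch, not the paper's exact specification. It assumes the Bernoulli-Poisson link commonly used with edge partition models, so that an edge between vertices i and j appears with probability 1 - exp(-sum_k r_k * phi_ik * phi_jk), and it assumes a simple Dirichlet Markov chain in which each community's membership weights at time t are drawn from a Dirichlet centred on the weights at time t-1. The function names, the concentration parameter eta, the weight_scale value, and the gamma hyperparameters are all hypothetical choices made for illustration.

```python
import math
import random


def dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent gamma draws."""
    # Floor the shape parameters: gammavariate requires alpha > 0
    # (numerical guard only, not part of any model).
    draws = [rng.gammavariate(max(a, 1e-12), 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_dynamic_epm(num_vertices=6, num_communities=3, num_steps=4,
                         eta=5.0, weight_scale=50.0, seed=0):
    """Simulate a sequence of undirected binary graphs (hypothetical setup)."""
    rng = random.Random(seed)
    n, k_total = num_vertices, num_communities

    # r[k]: positive weight of community k (assumed gamma distributed).
    r = [rng.gammavariate(1.0, weight_scale) for _ in range(k_total)]
    # phi[k][i]: membership weight of vertex i in community k;
    # each phi[k] sums to one over vertices.
    phi = [dirichlet([1.0] * n, rng) for _ in range(k_total)]

    snapshots = []
    for _ in range(num_steps):
        adj = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                rate = sum(r[k] * phi[k][i] * phi[k][j] for k in range(k_total))
                # Bernoulli-Poisson link: an edge exists iff a latent
                # Poisson(rate) count is at least one.
                p_edge = 1.0 - math.exp(-rate)
                adj[i][j] = adj[j][i] = int(rng.random() < p_edge)
        snapshots.append(adj)
        # Dirichlet Markov chain (assumed form): the next weights are drawn
        # from a Dirichlet whose mean equals the current weights.
        phi = [dirichlet([eta * n * w for w in phi_k], rng) for phi_k in phi]
    return snapshots


if __name__ == "__main__":
    for t, adj in enumerate(simulate_dynamic_epm()):
        print(f"t={t}", adj)
```

Under these assumptions, a larger eta makes consecutive snapshots more similar, because each Dirichlet draw concentrates more tightly around the previous membership weights.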