A Contrastive Variational Graph Auto-Encoder for Node Clustering (2312.16830v1)
Abstract: Variational Graph Auto-Encoders (VGAEs) have been widely used to solve the node clustering task. However, state-of-the-art methods face several challenges. First, existing VGAEs do not account for the discrepancy between the inference and generative models after incorporating the clustering inductive bias. Second, current models are prone to degenerate solutions that make the latent codes match the prior independently of the input signal (i.e., Posterior Collapse). Third, existing VGAEs overlook the effect of noisy clustering assignments (i.e., Feature Randomness) and the impact of the strong trade-off between clustering and reconstruction (i.e., Feature Drift). To address these problems, we formulate a variational lower bound in a contrastive setting. Our lower bound is a tighter approximation of the log-likelihood function than the corresponding Evidence Lower BOund (ELBO). Thanks to a newly identified term, our lower bound can escape Posterior Collapse and has more flexibility to account for the difference between the inference and generative models. Additionally, our solution provides two mechanisms to control the trade-off between Feature Randomness and Feature Drift. Extensive experiments show that the proposed method achieves state-of-the-art clustering results on several datasets. We provide strong evidence that this improvement is attributable to four aspects: integrating contrastive learning and alleviating Feature Randomness, Feature Drift, and Posterior Collapse.
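To ground the objective the abstract builds on: a standard VGAE maximizes an ELBO consisting of a reconstruction term for the adjacency matrix (via an inner-product decoder) minus a KL divergence between the approximate posterior and a standard normal prior. The sketch below is a minimal, dependency-free illustration of that baseline negative ELBO, not the paper's contrastive lower bound; all function names and the tiny-graph setup are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kl_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))

def inner_product_decoder(z_i, z_j):
    # p(A_ij = 1 | z_i, z_j) = sigmoid(z_i . z_j), as in Kipf & Welling's VGAE
    return sigmoid(sum(a * b for a, b in zip(z_i, z_j)))

def neg_elbo(adj, mus, logvars, zs):
    """Negative ELBO: binary cross-entropy reconstruction of the
    adjacency matrix plus KL of each node's posterior to the prior."""
    n = len(adj)
    recon = 0.0
    for i in range(n):
        for j in range(n):
            p = inner_product_decoder(zs[i], zs[j])
            p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp for numerical stability
            recon -= adj[i][j] * math.log(p) + (1 - adj[i][j]) * math.log(1.0 - p)
    kl = sum(kl_standard_normal(mu, lv) for mu, lv in zip(mus, logvars))
    return recon + kl

# Toy usage: a 2-node graph with one edge and all-zero latent samples.
adj = [[0, 1], [1, 0]]
mus, logvars, zs = [[0.0], [0.0]], [[0.0], [0.0]], [[0.0], [0.0]]
loss = neg_elbo(adj, mus, logvars, zs)
```

When the posterior exactly matches the prior for every node regardless of the input, the KL term vanishes; that is the Posterior Collapse regime the abstract refers to, in which the latent codes carry no information about the graph.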