
Encoder Embedding for General Graph and Node Classification (2405.15473v2)

Published 24 May 2024 in stat.ML, cs.LG, and cs.SI

Abstract: Graph encoder embedding, a recent technique for graph data, offers speed and scalability in producing vertex-level representations from binary graphs. In this paper, we extend the applicability of this method to a general graph model, which includes weighted graphs, distance matrices, and kernel matrices. We prove that the encoder embedding satisfies the law of large numbers and the central limit theorem on a per-observation basis. Under certain conditions, it achieves asymptotic normality on a per-class basis, enabling optimal classification through discriminant analysis. These theoretical findings are validated through a series of experiments involving weighted graphs, as well as text and image data transformed into general graph representations using appropriate distance metrics.
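
The abstract does not spell out the embedding itself, so the following is only a minimal sketch of how a label-based encoder embedding is commonly formulated: each vertex is represented by its average edge weight to the vertices of every class, Z = A W, where W is a column-normalized one-hot label matrix. The function name `encoder_embedding`, the toy two-block weighted graph, and the exponential edge weights are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def encoder_embedding(A, y, K):
    """Hypothetical helper: return Z = A @ W for an n x n matrix A.

    W[i, k] = 1 / n_k if vertex i has label k, else 0, where n_k is the
    number of vertices with label k. A may be a weighted adjacency,
    distance, or kernel matrix. Every class is assumed to be non-empty.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y)
    n = A.shape[0]
    counts = np.bincount(y, minlength=K).astype(float)
    W = np.zeros((n, K))
    W[np.arange(n), y] = 1.0 / counts[y]
    return A @ W


# Toy weighted graph (illustrative only): two classes of 50 vertices,
# with larger expected edge weights within a class than between classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
mean_weight = np.array([[2.0, 0.5], [0.5, 2.0]])
A = rng.exponential(mean_weight[y][:, y])
A = np.triu(A, 1)
A = A + A.T  # symmetric, zero diagonal

Z = encoder_embedding(A, y, K=2)
class_means = np.vstack([Z[y == k].mean(axis=0) for k in range(2)])
print(Z.shape)
print(class_means)
```

In this sketch each row of `Z` is a K-dimensional vertex representation, and the per-class means are well separated, which is the kind of structure a discriminant analysis classifier could then exploit.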

