
Representation Learning for Frequent Subgraph Mining (2402.14367v1)

Published 22 Feb 2024 in cs.LG and cs.SI

Abstract: Identifying frequent subgraphs, also called network motifs, is crucial for analyzing and predicting properties of real-world networks. However, finding large commonly-occurring motifs remains challenging, not only because of its NP-hard subgraph-counting subroutine but also because of the exponential growth in the number of possible subgraph patterns. Here we present Subgraph Pattern Miner (SPMiner), a novel neural approach for approximately finding frequent subgraphs in a large target graph. SPMiner combines graph neural networks, an order embedding space, and an efficient search strategy to identify the subgraph patterns that appear most frequently in the target graph. SPMiner first decomposes the target graph into many overlapping subgraphs and encodes each subgraph into the order embedding space. It then performs a monotonic walk in the order embedding space to identify frequent motifs. Compared to existing approaches and possible neural alternatives, SPMiner is more accurate, faster, and more scalable. For 5- and 6-node motifs, we show that SPMiner can almost perfectly identify the most frequent motifs while being 100x faster than exact enumeration methods. In addition, SPMiner can reliably identify frequent 10-node motifs, which is well beyond the size limit of exact enumeration approaches. Finally, we show that SPMiner can find large motifs of up to 20 nodes whose frequency is 10-100x higher than that of motifs found by current approximate methods.
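The key idea in the abstract is that an order embedding space turns the subgraph relation into coordinate-wise dominance: if pattern P is a subgraph of neighborhood T, then emb(P) ≤ emb(T) element-wise. A pattern's frequency can then be estimated by counting how many sampled neighborhood embeddings dominate the pattern's embedding, and growing the pattern corresponds to a monotonic walk "upward" in the embedding space. The sketch below illustrates this geometry with hand-picked 2-D vectors standing in for GNN outputs; the function names, the squared-hinge violation penalty, and the greedy step are illustrative assumptions, not the authors' implementation.

```python
def order_violation(e_pattern, e_target):
    # Order-embedding penalty: zero iff e_pattern <= e_target coordinate-wise.
    # The embedding is trained so that this dominance mirrors the subgraph
    # relation: P subgraph of T  <=>  emb(P) <= emb(T).
    return sum(max(0.0, p - t) ** 2 for p, t in zip(e_pattern, e_target))

def estimate_frequency(e_pattern, target_embeddings, tol=1e-6):
    # Frequency proxy: how many sampled target neighborhoods dominate the pattern.
    return sum(order_violation(e_pattern, e_t) <= tol for e_t in target_embeddings)

def greedy_monotonic_step(e_current, candidate_embeddings, target_embeddings):
    # One step of a monotonic walk: growing a pattern can only move its
    # embedding "up", so restrict to candidates that dominate the current
    # embedding and pick the one keeping the estimated frequency highest.
    feasible = [c for c in candidate_embeddings
                if order_violation(e_current, c) <= 1e-6]
    if not feasible:
        return None
    return max(feasible, key=lambda c: estimate_frequency(c, target_embeddings))

# Toy demo: three sampled-neighborhood embeddings and one pattern embedding.
targets = [[2.0, 2.0], [0.5, 2.0], [1.0, 1.0]]
pattern = [1.0, 1.0]
print(estimate_frequency(pattern, targets))  # 2: dominated by [2,2] and [1,1]
```

In the real method the embeddings come from a GNN trained with an order-embedding objective, and the walk adds one node to the pattern per step; the dominance constraint is what makes the search monotonic.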
