
Graph Pre-Training Models Are Strong Anomaly Detectors (2410.18487v1)

Published 24 Oct 2024 in cs.LG

Abstract: Graph Anomaly Detection (GAD) is a challenging and practical research topic where Graph Neural Networks (GNNs) have recently shown promising results. The effectiveness of existing GNNs in GAD has been mainly attributed to the simultaneous learning of node representations and the classifier in an end-to-end manner. Meanwhile, graph pre-training, the two-stage learning paradigm such as DGI and GraphMAE, has shown potential in leveraging unlabeled graph data to enhance downstream tasks, yet its impact on GAD remains under-explored. In this work, we show that graph pre-training models are strong graph anomaly detectors. Specifically, we demonstrate that pre-training is highly competitive, markedly outperforming the state-of-the-art end-to-end training models when faced with limited supervision. To understand this phenomenon, we further uncover pre-training enhances the detection of distant, under-represented, unlabeled anomalies that go beyond 2-hop neighborhoods of known anomalies, shedding light on its superior performance against end-to-end models. Moreover, we extend our examination to the potential of pre-training in graph-level anomaly detection. We envision this work to stimulate a re-evaluation of pre-training's role in GAD and offer valuable insights for future research.


Summary

  • The paper demonstrates that graph pre-training models significantly outperform end-to-end approaches in detecting anomalies, especially under limited supervision.
  • It uses standard GNN backbones such as GCN and GIN, showing that pre-training mechanisms like negative sampling help surface distant, under-represented anomalies in sparse graphs.
  • The findings highlight the potential of pre-training to enhance real-world applications like fraud detection, cybersecurity, and social network analysis.

Insights into "Graph Pre-Training Models Are Strong Anomaly Detectors"

The paper "Graph Pre-Training Models Are Strong Anomaly Detectors" investigates the role of graph pre-training in enhancing Graph Anomaly Detection (GAD) tasks. The authors explore the potency of graph pre-training as a robust alternative to traditional end-to-end approaches for detecting anomalies in graph-structured data. Through an extensive evaluation, the paper uncovers significant insights into when and why pre-training models excel in this domain.

Context and Motivation

Graph Anomaly Detection matters because of its applicability in domains such as fraud detection, cybersecurity, and social network analysis. Traditional methods rely on manually crafted features and statistical models; they are labor-intensive to build and generalize poorly to unseen anomalies. Recent work leverages Graph Neural Networks (GNNs) to learn representational patterns automatically, yet the potential of graph pre-training, a two-stage learning paradigm, remains under-explored in GAD.

Methodology and Key Findings

The paper proposes a systematic investigation into the effectiveness of graph pre-training for GAD by addressing two fundamental questions:

  1. When do graph pre-training models outperform other models in GAD tasks?
  2. Why are these models effective?

The authors employ standard GNN architectures (e.g., GCN and GIN) as backbones and evaluate both semi-supervised and fully-supervised settings. The results indicate that under limited supervision, pre-training models markedly outperform state-of-the-art end-to-end training models. Notably, pre-training performs especially well on lower-density graphs, suggesting a correlation between graph sparsity and pre-training effectiveness.
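
To make the two-stage paradigm concrete, the sketch below pre-trains a GCN encoder with the DGI objective using PyTorch Geometric. This is an illustrative reconstruction under stated assumptions (one GCN layer, hidden size 64, DGI's standard feature-shuffling corruption), not the authors' exact configuration; `data` is a placeholder for a PyG `Data` object holding the target graph.

```python
import torch
from torch_geometric.nn import GCNConv, DeepGraphInfomax

class Encoder(torch.nn.Module):
    """One-layer GCN encoder; a GIN backbone would be a drop-in replacement."""
    def __init__(self, in_channels, hidden_channels):
        super().__init__()
        self.conv = GCNConv(in_channels, hidden_channels)
        self.act = torch.nn.PReLU(hidden_channels)

    def forward(self, x, edge_index):
        return self.act(self.conv(x, edge_index))

def corruption(x, edge_index):
    # Row-shuffle node features: DGI's "negative" graph for contrastive learning
    return x[torch.randperm(x.size(0))], edge_index

# `data` is assumed to be a torch_geometric.data.Data object (placeholder).
model = DeepGraphInfomax(
    hidden_channels=64,
    encoder=Encoder(data.num_features, 64),
    summary=lambda z, *args, **kwargs: torch.sigmoid(z.mean(dim=0)),
    corruption=corruption,
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stage 1: self-supervised pre-training; no anomaly labels are used.
for epoch in range(200):
    optimizer.zero_grad()
    pos_z, neg_z, summary = model(data.x, data.edge_index)
    loss = model.loss(pos_z, neg_z, summary)
    loss.backward()
    optimizer.step()

# Stage 2 input: frozen node embeddings for a downstream anomaly classifier.
z = model.encoder(data.x, data.edge_index).detach()
```

The frozen embeddings `z` then feed whatever classifier the available labels can support; a minimal example appears under "Theoretical and Practical Implications" below.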

The research finds that pre-training excels particularly because it enhances the detection of distant, under-represented anomalies that lie beyond the 2-hop neighborhoods of known anomalies. This capability is attributed to specific mechanisms within pre-training, such as negative sampling, which generates 'pseudo anomalies' during self-supervised training.
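
The "beyond 2-hop" notion can be made operational with a simple reachability check. The utility below is an illustrative sketch, not code from the paper: given a SciPy sparse adjacency matrix, it flags the nodes farther than k hops from every labeled anomaly.

```python
import numpy as np
import scipy.sparse as sp

def beyond_k_hops(adj: sp.spmatrix, labeled: np.ndarray, k: int = 2) -> np.ndarray:
    """Boolean mask of nodes farther than k hops from all labeled anomalies.

    adj: (n, n) sparse adjacency matrix; labeled: indices of known anomalies.
    """
    n = adj.shape[0]
    frontier = np.zeros(n, dtype=np.float32)
    frontier[labeled] = 1.0
    reached = frontier > 0
    for _ in range(k):
        frontier = adj @ frontier       # expand reachability by one hop
        reached |= frontier > 0
    return ~reached

# Hypothetical usage: how many unlabeled anomalies sit outside all 2-hop
# neighborhoods of the labeled ones?
# distant = beyond_k_hops(adj, labeled_ids)[unlabeled_anomaly_ids].sum()
```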

Theoretical and Practical Implications

The findings suggest a reevaluation of pre-training's role in GAD. Theoretically, the paper highlights the ability of pre-training to extend the reach of label information in sparse graphs, addressing end-to-end methods' reliance on label propagation within local neighborhoods. Practically, it suggests that pre-training models can improve detection in environments with few labeled anomalies, a common scenario in real-world applications.
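
As a minimal illustration of this label-scarce setting, the fragment below fits a logistic-regression head on the frozen embeddings `z` from the earlier sketch (converted to NumPy), using only a handful of labeled nodes; `y`, `train_idx`, and `test_idx` are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

z_np = z.cpu().numpy()                       # frozen embeddings, (n_nodes, d)
# y: binary anomaly labels; train_idx holds the few labeled nodes (placeholders).
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(z_np[train_idx], y[train_idx])
scores = clf.predict_proba(z_np[test_idx])[:, 1]   # per-node anomaly scores
print("AUROC:", roc_auc_score(y[test_idx], scores))
```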

Future Directions

Looking ahead, this paper provides a foundation for further research into pre-training paradigms. Future work could explore hybrid models that integrate pre-training with advanced classifiers, delve into more complex graph-level anomaly detection tasks, or enhance pre-training models for specific domains. Additionally, there is room for exploring different self-supervised objectives that could further refine the distinct mechanisms offered by pre-training, contributing to even more robust anomaly detection systems.

In summary, the paper offers a compelling case for incorporating graph pre-training models in GAD, emphasizing their strengths in specific scenarios and paving the way for subsequent advancements in this area.
