
Truncated Affinity Maximization: One-class Homophily Modeling for Graph Anomaly Detection (2306.00006v5)

Published 29 May 2023 in cs.SI, cs.AI, and cs.LG

Abstract: We reveal a one-class homophily phenomenon, which is one prevalent property we find empirically in real-world graph anomaly detection (GAD) datasets, i.e., normal nodes tend to have strong connection/affinity with each other, while the homophily in abnormal nodes is significantly weaker than normal nodes. However, this anomaly-discriminative property is ignored by existing GAD methods that are typically built using a conventional anomaly detection objective, such as data reconstruction. In this work, we explore this property to introduce a novel unsupervised anomaly scoring measure for GAD, local node affinity, that assigns a larger anomaly score to nodes that are less affiliated with their neighbors, with the affinity defined as similarity on node attributes/representations. We further propose Truncated Affinity Maximization (TAM) that learns tailored node representations for our anomaly measure by maximizing the local affinity of nodes to their neighbors. Optimizing on the original graph structure can be biased by nonhomophily edges (i.e., edges connecting normal and abnormal nodes). Thus, TAM is instead optimized on truncated graphs where non-homophily edges are removed iteratively to mitigate this bias. The learned representations result in significantly stronger local affinity for normal nodes than abnormal nodes. Extensive empirical results on 10 real-world GAD datasets show that TAM substantially outperforms seven competing models, achieving over 10% increase in AUROC/AUPRC compared to the best contenders on challenging datasets. Our code is available at https://github.com/mala-lab/TAM-master/.


Summary

  • The paper introduces Truncated Affinity Maximization (TAM), a method that leverages one-class homophily to enhance unsupervised graph anomaly detection.
  • It combines Local Affinity Maximization Networks with Normal Structure-preserved Graph Truncation to refine node representations by removing non-homophily edges.
  • TAM achieves over a 10% improvement in AUROC/AUPRC on challenging datasets, with practical relevance to fraud detection and cybersecurity.

Overview of Truncated Affinity Maximization for Graph Anomaly Detection

Graph anomaly detection (GAD) is a critical task in many real-world applications, ranging from fraud detection to identifying suspicious network activity. Despite substantial progress in GAD, existing graph neural network (GNN)-based approaches often overlook intrinsic anomaly-discriminative properties, such as one-class homophily. This paper introduces a novel method, Truncated Affinity Maximization (TAM), that leverages this property for improved unsupervised anomaly detection in graphs.

One-Class Homophily

The authors reveal one-class homophily as an empirically prevalent property in real-world GAD datasets: normal nodes typically exhibit stronger connections and affinities with each other compared to abnormal nodes. This foundational observation informs the development of a new unsupervised anomaly measure termed "local node affinity," which assigns higher anomaly scores to nodes that demonstrate weaker connectivity with their neighbors. This affinity is defined as the similarity of node attributes or representations, effectively enabling a new perspective for anomaly detection beyond conventional objectives like data reconstruction.
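As a rough illustration (not the paper's implementation), local node affinity can be computed as the average cosine similarity between a node's attributes (or learned representations) and those of its neighbors, with the anomaly score taken as the negated affinity. The `features` and `adj` inputs below are hypothetical:

```python
import numpy as np

def local_affinity_scores(features: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """Score each node by (negated) average cosine similarity to its neighbors.

    features: (n, d) node attribute/representation matrix (hypothetical input)
    adj:      (n, n) symmetric 0/1 adjacency matrix
    Returns an (n,) array where larger values indicate more anomalous nodes.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)   # row-normalize for cosine
    sim = unit @ unit.T                             # pairwise cosine similarities
    deg = adj.sum(axis=1)
    affinity = (sim * adj).sum(axis=1) / np.clip(deg, 1, None)
    return -affinity                                # low affinity -> high anomaly score
```

On a toy graph where two similar nodes share an edge and a third, dissimilar node is connected to both, the third node receives the highest anomaly score.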

Truncated Affinity Maximization

The core innovation of TAM lies in its approach to learning node representations that inherently exploit local node affinity. The method is composed of two primary components: Local Affinity Maximization Networks (LAMNet) and Normal Structure-preserved Graph Truncation (NSGT). LAMNet aims to extract node representations by maximizing affinity towards neighbors, whereas NSGT iteratively removes non-homophily edges that connect dissimilar nodes, thereby mitigating biases in the optimization process.
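A minimal, simplified sketch of the truncation idea follows. The paper's NSGT is a normal-structure-preserving scheme; here, for illustration only, edges are dropped deterministically by lowest attribute similarity, and `drop_frac` and `n_iters` are invented parameters:

```python
import numpy as np

def truncate_graph(features: np.ndarray, adj: np.ndarray,
                   drop_frac: float = 0.1, n_iters: int = 2) -> np.ndarray:
    """Iteratively remove the least-similar (likely non-homophily) edges."""
    adj = adj.astype(float).copy()
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                      # pairwise cosine similarity
    for _ in range(n_iters):
        i, j = np.triu_indices_from(adj, k=1)
        present = adj[i, j] > 0              # edges still in the graph
        ei, ej = i[present], j[present]
        if ei.size == 0:
            break
        k = max(1, int(drop_frac * ei.size))
        worst = np.argsort(sim[ei, ej])[:k]  # k lowest-similarity edges
        adj[ei[worst], ej[worst]] = 0
        adj[ej[worst], ei[worst]] = 0        # keep adjacency symmetric
    return adj
```

On the toy graph from before, repeated truncation removes the two edges incident to the dissimilar node while preserving the edge between the two similar (presumed normal) nodes.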

Training LAMNet on the truncated graph structure refines the representations by focusing on regions of the graph with stronger homophily, yielding a more faithful estimate of node affinities. Extensive evaluations show that TAM significantly outperforms seven state-of-the-art models on ten real-world GAD datasets, with improvements of over 10% in AUROC/AUPRC on the most challenging datasets.

Implications and Future Directions

This research presents a significant contribution to the literature on graph-based anomaly detection, introducing a method that effectively exploits a previously overlooked anomaly-discriminative property: one-class homophily. Practically, TAM can be applied across domains where graph data is prevalent and detecting anomalous nodes is critical, such as social networks, financial fraud detection, and cybersecurity.

Theoretically, future work could enhance TAM by integrating heterophily-aware mechanisms, addressing the potential performance drop on graphs with strong heterophily among normal nodes. The approach's reliance on affinity maximization could also serve as a versatile framework for anomaly detection tasks beyond graphs.

In terms of scalability, while TAM demonstrates competitive performance on large-scale datasets, future adaptations could improve computational efficiency on extremely large graphs, for instance through distributed or parallel computing. Exploring alternative GNN backbones within the TAM framework, such as graph attention networks (GATs), could also reveal how the choice of architecture affects affinity modeling.

In conclusion, this paper takes an important step forward in graph anomaly detection by rethinking how anomaly scores are defined, using attribute-based affinity maximization as a robust detection measure. It encourages researchers to engage more deeply with node-level relational dynamics, setting a precedent for future innovations in unsupervised anomaly detection.
