SubAnom: Efficient Subgraph Anomaly Detection Framework over Dynamic Graphs (2312.10504v1)
Abstract: Given a dynamic graph, efficiently tracking anomalous subgraphs via their node embeddings is a significant challenge. Addressing it requires both an effective scoring mechanism and a strategy for identifying candidate anomalous subgraphs. Existing methods predominantly focus on designing scoring strategies, or use graph structures that consider nodes in isolation, and therefore fail to capture the structural information of anomalous subgraphs. In this paper, we introduce SUBANOM, a novel framework for subgraph anomaly detection. SUBANOM has three key components: 1) we adopt current state-of-the-art dynamic embedding methods to compute node embeddings efficiently, thereby capturing node-level anomalies; 2) we devise novel subgraph identification strategies, including k-hop and triadic-closure, which differentiate between strong and weak neighbors and thus capture the structural information of an anomaly; 3) to quantify how anomalous a subgraph is, we propose Lp-norm-based score aggregation functions. These steps, applied iteratively, enable us to process large-scale dynamic graphs effectively. Experiments on a real-world dynamic graph show that our framework outperforms state-of-the-art methods in detecting anomalous subgraphs. For instance, the F1 score under the best subgraph identification strategy peaks at 0.6679, while the highest score achieved by the corresponding baseline method is 0.5677.
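To make the second and third components concrete, here is a minimal Python sketch of k-hop subgraph identification, a triadic-closure neighbor filter, and Lp-norm score aggregation. It is an illustration under stated assumptions, not the authors' implementation: the function names (`k_hop_nodes`, `triadic_neighbors`, `lp_score`), the adjacency-list representation, and the toy scores are all hypothetical. In particular, the abstract does not define "strong" neighbors, so the sketch assumes a neighbor is strong when it shares at least one other neighbor with the seed (that is, it closes a triangle). Per-node anomaly scores are assumed to be given, for example derived from how much a node's embedding changes between snapshots.

```python
from collections import deque


def k_hop_nodes(adj, seed, k):
    """Return the set of nodes within k hops of `seed` (breadth-first search)."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def triadic_neighbors(adj, seed):
    """Return `seed` plus its "strong" neighbors.

    Assumption: a neighbor v is strong if it is also adjacent to another
    neighbor of `seed`, so that seed, v, and that neighbor form a triangle.
    """
    nbrs = set(adj.get(seed, ()))
    strong = {v for v in nbrs if nbrs & set(adj.get(v, ()))}
    return {seed} | strong


def lp_score(node_scores, nodes, p=2.0):
    """Aggregate per-node anomaly scores over `nodes` with an Lp norm."""
    return sum(abs(node_scores[v]) ** p for v in nodes) ** (1.0 / p)


# Toy undirected graph: triangle 0-1-2, plus a path 0-3-4.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
# Hypothetical per-node anomaly scores.
scores = {0: 3.0, 1: 4.0, 2: 0.0, 3: 1.0, 4: 0.5}

one_hop = k_hop_nodes(adj, seed=0, k=1)   # includes the weak neighbor 3
strong = triadic_neighbors(adj, seed=0)   # keeps only the triangle 0-1-2
print(sorted(one_hop), sorted(strong))
print(lp_score(scores, strong, p=1.0), lp_score(scores, strong, p=2.0))
```

In this toy graph the 1-hop strategy pulls in node 3, which is attached to the seed by a single edge, while the triadic-closure filter keeps only the tightly connected triangle. The choice of `p` controls how strongly the aggregate is dominated by the highest-scoring nodes: `p=1` sums the scores, and larger values approach the maximum.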