The Shape of Money Laundering: Subgraph Representation Learning on the Blockchain with the Elliptic2 Dataset (2404.19109v3)
Abstract: Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks. Enabled by recent developments in scalable Graph Neural Networks (GNNs), this approach encodes relational information at a subgroup level (multiple connected nodes) rather than at a node level of abstraction. We posit that certain domain applications, such as anti-money laundering (AML), are inherently subgraph problems and that mainstream graph techniques have been operating at a suboptimal level of abstraction. This is due in part to the scarcity of annotated datasets of real-world size and complexity, as well as the lack of software tools for managing subgraph GNN workflows at scale. To enable work in fundamental algorithms as well as domain applications in AML and beyond, we introduce Elliptic2, a large graph dataset containing 122K labeled subgraphs of Bitcoin clusters within a background graph consisting of 49M node clusters and 196M edge transactions. The dataset provides subgraphs known to be linked to illicit activity for learning the set of "shapes" that money laundering exhibits in cryptocurrency and accurately classifying new criminal activity. Along with the dataset, we share our graph techniques, software tooling, promising early experimental results, and new domain insights already gleaned from this approach. Taken together, we find immediate practical value in this approach and the potential for a new standard in anti-money laundering and forensic analytics in cryptocurrencies and other financial networks.
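To make the distinction between node-level and subgraph-level representations concrete, the following is a minimal sketch on a toy graph. It is not the method or tooling released with Elliptic2: the function names, the toy features, and the use of plain neighbor averaging in place of a trained GNN layer are all assumptions made for illustration.

```python
from collections import defaultdict


def mean_aggregate(features, edges):
    """One round of neighbor averaging (self included), treating edges as undirected.

    Stands in for a learned GNN layer; there are no trainable weights here.
    """
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    out = {}
    for node, feat in features.items():
        group = [node] + sorted(neighbors[node])
        out[node] = [
            sum(features[n][i] for n in group) / len(group)
            for i in range(len(feat))
        ]
    return out


def pool_subgraph(embeddings, nodes):
    """Mean-pool node embeddings into a single vector for one subgraph."""
    dim = len(embeddings[nodes[0]])
    return [sum(embeddings[n][i] for n in nodes) / len(nodes) for i in range(dim)]


# Hypothetical toy background graph: 4 nodes on a path, 2 features each.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}
edges = [(0, 1), (1, 2), (2, 3)]

# Hypothetical labeled subgraphs: each is a set of connected nodes with one label.
subgraphs = {"a": [0, 1], "b": [2, 3]}
labels = {"a": 1, "b": 0}

# Node-level view: one vector per node.
node_emb = mean_aggregate(features, edges)

# Subgraph-level view: one vector per labeled subgraph, which is what a
# subgraph classifier would consume alongside `labels`.
subgraph_emb = {name: pool_subgraph(node_emb, nodes) for name, nodes in subgraphs.items()}
```

Because neighbor averaging reads from the background graph, each subgraph vector also reflects nodes just outside the subgraph. That is the sense in which the labeled subgraphs sit "within" a larger background graph rather than being isolated examples.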