Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences
Abstract: Machine learning models underpin many modern financial systems for use cases such as fraud detection and churn prediction. Most are based on supervised learning with hand-engineered features, which relies heavily on the availability of labelled data. Large self-supervised generative models have shown tremendous success in natural language processing and computer vision, but they have not yet been adapted to multivariate time series of financial transactions. In this paper, we present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions. Benchmarks on public datasets demonstrate that it outperforms state-of-the-art self-supervised methods on a range of downstream tasks. We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions and apply it to the card fraud detection problem on hold-out datasets. The embedding model significantly improves value detection rate at high precision thresholds and transfers well to out-of-domain distributions.
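To make the setup concrete, here is a minimal sketch of what generative autoregressive pretraining on transaction sequences could look like, assuming each transaction is quantised into discrete field tokens (e.g. amount bucket, merchant category code, hour of day). The `TransactionGPT` module, vocabulary size, and all hyperparameters below are illustrative assumptions, not the paper's architecture:

```python
# Illustrative sketch (not the paper's model): a causal transformer trained
# to predict the next transaction token, whose hidden states serve as
# contextualised transaction embeddings.
import torch
import torch.nn as nn

class TransactionGPT(nn.Module):
    def __init__(self, vocab_size=1024, d_model=128, n_heads=4, n_layers=2, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)   # quantised transaction fields
        self.pos = nn.Embedding(max_len, d_model)      # position within the sequence
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)     # next-token prediction head

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids of quantised transaction fields
        b, t = tokens.shape
        x = self.tok(tokens) + self.pos(torch.arange(t, device=tokens.device))
        # Causal mask: position i may only attend to positions <= i.
        mask = torch.triu(torch.full((t, t), float("-inf"), device=tokens.device), diagonal=1)
        h = self.encoder(x, mask=mask)
        return self.head(h), h                         # logits and embeddings

model = TransactionGPT()
tokens = torch.randint(0, 1024, (8, 64))               # toy batch of token ids
logits, embeddings = model(tokens)
# Autoregressive objective: predict token t+1 from the prefix up to t.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
loss.backward()
```

After pretraining, a state such as `embeddings[:, -1]` could be taken as the contextualised embedding of the most recent transaction and fed to a lightweight downstream classifier, e.g. for card fraud detection, in the spirit of the evaluation the abstract describes.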