
Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences

Published 3 January 2024 in cs.LG (arXiv:2401.01641v2)

Abstract: Machine learning models underpin many modern financial systems for use cases such as fraud detection and churn prediction. Most are based on supervised learning with hand-engineered features, which relies heavily on the availability of labelled data. Large self-supervised generative models have shown tremendous success in natural language processing and computer vision, yet they have not yet been adapted to multivariate time series of financial transactions. In this paper, we present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions. Benchmarks on public datasets demonstrate that it outperforms state-of-the-art self-supervised methods on a range of downstream tasks. We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions and apply it to the card fraud detection problem on hold-out datasets. The embedding model significantly improves value detection rate at high precision thresholds and transfers well to out-of-domain distributions.
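
The abstract frames pretraining as generative autoregression over multivariate transaction sequences. Below is a minimal sketch of that idea in PyTorch: a small causal transformer embeds each transaction by summing per-field embeddings, and each position is trained to predict the fields of the following transaction, so its hidden states serve as contextualised embeddings for downstream tasks such as fraud detection. The field names, vocabulary sizes, and model dimensions here are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of generative autoregressive pretraining on transaction
# sequences. Field names, vocabulary sizes, and model dimensions are
# illustrative assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical categorical fields of a card transaction (e.g. merchant
# category code, log-amount bucket, country), each with its own vocabulary.
FIELDS = {"mcc": 800, "amount_bucket": 64, "country": 250}

class TransactionGPT(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2, max_len=256):
        super().__init__()
        # A transaction is embedded as the sum of its field embeddings
        # plus a learned position embedding.
        self.field_emb = nn.ModuleDict(
            {f: nn.Embedding(v, d_model) for f, v in FIELDS.items()}
        )
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # One head per field: position t predicts the fields of event t+1.
        self.heads = nn.ModuleDict(
            {f: nn.Linear(d_model, v) for f, v in FIELDS.items()}
        )

    def forward(self, batch):
        # batch: dict mapping each field name to a (B, T) LongTensor.
        T = next(iter(batch.values())).shape[1]
        x = sum(self.field_emb[f](batch[f]) for f in FIELDS)
        x = x + self.pos_emb(torch.arange(T, device=x.device))
        # Causal mask so each position only attends to earlier transactions.
        causal = torch.triu(
            torch.full((T, T), float("-inf"), device=x.device), diagonal=1
        )
        h = self.encoder(x, mask=causal)  # (B, T, d_model) contextualised embeddings
        return h, {f: self.heads[f](h) for f in FIELDS}

def pretrain_loss(logits, batch):
    """Next-event prediction: sum of per-field cross-entropies."""
    loss = 0.0
    for f, vocab in FIELDS.items():
        loss = loss + F.cross_entropy(
            logits[f][:, :-1].reshape(-1, vocab),  # predictions for events 1..T-1
            batch[f][:, 1:].reshape(-1),           # the actual next events
        )
    return loss

# Toy usage: random sequences of 32 transactions for a batch of 8 cards.
model = TransactionGPT()
batch = {f: torch.randint(0, v, (8, 32)) for f, v in FIELDS.items()}
h, logits = model(batch)
loss = pretrain_loss(logits, batch)
loss.backward()
```

For a downstream fraud detector, one would typically take the embedding of the most recent transaction (h[:, -1] above), possibly alongside hand-engineered features, and feed it to a lightweight supervised classifier; the abstract reports that doing this at scale improves value detection rate at high precision thresholds and transfers well out of domain.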
