
Synthetic Demographic Data Generation for Card Fraud Detection Using GANs (2306.17109v1)

Published 29 Jun 2023 in cs.LG and cs.AI

Abstract: Using machine learning models to generate synthetic data has become common in many fields. Technology for generating synthetic transactions that can be used to detect fraud is also growing fast. Generally, this synthetic data contains only information about the transaction, such as the time, place, and amount of money. It does not usually contain the individual user's characteristics (age and gender are occasionally included). Using relatively complex synthetic demographic data may enrich the transaction data features and thereby improve fraud detection performance. Thanks to advances in machine learning, some deep learning models have the potential to outperform well-established synthetic data generation methods such as microsimulation. In this study, we built a deep-learning Generative Adversarial Network (GAN), called DGGAN, for demographic data generation. Our model generates samples during model training, which we found important for overcoming class imbalance issues. This study can help improve the understanding of synthetic data and further explore the application of synthetic data generation in card fraud detection.

Authors (5)
  1. Shuo Wang (382 papers)
  2. Terrence Tricco (3 papers)
  3. Xianta Jiang (10 papers)
  4. Charles Robertson (1 paper)
  5. John Hawkin (3 papers)