Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection (2401.10765v2)
Abstract: Federated Learning (FL) is a data-minimization approach that enables collaborative model training across diverse clients holding local data, without direct data exchange. However, state-of-the-art FL solutions for identifying fraudulent financial transactions exhibit one or more of the following limitations. They (1) lack a formal security definition and proof, (2) assume that financial institutions have already frozen suspicious customers' accounts (limiting the solutions' adoption), (3) scale poorly, involving either $O(n^2)$ computationally expensive modular exponentiations (where $n$ is the total number of financial institutions) or highly inefficient fully homomorphic encryption, (4) assume the parties have already completed the identity alignment phase, hence excluding it from the implementation, performance evaluation, and security analysis, and (5) struggle to tolerate client dropouts. This work introduces Starlit, a novel scalable privacy-preserving FL mechanism that overcomes these limitations. It has various applications, such as enhancing financial fraud detection, mitigating terrorism, and enhancing digital health. We implemented Starlit and conducted a thorough performance analysis using synthetic data from a key player in global financial transactions. The evaluation indicates Starlit's scalability, efficiency, and accuracy.
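To make limitation (3) concrete, the following is a minimal sketch of how a quadratic number of modular exponentiations can arise. The abstract does not say where the quadratic cost in prior work comes from; the sketch assumes it stems from pairwise Diffie-Hellman-style key agreement between all parties, as used in pairwise-masking secure aggregation. The modulus `P`, the base `G`, and the function `pairwise_shared_keys` are hypothetical choices for illustration, and none of this describes Starlit's own protocol.

```python
# Toy parameters for illustration only (assumptions, not from the paper);
# real deployments use large, standardized groups.
P = 2**127 - 1  # a Mersenne prime used as a toy modulus
G = 3           # hypothetical base


def pairwise_shared_keys(secrets):
    """Derive one shared key per ordered pair of parties and
    count the modular exponentiations performed in total."""
    # Each party publishes G^secret mod P: n exponentiations.
    publics = [pow(G, s, P) for s in secrets]
    count = len(secrets)
    keys = {}
    for i, s_i in enumerate(secrets):
        for j, pub_j in enumerate(publics):
            if i != j:
                # Party i raises party j's public value to its own secret:
                # one exponentiation per ordered pair, n * (n - 1) in total.
                keys[(i, j)] = pow(pub_j, s_i, P)
                count += 1
    return keys, count


if __name__ == "__main__":
    keys, count = pairwise_shared_keys([5, 11, 17, 23])
    # Both parties in a pair arrive at the same shared key.
    assert keys[(0, 1)] == keys[(1, 0)]
    print(count)  # n + n * (n - 1) = n^2 exponentiations for n = 4
```

Under this assumption the total is $n + n(n-1) = n^2$ exponentiations, so increasing the number of institutions tenfold increases the exponentiation count a hundredfold.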