
The Effects of Data Imbalance Under a Federated Learning Approach for Credit Risk Forecasting (2401.07234v1)

Published 14 Jan 2024 in cs.LG and cs.AI

Abstract: Credit risk forecasting plays a crucial role for commercial banks and other financial institutions in granting loans to customers and minimising potential losses. However, traditional machine learning methods require sensitive client information to be shared with an external server to build a global model, potentially exposing it to security threats and privacy leakage. Federated Learning (FL), a recently developed privacy-preserving distributed machine learning technique, allows a global model to be trained without direct access to private local data. This investigation examined the feasibility of federated learning in credit risk assessment and showed the effects of data imbalance on model performance. Two neural network architectures, Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM), and one tree ensemble architecture, Extreme Gradient Boosting (XGBoost), were explored across three different datasets under various scenarios involving different numbers of clients and data distribution configurations. We demonstrate that federated models consistently outperform local models on non-dominant clients with smaller datasets. This trend is especially pronounced in highly imbalanced data scenarios, yielding an average improvement of 17.92% in model performance. However, for dominant clients (clients with more data), federated models may not exhibit superior performance, suggesting that this type of client needs special incentives to encourage participation.
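
As a rough illustration of how a global model can be built without pooling raw data, the sketch below averages client parameter vectors weighted by local sample count, in the style of FedAvg. This is an assumption for illustration only: the abstract does not state which aggregation rule the authors used, and the function name `federated_average`, the toy parameter vectors, and the client sizes are all hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Sample-size-weighted mean of client parameter vectors (FedAvg-style).

    Only model parameters and sample counts reach the server; the raw
    client records never leave the client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one non-empty weight vector per client size")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        share = n_samples / total  # this client's influence on the global model
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Hypothetical imbalanced setting: one dominant client and two small ones.
clients = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
sizes = [900, 50, 50]
global_model = federated_average(clients, sizes)
print(global_model)
```

With these toy numbers the dominant client holds 90% of the samples, so the aggregate stays close to its local parameters. That is consistent with the abstract's observation that dominant clients may gain little from federation while smaller clients gain more.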

