Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending (2401.16458v2)

Published 29 Jan 2024 in q-fin.RM, cs.AI, cs.CL, and cs.LG

Abstract: Peer-to-peer (P2P) lending has emerged as a distinctive financing mechanism, linking borrowers with lenders through online platforms. However, P2P lending faces the challenge of information asymmetry, as lenders often lack sufficient data to assess the creditworthiness of borrowers. This paper proposes a novel approach to address this issue by leveraging the textual descriptions provided by borrowers during the loan application process. Our methodology involves processing these textual descriptions using an LLM, a powerful tool capable of discerning patterns and semantics within the text. Transfer learning is applied to adapt the LLM to the specific task at hand. Our results, derived from the analysis of the Lending Club dataset, show that the risk score generated by BERT, a widely used LLM, significantly improves the performance of credit risk classifiers. However, the inherent opacity of LLM-based systems, coupled with uncertainties about potential biases, underscores critical considerations for regulatory frameworks and engenders trust-related concerns among end-users, opening new avenues for future research in the dynamic landscape of P2P lending and artificial intelligence.
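The abstract describes the pipeline only at a high level: fine-tune a pre-trained LLM (BERT) on borrowers' loan descriptions, derive a risk score from it, and add that score as a feature to a conventional credit risk classifier. The sketch below is a minimal, plausible realization of that idea, not the paper's exact setup; the column names (`desc`, `default`, the tabular features), the classifier choice, and all hyperparameters are assumptions made for illustration.

```python
# Minimal sketch: derive a text-based risk score with BERT and feed it to a
# tabular credit-risk classifier. Field names ("desc", "default") and all
# hyperparameters are illustrative assumptions, not the paper's specification.
import numpy as np
import pandas as pd
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical Lending Club extract with a free-text description column and a
# binary default label.
loans = pd.read_csv("lending_club.csv")
train_df, test_df = train_test_split(loans, test_size=0.2,
                                     stratify=loans["default"], random_state=0)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

class LoanTextDataset(torch.utils.data.Dataset):
    """Wraps loan descriptions and default labels for fine-tuning."""
    def __init__(self, df):
        self.enc = tokenizer(df["desc"].tolist(), truncation=True,
                             padding="max_length", max_length=256)
        self.labels = df["default"].tolist()
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(int(self.labels[i]))
        return item

# Transfer learning: adapt the pre-trained language model to default prediction.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-risk", num_train_epochs=2,
                           per_device_train_batch_size=16),
    train_dataset=LoanTextDataset(train_df),
)
trainer.train()

def risk_score(df):
    """Probability of default predicted from the description text alone."""
    logits = trainer.predict(LoanTextDataset(df)).predictions
    return torch.softmax(torch.tensor(logits), dim=-1)[:, 1].numpy()

# Combine the text-derived risk score with conventional tabular features.
# (In practice, out-of-fold scores would avoid leakage on the training rows.)
tabular = ["loan_amnt", "annual_inc", "dti"]   # assumed numeric columns
X_train = np.column_stack([train_df[tabular].to_numpy(), risk_score(train_df)])
X_test = np.column_stack([test_df[tabular].to_numpy(), risk_score(test_df)])
clf = LogisticRegression(max_iter=1000).fit(X_train, train_df["default"])
print("AUC with BERT risk score:",
      roc_auc_score(test_df["default"], clf.predict_proba(X_test)[:, 1]))
```

Comparing the AUC of the augmented classifier against one trained on the tabular features alone is one straightforward way to measure the incremental value of the text-derived score, in the spirit of the improvement reported in the abstract.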
