From Numbers to Words: Multi-Modal Bankruptcy Prediction Using the ECL Dataset (2401.12652v1)

Published 23 Jan 2024 in cs.CE and q-fin.CP

Abstract: In this paper, we present ECL, a novel multi-modal dataset containing the textual and numerical data from corporate 10-K filings and associated binary bankruptcy labels. Furthermore, we develop and critically evaluate several classical and neural bankruptcy prediction models using this dataset. Our findings suggest that the information contained in each data modality is complementary for bankruptcy prediction. We also find that the binary bankruptcy prediction target does not enable our models to distinguish next-year bankruptcy from an unhealthy financial situation that results in bankruptcy in later years. Finally, we explore the use of LLMs in the context of our task. We show that GPT-based models can extract meaningful summaries from the textual data, but zero-shot bankruptcy prediction results are poor. All resources required to access and update the dataset or replicate our experiments are available at github.com/henriarnoUG/ECL.
