FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models (2401.02982v4)
Abstract: Large language models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of financial data analysis, particularly in data-driven thinking, remain uncertain. To bridge this gap, we introduce \texttt{FinDABench}, a comprehensive benchmark designed to evaluate the financial data analysis capabilities of LLMs. \texttt{FinDABench} assesses LLMs across three dimensions: 1) \textbf{Foundational Ability}, evaluating the models' ability to perform financial numerical calculations and corporate sentiment risk assessment; 2) \textbf{Reasoning Ability}, measuring the models' ability to quickly comprehend textual information and analyze abnormal financial reports; and 3) \textbf{Technical Skill}, examining the models' use of technical knowledge to address real-world data analysis challenges involving analysis generation and chart visualization from multiple perspectives. We will release \texttt{FinDABench} and the evaluation scripts at \url{https://github.com/cubenlp/BIBench}. \texttt{FinDABench} aims to provide a measure for in-depth analysis of LLM abilities and to foster the advancement of LLMs in the field of financial data analysis.
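Since the abstract describes a three-dimension evaluation but the released scripts are not reproduced here, the following is a minimal sketch of how such a benchmark harness could be structured. All names in it (`Task`, `load_tasks`, `evaluate`, the dimension labels, and the JSONL layout) are hypothetical illustrations under assumed conventions, not the benchmark's actual interface; the released scripts may score tasks quite differently (e.g., with graded rubrics rather than exact match).

```python
# Hypothetical sketch of a FinDABench-style evaluation loop (Python 3.9+).
# Task schema, file layout, and exact-match scoring are illustrative
# assumptions, not the benchmark's released interface.
import json
from dataclasses import dataclass
from typing import Callable

# The three dimensions named in the abstract.
DIMENSIONS = ("foundational", "reasoning", "technical")

@dataclass
class Task:
    dimension: str   # one of DIMENSIONS
    prompt: str      # question shown to the model
    reference: str   # gold answer used for scoring

def load_tasks(path: str) -> list[Task]:
    """Read one JSON object per line:
    {"dimension": ..., "prompt": ..., "reference": ...}."""
    with open(path, encoding="utf-8") as f:
        return [Task(**json.loads(line)) for line in f]

def evaluate(model: Callable[[str], str], tasks: list[Task]) -> dict[str, float]:
    """Return per-dimension exact-match accuracy."""
    hits = {d: 0 for d in DIMENSIONS}
    totals = {d: 0 for d in DIMENSIONS}
    for t in tasks:
        totals[t.dimension] += 1
        if model(t.prompt).strip() == t.reference.strip():
            hits[t.dimension] += 1
    return {d: hits[d] / totals[d] for d in DIMENSIONS if totals[d]}

if __name__ == "__main__":
    # Tiny in-memory demo; a real run would call load_tasks on the dataset
    # and wrap an actual LLM API call in `model`.
    demo = [
        Task("foundational", "What is 2 + 40?", "42"),
        Task("reasoning", "Is a sudden -50% revenue swing a red flag? yes/no", "yes"),
    ]
    echo_model = lambda _prompt: "42"  # stand-in for a real LLM call
    print(evaluate(echo_model, demo))  # {'foundational': 1.0, 'reasoning': 0.0}
```

Reporting accuracy per dimension rather than as a single aggregate mirrors the abstract's framing, where foundational, reasoning, and technical abilities are meant to be analyzed separately.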
Authors: Shu Liu, Shangqing Zhao, Chenghao Jia, Xinlin Zhuang, Zhaoguang Long, Man Lan, Qingquan Wu, Chong Yang, Aimin Zhou, Jie Zhou