- The paper introduces Fin-R1, a 7B-parameter model trained via data distillation and reinforcement learning to enhance financial reasoning.
- It employs a two-stage pipeline combining Supervised Fine-Tuning and GRPO-based RL to refine decision-making and reasoning precision.
- Fin-R1 achieves state-of-the-art scores on benchmarks such as ConvFinQA (85.0) and FinQA (76.0) while remaining small enough for cost-effective deployment.
Fin-R1: A Comprehensive Approach to Financial Reasoning Using Reinforcement Learning
Introduction
The application of LLMs to financial reasoning presents unique challenges: fragmented financial data, uncontrollable reasoning logic, and insufficient business generalization. "Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning" introduces Fin-R1, a financial domain-specific LLM. At 7 billion parameters, the model is optimized through a two-stage training framework that significantly reduces deployment costs while enhancing its ability to tackle complex financial reasoning tasks.
Methodological Framework
Two-Stage Construction Framework
Fin-R1 is constructed through a robust two-stage framework:
- Data Generation: Creation of Fin-R1-Data, a high-quality dataset built through data distillation and filtering. It comprises 60,091 complete chain-of-thought (CoT) samples spanning reasoning and non-reasoning scenarios.
- Model Training: Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), specifically GRPO (Group Relative Policy Optimization), refine the model's decision-making and sharpen its reasoning precision (a sketch of GRPO's advantage computation follows this list).
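To make the GRPO step concrete, below is a minimal sketch of the group-relative advantage computation at the heart of the algorithm: each sampled response is scored against the statistics of its own sampling group rather than a learned critic. The reward values are illustrative only, not taken from the paper.

```python
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages: each response sampled for a prompt is
    normalized against the mean and std of its own group, so GRPO needs
    no separate value (critic) network."""
    return (group_rewards - group_rewards.mean()) / (group_rewards.std() + eps)

# Illustrative rewards for a group of G = 4 responses to one prompt.
rewards = np.array([2.0, 0.0, 2.0, 1.0])
print(grpo_advantages(rewards))  # above-average responses get positive advantage
```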
Figure 1: The two-stage pipeline for constructing Fin-R1: data generation followed by model training.
Data Construction and Processing
Fin-R1-Data integrates both open-source and proprietary datasets, covering a diverse range of financial contexts and tasks. A meticulous construction pipeline safeguards data quality: candidate chains of thought are distilled from DeepSeek-R1 and then filtered, keeping only samples whose final answers are correct and whose reasoning traces pass an LLM-based quality check (a sketch of this distill-then-filter pattern follows).
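The sketch below illustrates the general distill-then-filter pattern described above. Here `generate_cot` and `judge_reasoning` are hypothetical stand-ins for the teacher model and the LLM judge, not APIs from the paper.

```python
from typing import Callable

def build_distilled_dataset(
    items: list[dict],                                 # each: {"question": str, "gold": str}
    generate_cot: Callable[[str], tuple[str, str]],    # teacher model -> (chain_of_thought, answer)
    judge_reasoning: Callable[[str, str], bool],       # LLM judge: is this CoT sound for the question?
) -> list[dict]:
    """Distill-then-filter: keep a sample only if the teacher's final answer
    matches the gold label AND the judge accepts the reasoning trace."""
    kept = []
    for item in items:
        cot, answer = generate_cot(item["question"])
        if answer.strip() != item["gold"].strip():
            continue  # answer-correctness filter
        if not judge_reasoning(item["question"], cot):
            continue  # reasoning-quality filter
        kept.append({"question": item["question"], "cot": cot, "answer": answer})
    return kept
```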
Training Methodology
Fin-R1 is trained in two stages: Supervised Fine-Tuning on Fin-R1-Data instills financial reasoning patterns, and GRPO-based Reinforcement Learning then optimizes the model against rule-based rewards for output format and answer accuracy, as illustrated in the sketch below.
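As a concrete illustration, here is one plausible shape for such rule-based rewards, assuming the `<think>...</think><answer>...</answer>` output template common in R1-style training; the exact tags and matching rules used by Fin-R1 may differ.

```python
import re

# Assumed output template: <think> reasoning </think> <answer> final answer </answer>
TEMPLATE = re.compile(r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL)

def format_reward(completion: str) -> float:
    """1.0 if the completion follows the think/answer template, else 0.0."""
    return 1.0 if TEMPLATE.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, gold: str) -> float:
    """1.0 if the extracted final answer matches the gold label, else 0.0."""
    m = TEMPLATE.match(completion.strip())
    return 1.0 if m and m.group(1).strip() == gold.strip() else 0.0

def total_reward(completion: str, gold: str) -> float:
    return format_reward(completion) + accuracy_reward(completion, gold)

# Example: a well-formed, correct completion earns the maximum reward of 2.0.
sample = "<think>0.85 as a percentage is 85%.</think><answer>85%</answer>"
print(total_reward(sample, "85%"))  # 2.0
```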
Experimental Evaluation
Evaluation Setup
The model was evaluated on financial benchmarks including FinQA, ConvFinQA, Ant-Finance, TFNS, and Finance-Instruct-500k, with Qwen2.5-72B-Instruct serving as the LLM judge, noted for the consistency of its evaluations.
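As an illustration of this LLM-as-judge scoring, the sketch below computes benchmark accuracy by asking a judge model whether each prediction is semantically equivalent to the gold answer (e.g., "0.85" vs. "85%"). `model_answer` and `judge` are hypothetical callables wrapping the evaluated model and the judge model.

```python
from typing import Callable

def judge_accuracy(
    examples: list[dict],                 # each: {"question": str, "answer": str}
    model_answer: Callable[[str], str],   # wraps the model under evaluation
    judge: Callable[[str], str],          # wraps the judge model (e.g., Qwen2.5-72B-Instruct)
) -> float:
    """Benchmark accuracy where an LLM judge decides whether the prediction
    is semantically equivalent to the gold answer."""
    correct = 0
    for ex in examples:
        pred = model_answer(ex["question"])
        verdict = judge(
            f"Question: {ex['question']}\n"
            f"Gold answer: {ex['answer']}\n"
            f"Model answer: {pred}\n"
            "Reply YES if the model answer is equivalent to the gold answer, otherwise NO."
        )
        correct += verdict.strip().upper().startswith("YES")
    return correct / len(examples)
```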
Results
Fin-R1 excelled across multiple tasks, achieving state-of-the-art scores of 85.0 on ConvFinQA and 76.0 on FinQA. Its compact size let it deliver superior performance with far fewer computational resources than much larger models such as DeepSeek-R1-Distill-Llama-70B.

Figure 4: Heatmap comparison of reasoning scores between LLMs and human annotators.
Conclusion
The development of Fin-R1 demonstrates an effective approach to building financial reasoning capabilities into LLMs, addressing challenges of fragmented data and uncontrollable reasoning logic. Its implications extend beyond automated reasoning to regulatory compliance and decision support within the financial industry. Future work will expand dataset coverage, explore multimodal architectures, and deepen integration with complex financial datasets and scenarios, broadening Fin-R1's applicability in dynamic financial environments.