DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning (2310.15205v2)

Published 23 Oct 2023 in cs.CL

Abstract: We propose Multiple Experts Fine-tuning Framework to build a financial LLM, DISC-FinLLM. Our methodology improves general LLMs by endowing them with multi-turn question answering abilities, domain text processing capabilities, mathematical computation skills, and retrieval-enhanced generation capabilities. We build a financial instruction-tuning dataset named DISC-FIN-SFT, including instruction samples of four categories (consulting, NLP tasks, computing and retrieval-augmented generation). Evaluations conducted on multiple benchmarks demonstrate that our model performs better than baseline models in various financial scenarios. Further resources can be found at https://github.com/FudanDISC/DISC-FinLLM.

DISC-FinLLM: A Specialized Chinese Financial LLM

The paper "DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning" presents a methodology for developing a financial LLM tailored to the Chinese market. Recognizing the challenges inherent in the financial domain, the authors propose a Multiple Experts Fine-tuning Framework that extends a general LLM with domain-specific skills.

Methodology and Dataset Construction

The research introduces DISC-FinLLM, a financial LLM with multi-turn question answering, domain-specific text processing, mathematical computation, and retrieval-augmented generation capabilities. Central to its development is DISC-FIN-SFT, a financial instruction-tuning dataset divided into four categories: financial consulting, financial NLP tasks, computational problems, and retrieval-enhanced generation.

  • Financial Consulting Instructions: These are derived from financial Q&A datasets, forums, and ChatGPT-generated content. The goal is to mimic real-world consulting scenarios and enhance the model's conversational fluency.
  • Financial Task Instructions: This category focuses on tasks such as sentiment analysis and event extraction, using existing Chinese financial NLP datasets complemented by hand-crafted prompt templates for zero-shot and few-shot learning.
  • Financial Computing Instructions: Addressing numerical computation within financial texts, this dataset includes questions that train the model to utilize computational tools like calculators and equation solvers.
  • Retrieval-enhanced Instructions: This part of the dataset pairs generated questions with retrieved reference documents, training the model to ground its answers in financial research reports and news texts (a sketch of how such samples might be assembled follows this list).
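
To make the construction concrete, below is a minimal sketch of how task and retrieval-enhanced instruction samples might be assembled into instruction/output pairs. The template wording, field names, and helper functions are illustrative assumptions; the paper's actual templates are hand-crafted and in Chinese.

```python
import random

# Hypothetical zero-shot templates for a financial sentiment task; DISC-FIN-SFT
# builds such prompts over existing Chinese financial NLP datasets.
SENTIMENT_TEMPLATES = [
    "Judge the sentiment of the following financial news as positive, neutral, or negative.\n{text}",
    "What market sentiment does this headline express?\n{text}",
]

def make_task_sample(text: str, label: str) -> dict:
    """Render one NLP-task instruction sample from a labeled dataset row."""
    return {
        "instruction": random.choice(SENTIMENT_TEMPLATES).format(text=text),
        "output": label,
        "category": "task",
    }

def make_retrieval_sample(question: str, docs: list[str], answer: str) -> dict:
    """Pair a generated question with retrieved reference documents."""
    context = "\n\n".join(docs)
    return {
        "instruction": f"Reference documents:\n{context}\n\nQuestion: {question}",
        "output": answer,
        "category": "retrieval",
    }
```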

Multiple Experts Fine-tuning Framework

The Multiple Experts Fine-tuning Framework trains the model on these datasets with Low-Rank Adaptation (LoRA), creating a distinct LoRA module for each dataset category. DISC-FinLLM can then switch between specialized adapters at inference time without one task degrading another. This modular approach improves training efficiency, isolates task-specific enhancements, and lets the model be configured for different financial scenarios.
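
This pattern maps naturally onto adapter switching as implemented, for example, in the Hugging Face PEFT library. The sketch below assumes Baichuan-13B-Chat as the backbone and hypothetical adapter checkpoint paths; it illustrates the routing pattern rather than reproducing the authors' released code.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the shared frozen backbone once (Baichuan-13B-Chat is assumed here).
base = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Chat", trust_remote_code=True
)

# Attach one LoRA adapter per instruction category (paths are hypothetical).
model = PeftModel.from_pretrained(base, "adapters/consulting", adapter_name="consulting")
for name in ("nlp_tasks", "computing", "retrieval"):
    model.load_adapter(f"adapters/{name}", adapter_name=name)

# Route a request to the matching expert: only that adapter's low-rank
# weights are active, while the backbone is shared by all four experts.
model.set_adapter("computing")
```

Because each adapter touches only a small set of low-rank matrices, switching experts is cheap, and the categories cannot interfere with one another during training.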

Evaluation and Results

The model's performance is evaluated against several benchmarks:

  • Financial NLP Tasks: Using the FinCUGE benchmark, DISC-FinLLM exhibits significant performance improvements across six evaluated tasks compared to baseline models. This underscores the effectiveness of the task-specific datasets.
  • Human Tests: On the FinEval benchmark of financial multiple-choice questions, the model achieves higher accuracy than the other evaluated LLMs, trailing only GPT-4 and ChatGPT.
  • Computation and Retrieval: The model excels in computation-intensive tasks and in retrieval-based evaluations, reflecting the effective integration of its computational and retrieval plug-ins in financial contexts (a generic sketch of such a plug-in loop follows this list).
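
One way to picture the computational plug-in is a generic tool-call loop: the model emits a marked arithmetic expression, and a wrapper evaluates it and splices the result back into the generation. The tag convention, function names, and example below are hypothetical rather than the paper's published interface.

```python
import re

# Assumed markup: the model wraps expressions to be evaluated in
# [Calculator]...[/Calculator] tags; DISC-FinLLM's actual convention may differ.
CALC_PATTERN = re.compile(r"\[Calculator\](.+?)\[/Calculator\]", re.S)

def resolve_tool_calls(generation: str) -> str:
    """Replace each calculator call in the model output with its result."""
    def evaluate(match: re.Match) -> str:
        expr = match.group(1).strip()
        # Restricted eval for the sketch; a real system should use a safe parser.
        return str(eval(expr, {"__builtins__": {}}, {}))
    return CALC_PATTERN.sub(evaluate, generation)

print(resolve_tool_calls(
    "The deposit grows to [Calculator]10000 * (1 + 0.03) ** 5[/Calculator] yuan."
))
```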

Implications and Future Directions

DISC-FinLLM marks a substantial step toward specialized LLMs for financial applications in non-English markets. The model strengthens capabilities crucial for financial professionals, with potential applications in customer support, investment advisory, and financial analysis. Its modular architecture suggests promising avenues for further customization and extension into other specialized domains.

Future research may focus on expanding the dataset's scope and integrating real-time financial data, which would enhance the model's applicability in dynamic financial environments. Additionally, exploring cross-lingual adaptations could broaden the utility of such domain-specific LLMs globally. The rigorous approach taken in this paper sets a foundation for further exploration of expert-based fine-tuning methods in specialized fields.

Authors (11)
  1. Wei Chen (1288 papers)
  2. Qiushi Wang (10 papers)
  3. Zefei Long (1 paper)
  4. Xianyin Zhang (2 papers)
  5. Zhongtian Lu (1 paper)
  6. Bingxuan Li (19 papers)
  7. Siyuan Wang (73 papers)
  8. Jiarong Xu (24 papers)
  9. Xiang Bai (221 papers)
  10. Xuanjing Huang (287 papers)
  11. Zhongyu Wei (98 papers)
Citations (35)