FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models (2402.10986v3)

Published 16 Feb 2024 in cs.CL and cs.AI

Abstract: We introduce FinTral, a suite of state-of-the-art multimodal LLMs built upon the Mistral-7b model and tailored for financial analysis. FinTral integrates textual, numerical, tabular, and image data. We enhance FinTral with domain-specific pretraining, instruction fine-tuning, and RLAIF training by exploiting a large collection of textual and visual datasets we curate for this work. We also introduce an extensive benchmark featuring nine tasks and 25 datasets for evaluation, including hallucinations in the financial domain. Our FinTral model trained with direct preference optimization employing advanced Tools and Retrieval methods, dubbed FinTral-DPO-T&R, demonstrates an exceptional zero-shot performance. It outperforms ChatGPT-3.5 in all tasks and surpasses GPT-4 in five out of nine tasks, marking a significant advancement in AI-driven financial technology. We also demonstrate that FinTral has the potential to excel in real-time analysis and decision-making in diverse financial contexts. The GitHub repository for FinTral is available at https://github.com/UBC-NLP/fintral.

Introducing FinTral: Advancing Financial Analysis with Multimodal LLMs

Introduction

The world of financial analysis is intricate, requiring a nuanced understanding of both textual and numerical data. While the application of LLMs in this domain has shown promise, challenges such as dense jargon, rapid market changes, and the need for multimodal understanding persist. Acknowledging these hurdles, Bhatia et al. introduce FinTral, a state-of-the-art suite of multimodal LLMs designed to address the unique demands of financial analysis. Built on the Mistral-7b model, FinTral leverages a comprehensive approach incorporating domain-specific pretraining, fine-tuning, and innovative training methodologies to achieve exceptional analysis capabilities across a broad array of financial contexts.

Evaluation and Benchmarking

The evaluation of FinTral involved an extensive benchmark encompassing nine tasks across 25 datasets, including an assessment of hallucinations in the financial domain. The strongest variant, FinTral-DPO-T&R (trained with direct preference optimization and augmented with tools and retrieval), outperformed ChatGPT-3.5 on all nine tasks and surpassed GPT-4 on five of the nine in a zero-shot setting. This result highlights FinTral's effectiveness in leveraging multimodal inputs for financial analysis, particularly in areas demanding robust textual and numerical reasoning.

Methodology

Bhatia et al. detail the development process of FinTral, starting from pretraining on a curated dataset of 20 billion tokens, tailored to capture the nuances of financial discourse. The model then underwent instruction fine-tuning on a domain-specific instruction set, aligning it closely with financial analysis tasks. RLAIF training using Direct Preference Optimization (DPO) further refined its capabilities, especially in real-time analysis contexts. In addition, multimodal functionality and tools, such as CLIP for image understanding and specialized financial functions for numerical analysis, round out FinTral's design.
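
To make the preference-optimization step more concrete, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. It illustrates the general technique only and is not taken from the FinTral codebase: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are assumptions chosen for illustration. Each argument is the summed log-probability that the trainable policy or the frozen reference model assigns to the preferred ("chosen") or dispreferred ("rejected") response.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Hypothetical helper for illustration; names and the default beta
    are assumptions, not values reported for FinTral.
    """
    # How much more (in log space) the policy favors each response
    # than the frozen reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # Scaled difference of the two margins; positive when the policy
    # has shifted toward the chosen response relative to the reference.
    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# When the policy equals the reference, the loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# When the policy favors the chosen response more than the reference
# does, the loss drops below that baseline.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

In practice the loss would be averaged over a batch of preference pairs and minimized with respect to the policy parameters only, with the reference model held fixed.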

Implications and Future Directions

The creation of FinTral paves the way for significant advancements in financial analysis. Its ability to process and interpret complex datasets, including a mix of textual, numerical, and visual data, represents a considerable leap forward in the capability of LLMs within the financial sector. The practical applications of such a model are vast, promising enhanced decision-making tools for professionals in finance. As for future research, the exploration into reduced-energy models and continual updates to align with market developments is crucial. Moreover, FinTral's approach to minimizing model hallucinations through specialized training and evaluation provides valuable insights into developing more accurate and reliable LLMs across various domains.

Conclusion

FinTral represents a significant stride in the integration of LLMs into the financial analysis field. Through a combination of domain-specific pretraining, instruction fine-tuning, preference-based alignment, and multimodal data integration, it raises the bar for performance and reliability in financial decision support tools. As the landscape of financial technology evolves, the continued development and refinement of models like FinTral is likely to play an important role in shaping the future of financial analysis and decision-making.