Designing Heterogeneous LLM Agents for Financial Sentiment Analysis

Published 11 Jan 2024 in cs.CL, cs.AI, cs.MA, and q-fin.GN | (2401.05799v1)

Abstract: LLMs have drastically changed the possible ways to design intelligent systems, shifting the focus from massive data acquisition and new model training to human alignment and strategic elicitation of the full potential of existing pre-trained models. This paradigm shift, however, is not fully realized in financial sentiment analysis (FSA), due to the discriminative nature of this task and a lack of prescriptive knowledge of how to leverage generative models in such a context. This study investigates the effectiveness of the new paradigm, i.e., using LLMs without fine-tuning for FSA. Rooted in Minsky's theory of mind and emotions, a design framework with heterogeneous LLM agents is proposed. The framework instantiates specialized agents using prior domain knowledge of the types of FSA errors and reasons on the aggregated agent discussions. Comprehensive evaluation on FSA datasets shows that the framework yields better accuracies, especially when the discussions are substantial. This study contributes to the design foundations and paves new avenues for LLM-based FSA. Implications for business and management are also discussed.

Authors (1)

Frank Xing

Summary

The paper presents a Heterogeneous Agent Discussion (HAD) framework that employs heterogeneous LLM agents for financial sentiment analysis (FSA), achieving improvements of up to 9.46% in accuracy and 13.72% in F1-score.
It uses specialized agents such as the mood, rhetoric, and aspect agents to address distinct FSA error types without the need for extensive fine-tuning.
The framework demonstrates practical benefits by reducing computational costs and improving sentiment interpretation for financial decision-making.

The paper investigates how to use LLMs for Financial Sentiment Analysis (FSA) without fine-tuning, through a framework of heterogeneous LLM agents. The approach is rooted in Minsky's theory of mind and emotions and instantiates specialized agents based on prior domain knowledge of FSA error types.

Introduction

The rapidly evolving capabilities of LLMs have transformed intelligent system design, emphasizing strategic use of pre-trained models over extensive data collection and model retraining. Despite breakthroughs in other areas, applying LLMs effectively to FSA has remained a challenge, largely due to the discriminative nature of sentiment analysis tasks. Traditional models, including BERT and variants such as FinBERT, rely heavily on fine-tuning, which can be computationally expensive and data-intensive. This research proposes a framework of heterogeneous agents, each designed to address a specific type of FSA error, with the aim of improving sentiment classification through discussion and collective reasoning.

Heterogeneous Agent Discussion (HAD) Framework

The HAD framework is predicated on activating varied mental "resources" akin to human cognitive processes, as posited by Minsky's theory. These specialized agents are designed to focus on different error types commonly affecting FSA. The configuration comprises five agents: mood, rhetoric, dependency, aspect, and reference agents. Each is prompted to focus on specific details of FSA tasks, thereby simulating distinct cognitive functions that contribute to emotional state recognition and decision-making.
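The flow described above (specialized agents comment on the text, then a final step reasons over their aggregated discussion) could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the focus descriptions in `AGENT_FOCUS`, the prompt wording, and the helper names `build_agent_prompt` and `had_classify` are all assumptions, and `llm` stands for any function that maps a prompt string to a completion string.

```python
from typing import Callable, Dict, Sequence

# Hypothetical focus descriptions for the five agents named in the text.
# The wording is illustrative and is not taken from the paper's prompts.
AGENT_FOCUS: Dict[str, str] = {
    "mood": "irrealis mood, such as hypothetical or conditional statements",
    "rhetoric": "rhetorical devices, such as sarcasm or rhetorical questions",
    "dependency": "which entity or clause the opinion words depend on",
    "aspect": "which aspect of the company or asset is being discussed",
    "reference": "references to time, earlier values, or outside facts",
}
LABELS = ("positive", "negative", "neutral")


def build_agent_prompt(agent: str, text: str) -> str:
    """Build the prompt for one specialized agent (assumed wording)."""
    return (
        f"Text: {text}\n"
        f"Pay particular attention to {AGENT_FOCUS[agent]}. "
        "Briefly discuss what this implies for the sentiment."
    )


def had_classify(
    text: str,
    llm: Callable[[str], str],
    agents: Sequence[str] = tuple(AGENT_FOCUS),
) -> str:
    """Collect one comment per agent, then ask for a final label."""
    discussion = "\n".join(
        f"[{agent}] {llm(build_agent_prompt(agent, text))}" for agent in agents
    )
    final_prompt = (
        f"Text: {text}\nAgent discussion:\n{discussion}\n"
        f"Answer with one word: {', '.join(LABELS)}."
    )
    answer = llm(final_prompt).lower()
    # Fall back to "neutral" if no known label appears in the answer.
    return next((label for label in LABELS if label in answer), "neutral")
```

Because `llm` is passed in as a plain callable, the same sketch can wrap any model client or a stub for testing. With five agents, each classification costs six model calls: one per agent plus the final summarizing call.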

Figure 1: Different multi-agent LLM frameworks for reaching a consensus: (c) heterogeneous multi-agent discussion (HAD: the proposed framework)

Evaluation and Performance Metrics

The HAD framework was evaluated on datasets including Financial PhraseBank, StockSen, and FiQA, with GPT-3.5 and BLOOMZ as the underlying LLMs. The degree of improvement varies across datasets and models, but HAD consistently raises FSA accuracy and F1-score. The gains are most notable with GPT-3.5, reaching up to 9.46% in accuracy and 13.72% in F1-score. These gains suggest that HAD can close approximately 25%–35% of the performance gap usually attributed to fine-tuning.
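For readers who want to reproduce this kind of comparison, the two reported metrics can be computed as in the sketch below. It assumes three-class labels and macro-averaged F1. The averaging scheme is an assumption here, since this summary does not state which variant the paper reports.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Running both functions on the predictions of a naive prompt and of the agent-based pipeline over the same test set gives the pair of numbers needed for an accuracy and F1 comparison.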

Agent Importance and Interaction

Agent-specific ablation studies provided insights into the relative significance of each agent. The mood, rhetoric, and aspect agents demonstrated the most substantial impact on overall FSA outcomes. The reference agent showed variable importance, while the dependency agent appeared redundant, indicating potential for further optimization in agent design.
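One common way to run such an ablation is leave-one-out: evaluate the full agent set, then re-evaluate with each agent removed and record the drop in the metric. The sketch below assumes a hypothetical `evaluate` function that returns a score for a given set of agents. The toy numbers are made up for illustration and are not results from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple


def leave_one_out(
    agents: Sequence[str],
    evaluate: Callable[[Tuple[str, ...]], float],
) -> Dict[str, float]:
    """Return, per agent, the score drop observed when that agent is removed."""
    full_score = evaluate(tuple(agents))
    return {
        agent: full_score - evaluate(tuple(a for a in agents if a != agent))
        for agent in agents
    }


# Toy stand-in for a real evaluation run. These values are invented for
# the example and do not come from the paper.
toy_gain = {"mood": 0.04, "rhetoric": 0.03, "dependency": 0.0,
            "aspect": 0.02, "reference": 0.01}


def toy_evaluate(agents: Tuple[str, ...]) -> float:
    return 0.6 + sum(toy_gain[a] for a in agents)


drops = leave_one_out(list(toy_gain), toy_evaluate)
for agent, drop in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"{agent:<10} drop in score: {drop:.2f}")
```

An agent whose removal leaves the score essentially unchanged is a candidate for pruning, which also saves one model call per classified text.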

Figure 2: An illustrative comparison between naive prompting (upper example) and the proposed HAD framework (lower example) with 3 heterogeneous agents inspired by FSA error types.

Case Studies and Real-world Implications

Through several detailed case studies, the HAD framework is shown to handle complex sentiment classifications that are often challenging for traditional approaches. Its improved handling of nuances such as irrealis mood and rhetorical devices underlines the potential of HAD for supporting financial decision-making. The cases illustrate how combining the outputs of diverse agents leads to better interpretations, which is useful for investors and financial analysts seeking to integrate sentiment analysis into decision-support systems.

Conclusion

The HAD framework represents a significant stride in utilizing LLMs for financial applications without the need for exhaustive retraining or fine-tuning. By strategically simulating cognitive resources through heterogeneous agents informed by domain-specific error types, the framework not only improves sentiment analysis accuracy but also sets a precedent for future research into multi-agent LLM configurations. Moving forward, the scalability and efficiency of the framework, particularly in high-stakes financial scenarios, remain key areas for development. Exploration into refining agent specialization and interaction mechanisms can yield even greater alignment between robust AI applications and real-world sentiment dynamics in finance.