HARLF: Hierarchical RL Meets Sentiment for Smarter Portfolios

This presentation explores HARLF, a novel three-tier hierarchical reinforcement learning framework that integrates lightweight sentiment analysis from news data with traditional market indicators to optimize financial portfolios. The talk examines how the architecture's layered decision-making structure achieves superior returns and stability compared to conventional approaches, delivering a 26% annualized return with a Sharpe ratio of 1.2 while revealing both the promise and practical limitations of combining structured and unstructured data in algorithmic trading.
Script
Most portfolio optimization treats market data and news sentiment as separate worlds. The authors of this paper built a three-tier hierarchy that lets reinforcement learning agents at each level specialize, collaborate, and ultimately make better trading decisions than either signal could support on its own.
The framework starts with base agents that digest two kinds of information: traditional market indicators like volatility and price momentum, alongside sentiment scores extracted from news headlines using a compact language model. Each base agent learns to read this hybrid signal and propose initial allocation strategies.
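To make the inputs concrete, here is a minimal sketch of how one observation for a base agent might be assembled. The script names volatility, price momentum, and a headline sentiment score but does not specify formulas, so the function names, the lookback window, and the assumption that sentiment lies in [-1, 1] are all illustrative choices, not the authors' implementation.

```python
from statistics import pstdev

def momentum(prices, lookback):
    """Fractional price change over the last `lookback` steps (assumed definition)."""
    return prices[-1] / prices[-1 - lookback] - 1.0

def volatility(prices):
    """Population standard deviation of one-step simple returns (assumed definition)."""
    returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
    return pstdev(returns)

def build_observation(prices, sentiment_score, lookback=3):
    """Concatenate market features with a sentiment score into one vector.

    `sentiment_score` stands in for the output of a compact language model
    applied to news headlines; here it is simply passed in as a number.
    """
    return [volatility(prices), momentum(prices, lookback), sentiment_score]

# Hypothetical price history and sentiment value, for illustration only.
prices = [100.0, 101.0, 99.0, 102.0, 104.0]
obs = build_observation(prices, sentiment_score=0.4)
```

In a real pipeline the market features and the sentiment score could just as well be kept in separate vectors and routed to different agents; the single concatenated vector is only the simplest way to show both signal types side by side.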
At the second tier, meta-agents specialize further. One aggregates decisions from quantitative base agents, another from sentiment-driven agents. This division lets the system weigh numerical stability against narrative signals, adapting to whether the market is reacting to fundamentals or breaking news.
The super-agent sits at the top, synthesizing meta-agent recommendations into a final portfolio allocation. Trained end-to-end with the entire hierarchy, it learns when to trust quantitative signals, when sentiment matters more, and how to balance both for maximum return.
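The flow of allocations through the three tiers can be sketched as follows. This is a deliberately simplified illustration: in the framework each tier is a trained reinforcement learning agent, whereas here fixed mixing weights stand in for the learned policies. The proposal vectors, the mixing weights, and the helper names `normalize` and `blend` are hypothetical.

```python
def normalize(weights):
    """Clip negatives to zero and rescale so the weights sum to 1 (long-only)."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        # Fall back to equal weights if every entry was clipped away.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]

def blend(proposals, mix):
    """Weighted combination of allocation vectors, renormalized to sum to 1."""
    n_assets = len(proposals[0])
    combined = [sum(m * p[i] for m, p in zip(mix, proposals))
                for i in range(n_assets)]
    return normalize(combined)

# Tier 1: hypothetical allocations proposed by base agents over three assets.
quant_proposals = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]
sentiment_proposals = [[0.2, 0.2, 0.6], [0.1, 0.3, 0.6]]

# Tier 2: one meta-agent per signal family (fixed mix stands in for a policy).
quant_meta = blend(quant_proposals, mix=[0.5, 0.5])
sentiment_meta = blend(sentiment_proposals, mix=[0.5, 0.5])

# Tier 3: the super-agent decides how much to trust each meta-agent.
final_allocation = blend([quant_meta, sentiment_meta], mix=[0.7, 0.3])
```

With these made-up numbers the super-agent leans 70/30 toward the quantitative side. In the learned version, that trust level would shift with market conditions rather than stay fixed.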
The reported results are striking: a 26% annualized return and a Sharpe ratio of 1.2, outperforming both traditional mean-variance optimization and single-tier reinforcement learning baselines. The hierarchy's ability to separate and recombine signals turns out to be a structural advantage.
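For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them from a series of periodic returns. The script does not say how the authors calculated their figures, so the daily frequency, the 252-trading-day year, and the zero default risk-free rate are assumptions.

```python
import math
from statistics import mean, stdev

def annualized_return(period_returns, periods_per_year=252):
    """Geometric annualized return from a list of simple per-period returns."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(period_returns)) - 1.0

def sharpe_ratio(period_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample standard
    deviation, scaled by the square root of the number of periods per year."""
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in period_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)

# Hypothetical daily returns, for illustration only.
sample_returns = [0.010, -0.005, 0.020, 0.000, 0.007]
ann_ret = annualized_return(sample_returns)
sharpe = sharpe_ratio(sample_returns)
```

A Sharpe ratio of 1.2 under this convention means the strategy earned 1.2 units of annualized excess return per unit of annualized volatility.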
Of course, the framework assumes synchronized data and ignores transaction costs, and it hasn't been stress-tested under extreme market crashes. But the core insight holds: teaching agents to specialize at different scales and then collaborate produces portfolios that are both aggressive and resilient. If you want to explore how hierarchical learning reshapes financial AI, visit EmergentMind.com to dive deeper and create your own video walkthrough.