
DeepFolio: Deep Learning for Portfolios

Updated 15 December 2025
  • DeepFolio is a deep learning framework that optimizes portfolio allocations by integrating high-frequency, sentiment, and relational data.
  • It utilizes advanced neural architectures, including CNNs, GRUs, LSTMs, and Graph Attention Networks to capture complex market dynamics.
  • The end-to-end approach enhances risk-adjusted returns by directly targeting portfolio-level objectives such as Sharpe ratio and volatility reduction.

DeepFolio refers to a class of deep learning-based frameworks for portfolio management that prioritize direct, end-to-end optimization of portfolio weights, integrating multi-modal financial signals and leveraging advanced neural architectures tailored to the structure of asset prediction problems. There are two archetypal DeepFolio methodologies: one centered on high-frequency limit order book data with deep convolutional and recurrent networks (Sangadiev et al., 2020), and one that unifies temporal, relational, and sentiment signals through LSTM, GAT, and NLP pipelines to produce daily allocations for diverse asset portfolios (Lin et al., 29 Sep 2025).

1. Motivation and Problem Formulation

Traditional portfolio construction is constrained by multi-step pipelines—first predicting asset returns, then applying mean–variance optimization (MVO)—which can introduce instability and fail to exploit non-linear, high-dimensional market dynamics. Early machine learning approaches, including PCA, LDA, and SVM, require hand-crafted features and lack capacity for deep temporal pattern extraction. DeepLOB (CNN+LSTM) approaches increased modeling fidelity but suffered from training instability, parameter sensitivity, and scalability constraints (Sangadiev et al., 2020). DeepFolio architectures are motivated by the need to jointly optimize portfolio allocations by learning directly from price, order book, relational, and textual data with robust, expressive, and end-to-end differentiable models. This enables the direct maximization of portfolio-level objectives, such as the Sharpe ratio or minimum volatility, in both high-frequency and daily asset management contexts (Lin et al., 29 Sep 2025).

2. Data Preprocessing and Signal Engineering

For LOB-centric DeepFolio, inputs are constructed from normalized, multi-level snapshots of bid-ask prices and volumes, using per-asset dynamic $z$-score normalization (rolling window over the previous five days), and target labels are defined by price-movement classes computed from $k$-step smoothed return operators. In equity/news-oriented DeepFolio, each asset is represented by a rolling window of price features (returns, log-volume, momentum, volatility) concatenated with per-day sentiment vectors extracted from raw news headlines via a transformer (e.g., FinBERT) or a pretrained word embedding plus Bi-LSTM pipeline. The sentiment representations are aggregated per asset per day and appended to the price signals in the input tensor (Lin et al., 29 Sep 2025).
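The LOB preprocessing described above can be illustrated with a minimal sketch. It assumes the window is given as a number of samples rather than calendar days, and it uses one common form of smoothed label (relative change between the mean of the next `k` mid-prices and the mean of the last `k`), with a hypothetical threshold `alpha`; the function names and the exact label definition are illustrative, not taken from the cited papers.

```python
import math

def rolling_zscore(values, window):
    """Normalize each value by the mean/std of the preceding `window` values."""
    out = []
    for t in range(len(values)):
        hist = values[max(0, t - window):t]
        if len(hist) < 2:
            out.append(0.0)  # not enough history yet (assumed fallback)
            continue
        mean = sum(hist) / len(hist)
        std = math.sqrt(sum((x - mean) ** 2 for x in hist) / len(hist))
        out.append((values[t] - mean) / std if std > 0 else 0.0)
    return out

def smoothed_labels(mid, k, alpha):
    """Label each step up (+1), down (-1) or flat (0) from k-step mean mid-prices."""
    labels = []
    for t in range(k - 1, len(mid) - k):
        past = sum(mid[t - k + 1:t + 1]) / k    # mean of the last k mid-prices
        future = sum(mid[t + 1:t + k + 1]) / k  # mean of the next k mid-prices
        change = (future - past) / past
        labels.append(1 if change > alpha else -1 if change < -alpha else 0)
    return labels
```

Using only past observations in the normalization window keeps the statistics free of look-ahead, which matters when the same features later drive allocation decisions.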

3. Architectural Design

3.1 Limit Order Book DeepFolio

The architecture replaces the DeepLOB CNN+LSTM stack with a deeper configuration:

  • Convolutional head with four 2D convolutional layers (LeakyReLU) for LOB feature extraction.
  • Residual blocks (three successive Conv2D+LeakyReLU with skip connections, no batch normalization).
  • Inception-v2 Module for multi-scale temporal patterning (parallel 1×1, 3×3, 3×1/1×3, maxpooling branches).
  • Recurrent stage using a single-layer GRU ($H=64$ units), motivated by empirical superiority to LSTM in low-data regimes.
  • Read-out softmax layer classifies price-movement trend for each asset and prediction horizon.

3.2 Price–Sentiment–Graph DeepFolio

The model comprises:

  • Shared LSTM (across $N$ assets), inputting $r$-day stacked price and sentiment features per asset, outputting $h_t^{(i)}$ for each asset $i$.
  • Graph Attention Network (GAT) taking hidden states $[h_t^{(1)}, \ldots, h_t^{(N)}]$ and asset-relational adjacency $A_t$ (static or dynamic, sector/correlation/sentiment-based), outputting refined embeddings $z_t^{(i)}$.
  • MLP head (linear+tanh) transforms $z_t^{(i)}$ into raw asset scores $W_t^{(i)}$.
  • Portfolio weight normalization ($w_{t,i}$) by softmax (long-only) or signed-sum (permits shorting), enforcing $\sum_i w_{t,i}=1$ (Lin et al., 29 Sep 2025).
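The two normalization options in the last step can be sketched as follows. The source does not spell out the signed-sum formula, so dividing each score by the sum of all scores is an assumption chosen because it satisfies the stated constraint $\sum_i w_{t,i}=1$; the function names and the `eps` guard are hypothetical.

```python
import math

def softmax_weights(scores):
    """Long-only weights: non-negative and summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def signed_weights(scores, eps=1e-8):
    """Weights that may be negative (short positions) but still sum to 1."""
    total = sum(scores)
    if abs(total) < eps:
        # Assumed guard: the normalization is undefined when scores cancel out.
        raise ValueError("scores sum to ~0; signed normalization is undefined")
    return [s / total for s in scores]
```

For example, `signed_weights([2.0, -1.0, 1.0])` gives `[1.0, -0.5, 0.5]`: the second asset is held short while the portfolio remains fully invested. Because this form becomes unstable when the scores nearly cancel, a practical implementation would likely need additional safeguards.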

4. Mathematical Formulation and Training Procedure

4.1 Loss Functions and Constraints

Both LOB and equity/news DeepFolio instantiate direct portfolio-level loss functions:

  • Sharpe ratio surrogate: $L_{\text{SR}} = -\frac{\mathbb{E}_t[R_{p,t+1}]}{\sqrt{\mathrm{Var}_t[R_{p,t+1}]+\epsilon}} + \lambda\sum_{i=1}^N|w_{t,i}|$ with $R_{p,t+1}=w_t^\top r_{t+1}$.
  • Volatility objective: $L_V = \mathrm{std}(R)$ over the holding period (Sangadiev et al., 2020).
  • Portfolio constraints: enforced by normalization (fully invested, long-only vs. with shorting), optionally regularized with additional $\ell_1$ or $\ell_2$ penalties.
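As a rough numerical illustration of the Sharpe ratio surrogate, the sketch below estimates the expectation and variance by the sample mean and variance over a batch of periods, and averages the $\ell_1$ penalty across those periods. Both choices are assumptions, as are the names `sharpe_loss`, `lam`, and `eps`; a real training loop would compute the same quantity on differentiable tensors.

```python
import math

def portfolio_returns(weights, returns):
    """R_{p,t+1} = w_t^T r_{t+1} for each period t."""
    return [sum(w * r for w, r in zip(w_t, r_t))
            for w_t, r_t in zip(weights, returns)]

def sharpe_loss(weights, returns, lam=0.0, eps=1e-8):
    """Negative batch Sharpe ratio plus an L1 penalty on the weights."""
    rp = portfolio_returns(weights, returns)
    mean = sum(rp) / len(rp)
    var = sum((x - mean) ** 2 for x in rp) / len(rp)
    # Per-period sum of |w_{t,i}|, averaged over the batch (assumption).
    l1 = sum(abs(w) for w_t in weights for w in w_t) / len(weights)
    return -mean / math.sqrt(var + eps) + lam * l1
```

Minimizing this loss raises the mean portfolio return relative to its dispersion; the `eps` term keeps the denominator away from zero when the batch returns are nearly constant.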

4.2 Training Regime

  • Universal use of Adam optimizer.
  • Mini-batch or rolling training over date sequences and assets.
  • For LOB DeepFolio: batch size 64, learning rate 0.01 for price module/0.001 for allocation module, early stopping, Glorot initialization, orthogonal recurrence.
  • For price–sentiment–graph DeepFolio: batch sizes in $\{32, 64\}$, lookback $r=30$, LSTM/GAT hidden size in $\{32, 64, 96\}$, dropout of roughly 0.1 to 0.3, learning rate in $[10^{-4}, 5\times10^{-3}]$, Optuna hyperparameter optimization (HPO), early stopping on Sharpe, training epochs up to 40 (Lin et al., 29 Sep 2025).

5. Graph Attention and Sentiment Integration

5.1 Stock-graph Construction

  • Static adjacency: $\mathrm{corr}(\text{log-returns}_i, \text{log-returns}_j)$ over the training period.
  • Dynamic graph: updated weekly, with $A_{ij}(t)=1$ if assets $i$ and $j$ share a sector, or $|\mathrm{corr}_{5d}(\text{returns})| > \tau$, or $|\mathrm{corr}_{5d}(\text{sentiment})| > \tau$ (with $\tau=0.5$).
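The dynamic-graph rule can be sketched as below, assuming each asset comes with a sector label plus five-day windows of returns and scalar sentiment scores. The self-loops on the diagonal, the treatment of constant series as uncorrelated, and the function names are assumptions added for the illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def dynamic_adjacency(sectors, returns, sentiment, tau=0.5):
    """A[i][j] = 1 if same sector, or |corr| of returns or sentiment exceeds tau."""
    n = len(sectors)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                adj[i][j] = 1  # self-loop (assumption, convenient for attention)
                continue
            linked = (sectors[i] == sectors[j]
                      or abs(pearson(returns[i], returns[j])) > tau
                      or abs(pearson(sentiment[i], sentiment[j])) > tau)
            adj[i][j] = 1 if linked else 0
    return adj
```

Recomputing this matrix once per week from the most recent five-day windows would reproduce the update schedule described above.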

5.2 GAT Layer Mechanism

  • Linear transformation $\hat{h}_i = W h_i$, then edge-wise attention $e_{ij} = \text{LeakyReLU}(a^\top [\hat{h}_i \parallel \hat{h}_j])$, coefficients $\alpha_{ij}$ obtained via softmax over the neighbors $j \in N(i)$, and feature aggregation $z_i = \sigma\left(\sum_{j\in N(i)} \alpha_{ij} \hat{h}_j\right)$. Multi-head attention is possible via parallel $(W, a)$ pairs and concatenation (Lin et al., 29 Sep 2025).
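A single attention head following these steps can be written in plain Python as a sketch. The nonlinearity $\sigma$ is not specified in the text, so `tanh` is used here as an assumption, as are the LeakyReLU slope of 0.2 and the fallback that treats an isolated node as its own neighbor.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gat_layer(h, adj, W, a):
    """Single-head graph attention; h is a list of node vectors, adj a 0/1 matrix."""
    h_hat = [matvec(W, h_i) for h_i in h]  # h_hat_i = W h_i
    out = []
    for i in range(len(h)):
        nbrs = [j for j in range(len(h)) if adj[i][j]] or [i]
        # e_ij = LeakyReLU(a^T [h_hat_i || h_hat_j]) for each neighbor j
        e = [leaky_relu(sum(p * q for p, q in zip(a, h_hat[i] + h_hat[j])))
             for j in nbrs]
        m = max(e)
        exp_e = [math.exp(x - m) for x in e]
        total = sum(exp_e)
        alpha = [x / total for x in exp_e]  # softmax over the neighborhood
        agg = [sum(al * h_hat[j][d] for al, j in zip(alpha, nbrs))
               for d in range(len(W))]
        out.append([math.tanh(x) for x in agg])  # sigma = tanh (assumption)
    return out
```

With the attention vector `a` set to zero, every neighbor receives equal weight, so the layer reduces to a mean over the neighborhood; learned values of `a` let the model emphasize some related assets over others.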

5.3 Sentiment Pipeline

  • Headlines are fetched and embedded via GloVe+Bi-LSTM or a Transformer (FinBERT), aggregated per asset by mean or attention pooling to produce $u_t^{(i)} \in \mathbb{R}^S$, and concatenated into day-level feature vectors, enabling early fusion of price and market-psychology signals (Lin et al., 29 Sep 2025).
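The mean-pooling variant of this early fusion can be sketched as follows, taking headline embeddings as already computed by an upstream encoder. The zero vector used on days without headlines and the function names are assumptions, since the source does not say how missing news is handled.

```python
def mean_pool(embeddings):
    """Average a list of headline embeddings into one S-dimensional vector."""
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]

def fuse_day_features(price_features, headline_embeddings, sentiment_dim):
    """Concatenate price features with the pooled sentiment vector u_t^(i)."""
    if headline_embeddings:
        u = mean_pool(headline_embeddings)
    else:
        u = [0.0] * sentiment_dim  # assumed neutral vector on no-news days
    return list(price_features) + u
```

Stacking the fused vectors over the lookback window yields the per-asset input sequence that the shared LSTM consumes.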

6. Empirical Evaluation and Results

6.1 LOB DeepFolio (FI-2010, crypto-assets)

  • Outperforms DeepLOB and all classical baselines on price-movement prediction (test set, $k=1$: accuracy 82.44%, F1 81.29%; at $k=10$: 79.51% and 79.22%, respectively).
  • On crypto, DeepFolio exceeds DeepLOB, particularly in transfer to unseen assets.
  • Portfolio allocations learned under $L_{\text{SR}}$ achieve the highest compounded return and Sharpe ratio among all methods (test period, DeepFolio SR: Exp Return 1.46793; Mean Ret 0.04013; Std Dev 0.00756; Sharpe 0.05307).

6.2 Price–Sentiment–Graph DeepFolio (U.S. equities)

  • Evaluated on nine large-cap stocks (2021–2025), compared to equal-weight and CAPM-based MVO.
  • On held-out test (2024–2025): model v3 (static graph, price+sentiment) achieves cumulative return 37.3% (vs. 24.7% equal-weight, 22.0% CAPM), annualized Sharpe 1.15 (vs. 0.83, 0.84), lower VaR and max drawdown.
  • Ablations: v1 (price-only) already outperforms benchmarks. v3 (+sentiment) attains maximum Sharpe and lowest VaR, dynamic graph (v4) reduces volatility/drawdown, PCA-reduction (v5) minimizes max drawdown under stress (Lin et al., 29 Sep 2025).
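The metrics reported above (cumulative return, annualized Sharpe, VaR, maximum drawdown) can be computed from a series of daily portfolio returns with a sketch like the one below. The 252-day annualization factor, the zero risk-free rate, the sample (n-1) variance, and the historical-quantile form of VaR are common conventions assumed here; the cited papers may use different definitions.

```python
import math

def cumulative_return(returns):
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0

def annualized_sharpe(returns, periods_per_year=252):
    """Mean over sample std of returns, scaled by sqrt(periods per year)."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve."""
    peak, growth, worst = 1.0, 1.0, 0.0
    for r in returns:
        growth *= 1.0 + r
        peak = max(peak, growth)
        worst = max(worst, 1.0 - growth / peak)
    return worst

def historical_var(returns, level=0.95):
    """Loss at the (1 - level) empirical quantile, reported as a positive number."""
    ordered = sorted(returns)
    idx = int((1.0 - level) * len(ordered))
    return -ordered[idx]
```

Applying the same functions to the model and to each benchmark keeps the comparison on a consistent footing.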

7. Analysis and Implications

  • Residual and inception modules improve learning of deep temporal dependencies and mitigate parameter initialization pathologies encountered in earlier architectures.
  • GRU units display faster convergence and stronger generalization in limited-data settings compared to LSTMs, rationalizing their architectural choice in high-frequency streams (Sangadiev et al., 2020).
  • The GAT module enables the model to dynamically encode sectoral and statistical dependencies, adjusting for shifting correlation and sentiment regimes.
  • Directly integrating sentiment vectors and allowing early fusion at the LSTM level substantially increases the ability to model market psychology.
  • Using predicted trend labels, rather than realized returns, as inputs for end-to-end allocation training with portfolio-level objectives enhances both return and risk metrics over classical Markowitz/naive methods, particularly in high-volatility settings (Sangadiev et al., 2020, Lin et al., 29 Sep 2025).

A plausible implication is that the DeepFolio paradigm—integrating temporal, relational, and unstructured textual information in a single allocation pipeline—represents an advance over classical and segmented deep learning approaches in both daily and high-frequency domains, with demonstrated generalization to new assets and robustness in volatile markets. The joint end-to-end training for portfolio-level risk/reward objectives is crucial to outperformance, and the architectural principles (residual connections, graph attention, early NLP fusion) underpin increased expressivity and stability. For future work, scaling these frameworks to larger, more heterogeneous asset universes while incorporating explicit trading constraints and transaction costs remains a primary research avenue (Lin et al., 29 Sep 2025, Sangadiev et al., 2020).
