Mean Absolute Directional Loss (MADL)

Updated 23 July 2025

Mean Absolute Directional Loss (MADL) is a loss function that integrates the absolute magnitude of returns with directional alignment to evaluate forecasting errors in financial models.
It penalizes incorrect signal direction and, in the cited studies, yields superior risk-adjusted returns compared with traditional loss measures such as MSE and MAE.
Extensions such as GMADL provide a differentiable alternative that improves optimization stability and reduces transaction costs in algorithmic investment applications.

Mean Absolute Directional Loss (MADL) is a loss function tailored to machine learning models for applications requiring sensitivity to both the magnitude and directionality of predictions, with a particular focus on financial time series forecasting and algorithmic investment strategies. Unlike standard loss functions (such as MSE or MAE), MADL directly incorporates the sign alignment between predicted and realized returns, coupling forecast evaluation with expected profit-and-loss outcomes. Empirical analyses demonstrate MADL’s practical benefits in optimizing trading strategies, yielding superior risk-adjusted returns and more effective signal generation compared to conventional error metrics (Michańków et al., 2023, Michańków et al., 22 Jul 2025). Related risk frameworks, such as bi-directional dispersion and T-risk, further generalize MADL’s approach to loss tail sensitivity and robust optimization (Holland, 2022).

1. Definition and Mathematical Formulation

MADL quantifies a model’s forecasting error by combining the absolute magnitude of realized returns with the directional agreement between prediction and observation. Formally, for $N$ prediction-actual pairs $(\hat{R}_i, R_i)$, MADL is defined as:

$\textrm{MADL} = \frac{1}{N} \sum_{i=1}^{N} \left(-1 \times \textrm{sign}(R_i \times \hat{R}_i) \times |R_i| \right)$

where:

$R_i$ is the observed return at time $i$,
$\hat{R}_i$ is the predicted return at time $i$,
$\textrm{sign}(x)$ is the sign function ($+1$, $0$, or $-1$),
$|R_i|$ is the absolute value of the realized return.

This formulation benchmarks the predicted return against zero, ensuring that the loss function directly penalizes misaligned predictions (predicted and realized returns of opposite sign) and rewards correct directional forecasts. Negative values of MADL indicate “profitable” and directionally correct model predictions, while positive values signal losses due to incorrect directional calls (Michańków et al., 2023, Michańków et al., 22 Jul 2025).
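The definition above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the reference implementation from the cited papers; the function name `madl` and the sample returns are made up for the example.

```python
import numpy as np

def madl(realized, predicted):
    """Mean Absolute Directional Loss for paired realized/predicted returns."""
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if realized.shape != predicted.shape:
        raise ValueError("realized and predicted must have the same shape")
    # sign(R) * sign(R_hat) equals sign(R * R_hat) but cannot underflow to 0
    # when both returns are tiny: +1 if directions agree, -1 if they differ,
    # 0 if either return is exactly zero.
    alignment = np.sign(realized) * np.sign(predicted)
    return float(np.mean(-alignment * np.abs(realized)))

# Illustrative (made-up) returns: only the third forecast has the wrong sign.
realized = [0.02, -0.01, 0.03, -0.04]
predicted = [0.01, -0.02, -0.01, -0.01]
loss = madl(realized, predicted)  # (-0.02 - 0.01 + 0.03 - 0.04) / 4, about -0.01
```

The negative result reflects that the correctly called moves outweigh the single wrong call in absolute return terms.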

2. Conceptual Motivation and Relationship to T-risk Frameworks

Traditional loss functions such as MSE, MAE, and RMSE measure only the size of the prediction error and ignore whether the forecast has the correct sign. Financial decision-making, however, often depends on alignment between predicted and realized return signs, because that alignment determines whether a trade makes or loses money. MADL addresses this by building directional accuracy into the loss itself.
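A small numeric example makes the distinction concrete. The numbers below are invented for illustration: two forecasts miss the realized return by the same absolute amount, so MAE cannot tell them apart, yet only one of them points in the right direction.

```python
import numpy as np

realized = np.array([0.01])        # the asset actually rose by 1%
forecast_up = np.array([0.03])     # overshoots, but the direction is right
forecast_down = np.array([-0.01])  # same absolute error, wrong direction

def mae(y, y_hat):
    return float(np.mean(np.abs(y_hat - y)))

def madl(y, y_hat):
    return float(np.mean(-np.sign(y) * np.sign(y_hat) * np.abs(y)))

mae_up, mae_down = mae(realized, forecast_up), mae(realized, forecast_down)
madl_up, madl_down = madl(realized, forecast_up), madl(realized, forecast_down)
# MAE is about 0.02 for both forecasts; MADL is -0.01 for the first
# (a gain) and +0.01 for the second (a loss).
```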

MADL can be situated as a specific instance within the broader class of threshold risks (“T-risks”), which employ symmetric dispersion functions to penalize deviations from a threshold in both directions (Holland, 2022). The general T-risk formulation is:

$R_{\rho}(h;\theta, \eta) = \eta \cdot \theta + \mathbb{E}\left[\rho(\ell(h) - \theta)\right]$

where $h$ is the candidate model, $\ell(h)$ is its loss, $\theta$ is the threshold, $\eta$ weights the threshold term, and $\rho(\cdot)$ is a symmetric (even) function whose growth on both sides of zero is controlled via a shape parameter. This approach provides bi-directional dispersion: penalties are assigned for both upside and downside deviations, tuning sensitivity to gain or loss extremes. By adopting a dispersion function appropriate to the MADL context (piecewise linear and sign-dependent), the T-risk framework generalizes MADL and enables smooth trade-offs between tail sensitivity and central tendency.
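As a rough sketch of how the T-risk expression is evaluated on a sample, the expectation can be replaced by a sample mean. The dispersion function below, $\rho(x) = \sqrt{1 + x^2} - 1$, is an assumption chosen only because it is symmetric and simple; it has no shape parameter and is not claimed to be the family used by Holland (2022). The function names are likewise hypothetical.

```python
import numpy as np

def rho(x):
    # Assumed dispersion function: even, roughly quadratic near zero
    # and roughly linear in the tails. Chosen for illustration only.
    return np.sqrt(1.0 + np.square(x)) - 1.0

def empirical_t_risk(losses, theta, eta):
    """Sample estimate of eta * theta + E[rho(loss - theta)]."""
    losses = np.asarray(losses, dtype=float)
    return float(eta * theta + np.mean(rho(losses - theta)))

# Deviations of equal size above and below the threshold are penalized equally.
above = empirical_t_risk([0.5 + 0.2], theta=0.5, eta=0.0)
below = empirical_t_risk([0.5 - 0.2], theta=0.5, eta=0.0)
```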

3. Practical Application in Algorithmic Investment Strategies

MADL has been principally developed for and empirically validated within machine learning pipelines for quantitative trading strategy development. In such workflows:

Forecasts of instrument returns are produced using models such as LSTM or Transformer neural networks.
MADL is used as both the training loss and hyperparameter tuning criterion.
The resulting model directly optimizes for the profitability and quality of trading signals, minimizing the expected loss as measured by MADL.

Typical use involves constructing buy/sell signals based on the sign of forecasts relative to zero. MADL’s focus on directionally correct decisions ensures that the trained models prefer forecasts yielding higher economic utility, as opposed to mere numeric accuracy.
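The link between sign-based signals and MADL can be checked directly. The sketch below assumes an always-in-the-market long/short rule with no transaction costs, and uses made-up returns; under those assumptions the average per-period strategy return is exactly the negative of MADL, because $\textrm{sign}(\hat{R}_i) \times R_i = \textrm{sign}(R_i \times \hat{R}_i) \times |R_i|$.

```python
import numpy as np

realized = np.array([0.02, -0.01, 0.03, -0.04])
predicted = np.array([0.01, -0.02, -0.01, -0.01])

# +1 = long, -1 = short, 0 = stay flat when the forecast is exactly zero
signals = np.sign(predicted)

# Per-period return of following the signal (no costs, no leverage)
strategy_returns = signals * realized
mean_strategy_return = float(np.mean(strategy_returns))

# MADL computed from its definition
madl_value = float(np.mean(-np.sign(realized) * np.sign(predicted) * np.abs(realized)))
```

Minimizing MADL is therefore equivalent, in this simplified setting, to maximizing the average simple return of the signal-following strategy.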

Empirical studies on assets including Bitcoin, crude oil, and equity indices indicate that models trained and selected under MADL (and evaluated on MADL-consistent validation sets) achieve higher annualized return, lower maximum drawdown, and materially better modified information ratios ($IR^*$, $IR^{**}$) than those trained using standard MAE or MSE losses (Michańków et al., 2023, Michańków et al., 22 Jul 2025).
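For readers who want to reproduce this kind of comparison, the sketch below computes a few such statistics from a series of per-period strategy returns. The formulas are common conventions and are assumptions here: compounded annualized return, annualized standard deviation, maximum drawdown of the equity curve, and $IR^*$ taken as annualized return divided by annualized standard deviation. The exact definitions in the cited papers may differ, and $IR^{**}$ is omitted because its formula is not given in this text. The function name and `periods_per_year` default are hypothetical.

```python
import numpy as np

def performance_summary(returns, periods_per_year=252):
    """Summary statistics for a series of simple per-period returns."""
    returns = np.asarray(returns, dtype=float)
    equity = np.cumprod(1.0 + returns)  # equity curve starting from 1.0
    # Annualized return compounded (ARC)
    arc = equity[-1] ** (periods_per_year / len(returns)) - 1.0
    # Annualized standard deviation (ASD) of returns
    asd = np.std(returns, ddof=1) * np.sqrt(periods_per_year)
    # Maximum drawdown: largest relative drop from a running peak,
    # where the peak includes the initial capital of 1.0
    running_peak = np.maximum.accumulate(np.concatenate(([1.0], equity)))[1:]
    max_drawdown = float(np.max(1.0 - equity / running_peak))
    ir_star = float(arc / asd) if asd > 0 else float("nan")
    return {"arc": float(arc), "asd": float(asd),
            "max_drawdown": max_drawdown, "ir_star": ir_star}

# Made-up example: three periods treated as one "year"
summary = performance_summary([0.10, -0.50, 0.10], periods_per_year=3)
```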

4. Comparative Performance and Empirical Results

On Bitcoin return forecasting, a MADL-trained LSTM achieved an annualized return compounded (ARC) of 109.94, compared to 44.26 for a standard MAE-based approach (Michańków et al., 2023). In experiments comparing Transformer and LSTM models across equity and cryptocurrency datasets, Transformer models trained with MADL consistently outperformed their LSTM counterparts in return and risk-adjusted metrics, with more stable equity curves and lower volatility and drawdowns (Michańków et al., 22 Jul 2025).

The use of MADL may also help models avoid overfitting to noise, as directional correctness tends to be more robust to non-stationarity in financial data than point-wise fit. The reported results show superior performance over buy-and-hold and traditional metric-optimized strategies, supporting the utility of MADL in walk-forward schemes and out-of-sample evaluation.
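A walk-forward scheme repeatedly trains on a past window and evaluates on the period that immediately follows, then rolls both windows forward. The generator below is one minimal way to produce such splits; the fixed-length rolling window, the step equal to the test length, and the function name are assumptions for illustration, not the exact protocol of the cited studies.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) ranges that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Advance by one test window so test periods never overlap
        start += test_size

splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
# Three splits: train 0-3 / test 4-5, train 2-5 / test 6-7, train 4-7 / test 8-9
```

Each test window lies strictly after its training window, so no future information leaks into model fitting.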

5. Extensions and Generalizations: GMADL

While MADL offers clear interpretive and operational advantages, it is poorly suited to gradient-based optimization. Because the sign function is piecewise constant, the loss has zero gradient with respect to $\hat{R}_i$ wherever $R_i \hat{R}_i \neq 0$ and jumps where the product changes sign, which can complicate training in high-frequency or large-scale settings.

To address this, the Generalized Mean Absolute Directional Loss (GMADL) introduces a differentiable sigmoid-based proxy for the sign function and an exponent parameter for the magnitude term:

$\textrm{GMADL} = \frac{1}{N} \sum_{i=1}^N \left[ - \left( \frac{1}{1 + \exp(-a (R_i \hat{R}_i))} - 0.5 \right) \cdot |R_i|^b \right]$

where parameter $a$ tunes sensitivity around zero and $b$ allows for emphasizing or downweighting extreme returns (Michańków et al., 24 Dec 2024).
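The GMADL formula can be sketched in NumPy as follows. The function name and the sample values of $a$ and $b$ are illustrative assumptions, not recommended settings. The code uses the identity $\frac{1}{1+e^{-z}} - 0.5 = \frac{1}{2}\tanh(z/2)$, which is mathematically equivalent to the sigmoid form and avoids overflow in the exponential for large $|z|$.

```python
import numpy as np

def gmadl(realized, predicted, a, b):
    """Generalized MADL with sigmoid slope a and magnitude exponent b."""
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # sigmoid(z) - 0.5 rewritten as 0.5 * tanh(z / 2) for numerical stability
    soft_alignment = 0.5 * np.tanh(0.5 * a * realized * predicted)
    return float(np.mean(-soft_alignment * np.abs(realized) ** b))

realized = [0.02, -0.01, 0.03, -0.04]
predicted = [0.01, -0.02, -0.01, -0.01]

smooth = gmadl(realized, predicted, a=1000.0, b=1.0)
# With a very steep sigmoid and b = 1, the soft alignment saturates at +/-0.5,
# so GMADL approaches one half of MADL (MADL is about -0.01 for these data).
steep = gmadl(realized, predicted, a=1e8, b=1.0)
```

Because the sigmoid term is bounded by $\pm 0.5$, GMADL values are on a different scale from MADL values and should not be compared directly.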

GMADL enhances numerical stability, enables smooth optimization, and allows explicit tuning for transaction-cost-aware strategies by adjusting the frequency and economic impact of generated trading signals. Empirical results show that GMADL outperforms both MADL and MSE-type losses in risk-weighted return while reducing transaction frequency, addressing practical concerns of overfitting and cost-effectiveness (Michańków et al., 24 Dec 2024).
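The smoothness can be made explicit by differentiating the GMADL formula with respect to a single forecast. Writing $\sigma(z) = 1/(1+e^{-z})$ and using $\sigma'(z) = \sigma(z)(1-\sigma(z))$, a direct application of the chain rule (a derivation sketch, not a result quoted from the cited paper) gives:

$\frac{\partial\,\textrm{GMADL}}{\partial \hat{R}_i} = -\frac{a}{N}\, R_i\, |R_i|^b\, \sigma(a R_i \hat{R}_i)\left(1 - \sigma(a R_i \hat{R}_i)\right)$

The gradient is nonzero whenever $R_i \neq 0$ and has the opposite sign to $R_i$, so a gradient-descent step moves $\hat{R}_i$ toward the sign of the realized return. Its size scales with $|R_i|^{b+1}$, and it is largest where $R_i \hat{R}_i$ is close to zero, that is, for forecasts near the decision boundary.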

6. Limitations and Future Research Directions

MADL, by construction, does not explicitly penalize the magnitude of losses beyond their absolute value; extremely large errors are not squared or otherwise upweighted, which may understate risk in volatile regimes. The literature proposes squared or otherwise modified versions as possible enhancements (Michańków et al., 2023). Furthermore, MADL and GMADL’s ability to capture the trade-off between signal frequency, transaction costs, and economic outcome is subject to model and market context; current versions do not natively integrate explicit transaction cost or slippage models (Michańków et al., 24 Dec 2024).

Future research directions include:

Integration with more nuanced cost models,
Real-time parameter adaptation based on regime detection,
Coupling with additional risk measures (such as Sharpe ratio maximization) for holistic trading system optimization,
Application in non-financial directional prediction domains.

Loss Function	Core Formula	Directionality Sensitive	Differentiable
MSE	$\frac{1}{N}\sum(\hat{y}_i-y_i)^2$	No	Yes
MAE	$\frac{1}{N}\sum|\hat{y}_i-y_i|$	No	Yes
MADL	$\frac{1}{N}\sum -\text{sign}(R_i \hat{R}_i)\,|R_i|$	Yes	No
GMADL	$\frac{1}{N}\sum -\left(\frac{1}{1+e^{-aR_i \hat{R}_i}}-0.5\right)|R_i|^b$	Yes	Yes

MADL and its generalizations tie model optimization directly to economically meaningful outcomes. This distinguishes them from traditional error metrics and, according to the cited studies, supports the development of more profitable and robust algorithmic strategies.