- The paper presents MacroHFT, which leverages a two-phase training procedure with market decomposition and context-aware RL to mitigate overfitting in volatile markets.
- It integrates a hyper-agent with memory augmentation to consolidate sub-agent decisions, achieving superior results such as a 39.28% total return on ETH versus 18.02% for the strongest baseline (EarnHFT).
- The methodology’s efficacy in enhancing risk-adjusted returns and managing volatility offers promising implications for high-frequency trading and broader RL applications.
MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High-Frequency Trading
Overview
The paper entitled "MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading" introduces a novel framework aimed at addressing the challenges faced by High-Frequency Trading (HFT) in cryptocurrency markets. These challenges include overfitting, rapid market changes, and highly biased investment decisions. The proposed approach, MacroHFT, incorporates Reinforcement Learning (RL) techniques with an emphasis on market context-awareness and memory augmentation to generate consistently profitable trading strategies.
Methodology
The authors propose a two-phase training procedure that leverages both market decomposition and context-aware RL:
- Market Decomposition and Sub-Agent Training:
- The market data is decomposed along two indicators, trend and volatility, each split into three regimes: bull, medium, and bear for trend, and volatile, medium, and stable for volatility, yielding six market categories in total.
- For each market category, a sub-agent is trained with a Double Deep Q-Network (DDQN) using a dueling architecture. Each sub-agent is equipped with a conditional adapter that adjusts its trading policy based on the current market indicator and the agent's position, thereby reducing overfitting (a minimal sketch of such a conditioned Q-network follows this list).
- Meta-Policy and Hyper-Agent Training:
- A hyper-agent is trained to integrate the decisions made by the sub-agents, resulting in a meta-policy that adapts to rapid market fluctuations and aims for consistent profitability.
- The hyper-agent utilizes a memory mechanism to store and retrieve relevant experiences, enhancing decision-making during extreme market changes (a memory-lookup sketch also follows this list).
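To make the sub-agent design more concrete, the following is a minimal PyTorch sketch of a dueling Q-network whose hidden features are modulated by a conditional adapter fed with a context vector (e.g., a market trend/volatility indicator plus the current position). All module and parameter names (ConditionalAdapter, hidden_dim, etc.) are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (not the authors' code): a dueling Q-network whose hidden
# features are scaled/shifted by a conditional adapter, as described above.
import torch
import torch.nn as nn

class ConditionalAdapter(nn.Module):
    """Modulates state features conditioned on a context vector
    (e.g., market trend/volatility indicator plus current position)."""
    def __init__(self, feat_dim: int, ctx_dim: int):
        super().__init__()
        self.scale = nn.Linear(ctx_dim, feat_dim)
        self.shift = nn.Linear(ctx_dim, feat_dim)

    def forward(self, feats: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        return feats * torch.sigmoid(self.scale(ctx)) + self.shift(ctx)

class DuelingSubAgent(nn.Module):
    """Dueling Q-network for one market category (trained with DDQN targets)."""
    def __init__(self, state_dim: int, ctx_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
        self.adapter = ConditionalAdapter(hidden_dim, ctx_dim)
        self.value = nn.Linear(hidden_dim, 1)               # V(s)
        self.advantage = nn.Linear(hidden_dim, n_actions)   # A(s, a)

    def forward(self, state: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        h = self.adapter(self.encoder(state), ctx)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)         # dueling aggregation
```

A DDQN update would pair this online network with a slowly updated target copy and bootstrap from Q_target(s', argmax_a Q_online(s', a)), the standard recipe for curbing value overestimation.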
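Similarly, the memory mechanism can be pictured as an episodic buffer keyed by state embeddings and queried by nearest-neighbour lookup at decision time. The sketch below is an assumed, simplified reading of that idea; the class name EpisodicMemory, the cosine-similarity retrieval, and the eviction rule are illustrative choices, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): an episodic memory the hyper-agent
# could query to blend stored value estimates into its current decision.
import numpy as np

class EpisodicMemory:
    """Fixed-size buffer of (state-embedding, value-vector) pairs with
    nearest-neighbour retrieval by cosine similarity."""
    def __init__(self, capacity: int = 4096):
        self.capacity = capacity
        self.keys, self.values = [], []

    def write(self, key: np.ndarray, value: np.ndarray) -> None:
        if len(self.keys) >= self.capacity:      # evict the oldest entry
            self.keys.pop(0)
            self.values.pop(0)
        self.keys.append(key)
        self.values.append(value)

    def read(self, query: np.ndarray, k: int = 5) -> np.ndarray:
        """Similarity-weighted average of the k closest stored values
        (assumes the memory is non-empty)."""
        keys = np.stack(self.keys)
        sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8)
        top = np.argsort(sims)[-k:]
        weights = np.exp(sims[top])
        weights /= weights.sum()
        return np.average(np.stack(self.values)[top], axis=0, weights=weights)
```

At decision time, the hyper-agent's own value estimate could be blended with memory.read(embedding), letting experiences from similar past market situations influence the current action.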
Numerical Results and Claims
The authors conduct extensive experiments on four cryptocurrency markets (BTC, ETH, DOT, LTC) and benchmark the proposed method against eight state-of-the-art baseline approaches, including PPO, DDQN, and rule-based methods like MACD and Imbalance Volume.
Key performance metrics include Total Return (TR), Annual Sharpe Ratio (ASR), Annual Calmar Ratio (ACR), Annual Sortino Ratio (ASoR), Annual Volatility (AVOL), and Maximum Drawdown (MDD); a sketch of their standard definitions follows the list below. Across all datasets, MacroHFT exhibited superior performance:
- Profitability: The proposed method achieved higher total returns compared to all baselines, e.g., for ETH it attained a TR of 39.28%, while the closest competitor (EarnHFT) had an 18.02% return.
- Risk-Adjusted Returns: MacroHFT reported higher ASR, ACR, and ASoR values, indicating improved returns per unit risk, highlighting its robustness in volatile markets.
- Risk Management: The method displayed competitive or superior performance in managing maximum drawdowns and annual volatility.
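For reference, the reported risk and return metrics follow standard definitions. The sketch below computes them from a series of per-period simple returns; the annualization factor is an assumption that depends on the bar frequency (e.g., roughly 525,600 periods per year for minute-level bars, as in evaluate(minute_returns, periods_per_year=525_600)).

```python
# Standard definitions behind the reported metrics; the annualization
# factor is an assumption tied to the data frequency.
import numpy as np

def evaluate(returns: np.ndarray, periods_per_year: int) -> dict:
    """Compute TR, AVOL, MDD, and annualized Sharpe/Sortino/Calmar ratios
    from a series of per-period simple returns."""
    equity = np.cumprod(1 + returns)                 # equity curve
    total_return = equity[-1] - 1
    ann_return = np.mean(returns) * periods_per_year
    ann_vol = np.std(returns) * np.sqrt(periods_per_year)
    neg = returns[returns < 0]                       # downside deviation for Sortino
    downside = np.std(neg) * np.sqrt(periods_per_year) if neg.size else 0.0
    drawdown = 1 - equity / np.maximum.accumulate(equity)
    mdd = drawdown.max()
    return {
        "TR": total_return,
        "AVOL": ann_vol,
        "MDD": mdd,
        "ASR": ann_return / ann_vol if ann_vol > 0 else np.nan,
        "ASoR": ann_return / downside if downside > 0 else np.nan,
        "ACR": ann_return / mdd if mdd > 0 else np.nan,
    }
```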
Implications and Future Work
The practical implications of MacroHFT are significant for algorithmic trading in high-frequency domains, particularly in unpredictable and highly volatile cryptocurrency markets. The approach is adept at mitigating the overfitting risks common in RL-based trading algorithms, and the memory augmentation component further provides resilience against extreme market fluctuations, making the trading strategy more robust.
Theoretically, the adaptability and context-awareness embedded in MacroHFT could open new pathways for research into RL applications in other domains beyond finance, such as autonomous systems and real-time decision-making tasks. Future research could explore the integration of more complex contextual information and the application of advanced neural network architectures to further enhance the adaptability and performance of RL agents in dynamic environments.
Conclusion
The MacroHFT framework represents a significant advancement in high-frequency trading by addressing critical challenges such as overfitting and biased decision-making through memory-augmented, context-aware RL. The superior performance metrics across multiple cryptocurrency markets underline its efficacy and potential for broad application in financial trading systems. This research sets a foundation for future explorations and improvements at the intersection of RL and financial technology.
For further inquiry and reproducibility, the authors have released their codebase in a public GitHub repository, offering the research community an opportunity to build upon and validate their findings.