- The paper presents a novel deep reinforcement learning approach using a model-less CNN to directly output optimal portfolio weights from historical cryptocurrency data.
- It employs deterministic policy gradient methods with continuous action spaces, validated by backtests on 12 leading coins over multiple time periods.
- The model achieved competitive performance with up to ten-fold returns in 1.8 months and lower maximum drawdowns, highlighting its potential for agile, risk-sensitive trading.
Deep Reinforcement Learning for Cryptocurrency Portfolio Management
The paper "Cryptocurrency Portfolio Management with Deep Reinforcement Learning" by Zhengyao Jiang and Jinjun Liang introduces a novel approach to financial portfolio management in the setting of cryptocurrency trading. The authors employ a model-less convolutional neural network (CNN) that takes historical price data as input and outputs portfolio weights directly. This diverges from traditional prediction-based financial models: the network produces portfolio weight vectors without requiring an explicit market model or prior financial assumptions.
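The core structural idea, a network that scores each asset from its recent price history and normalises the scores into weights summing to one, can be sketched as follows. The scoring function and layer details here are illustrative assumptions, not the paper's actual CNN architecture:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax: non-negative outputs summing to 1."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def portfolio_weights(price_history, score_fn):
    """Map a (n_assets, window) price-history matrix to portfolio weights.

    score_fn stands in for the learned CNN: it assigns one score per asset.
    The softmax guarantees a valid long-only weight vector.
    """
    scores = np.array([score_fn(p) for p in price_history])
    return softmax(scores)

# Illustrative hand-written score (recent momentum), NOT the paper's learned model
momentum = lambda p: np.log(p[-1] / p[0])

history = np.array([[1.0, 1.1, 1.2],   # asset trending up
                    [1.0, 0.9, 0.8]])  # asset trending down
w = portfolio_weights(history, momentum)
```

The softmax output layer is what makes the network's output directly interpretable as a portfolio allocation, with no separate price-prediction step.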
Methodology
Central to the approach is a deterministic policy gradient applied within a deep reinforcement learning framework. The model directly optimizes the reward function, the accumulated return, over a continuous action space, rather than relying on the discrete or probability-based predictions common in conventional RL applications. The network was trained on 0.7 years of price data from a cryptocurrency exchange, underscoring the reliance on historical price sequences to inform future portfolio allocation decisions.
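The reward being optimized can be illustrated as the mean log return of the portfolio over the training window. This sketch omits transaction costs, which the paper's full reward accounts for, and is a simplified assumption rather than the exact objective:

```python
import numpy as np

def average_log_return(weights, price_relatives):
    """Mean log return of a portfolio over T periods.

    weights:          (T, n_assets) weights chosen at the start of each period
    price_relatives:  (T, n_assets) close_t / close_{t-1} for each asset

    Summing log period-returns equals the log of the cumulative return, so
    maximizing this quantity maximizes final portfolio value. Because the
    weights enter differentiably, gradient ascent on the network parameters
    can optimize it directly -- the essence of a deterministic policy gradient.
    """
    period_returns = (weights * price_relatives).sum(axis=1)
    return float(np.log(period_returns).mean())
```

For example, an even split across an asset returning +20% and one returning -20% yields a period return of exactly 1.0, hence a log return of 0.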
Experimental Setup
The authors carried out backtest experiments over three distinct time periods, simulating trades on a cryptocurrency exchange. They restricted the portfolio to the 12 coins with the highest trading volume to limit liquidity risk and market impact. The CNN's performance was evaluated against a range of benchmarks, from traditional strategies such as the Uniform Constant Rebalanced Portfolio to more recent online algorithms such as Online Newton Step.
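A minimal rebalancing backtest of the kind described might look like the following. The 0.25% proportional fee is an assumed typical exchange commission, not necessarily the rate used in the paper's experiments:

```python
import numpy as np

def backtest(weights, price_relatives, fee=0.0025):
    """Simulate periodic rebalancing with a proportional transaction fee.

    weights:          (T, n_assets) target weights for each period
    price_relatives:  (T, n_assets) per-asset price ratios over each period
    Returns the final portfolio value, starting from 1.0.
    """
    value = 1.0
    prev_w = weights[0]
    for w, y in zip(weights, price_relatives):
        turnover = np.abs(w - prev_w).sum()   # fraction of capital traded
        value *= (1 - fee * turnover)         # pay commission on the turnover
        growth = float(np.dot(w, y))          # portfolio return this period
        value *= growth
        prev_w = w * y / growth               # weights drift with prices
    return value
```

With flat prices and constant weights the turnover is zero, so the value stays at 1.0; frequent large reallocations, by contrast, steadily erode value through fees, which is why trading volume and liquidity matter in the experimental design.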
Results
The CNN-based reinforcement learning model consistently delivered competitive performance, achieving ten-fold returns within 1.8 months in some backtests. It outperformed most benchmarks, including the Best Single Asset and Uniform Constant Rebalanced Portfolio strategies, although the Passive Aggressive Mean Reversion strategy marginally surpassed it in absolute return. Critically, the CNN model exhibited a lower maximum drawdown, offering a more prudent risk-return trade-off than the other strategies examined.
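Maximum drawdown, the risk measure on which the model compared favourably, is the largest peak-to-trough decline of the portfolio-value series:

```python
import numpy as np

def max_drawdown(values):
    """Largest peak-to-trough decline of a portfolio-value series.

    Returned as a fraction in [0, 1]; lower means a gentler worst-case loss.
    """
    values = np.asarray(values, dtype=float)
    peaks = np.maximum.accumulate(values)   # running high-water mark
    return float(((peaks - values) / peaks).max())
```

For instance, a series that rises to 2, falls back to 1, then climbs to 3 has a maximum drawdown of 0.5: the 50% fall from the earlier peak, regardless of the later recovery.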
Implications and Future Directions
This research contributes to the convergence of quantitative finance and machine learning, offering a data-driven, flexible mode of financial analysis applicable beyond cryptocurrency to other market instruments. By avoiding model-based assumptions, the architecture can adapt readily to evolving market conditions. Future work may explore additional market features, longer training datasets, or online training to mitigate the performance decay observed over extended backtest intervals.
With such enhancements and broader empirical validation, the deterministic policy gradient approach could reshape asset management practice, particularly in volatile, emergent markets such as cryptocurrencies, opening new directions for both academic research and practical automated trading systems.