DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation (2402.06656v1)

Published 5 Feb 2024 in q-fin.ST, cs.AI, and cs.LG

Abstract: Machine learning models have demonstrated remarkable efficacy and efficiency across a wide range of stock forecasting tasks. However, the inherent challenges of data scarcity, including a low signal-to-noise ratio (SNR) and data homogeneity, pose significant obstacles to accurate forecasting. To address these challenges, we propose a novel approach that uses artificial-intelligence-generated samples (AIGS) to enhance training. Specifically, we introduce DiffsFormer, a diffusion model with a Transformer architecture for generating stock factors. DiffsFormer is first trained on a large-scale source domain with conditional guidance to capture the global joint distribution. When presented with a specific downstream task, we employ DiffsFormer to augment training by editing existing samples; the editing step controls how far the generated data may deviate from the target domain. To evaluate the effectiveness of DiffsFormer-augmented training, we conduct experiments on the CSI300 and CSI800 datasets with eight commonly used machine learning models. The proposed method achieves relative improvements of 7.2% and 27.8% in annualized return ratio on the respective datasets. We further perform extensive experiments to examine DiffsFormer and its constituent components, elucidating how they address data scarcity and improve overall model performance. Our results demonstrate the efficacy of leveraging AIGS and the DiffsFormer architecture to mitigate data scarcity in stock forecasting.
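The abstract's central mechanism, augmenting training data by "editing" real samples rather than generating them from scratch, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration of that editing loop under standard DDPM assumptions, not the authors' implementation: the denoiser `eps_model`, the linear noise schedule, and the `t_edit` parameter are all assumptions introduced here for illustration. The idea is to diffuse a real stock-factor vector forward to an intermediate noise level and then denoise it back, so the result stays anchored to the original sample.

```python
# Hypothetical sketch of diffusion-based sample editing for augmentation.
# `eps_model(x_t, t)` is assumed to be a trained noise-prediction network
# (e.g., a Transformer denoiser); it is NOT part of the paper's released code.
import torch

def make_ddpm_schedule(T=1000, beta_start=1e-4, beta_end=2e-2):
    # Linear beta schedule; alpha_bars are cumulative products of (1 - beta).
    betas = torch.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    return betas, alphas, alpha_bars

@torch.no_grad()
def edit_samples(x0, eps_model, t_edit, T=1000):
    """Diffuse real factor vectors x0 forward to step t_edit < T, then
    denoise back to step 0. A small t_edit keeps augmented samples close
    to the target domain; a large t_edit allows more deviation (the
    'editing strength' the abstract refers to)."""
    betas, alphas, alpha_bars = make_ddpm_schedule(T)
    # Forward process in closed form: x_t ~ N(sqrt(a_bar)*x0, (1 - a_bar)*I).
    noise = torch.randn_like(x0)
    a_bar = alpha_bars[t_edit]
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
    # Reverse process: standard DDPM ancestral sampling from t_edit down to 0.
    for t in range(t_edit, -1, -1):
        t_batch = torch.full((x0.shape[0],), t, dtype=torch.long)
        eps = eps_model(x_t, t_batch)  # predicted noise (hypothetical model)
        mean = (x_t - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            x_t = mean + betas[t].sqrt() * torch.randn_like(x_t)
        else:
            x_t = mean
    return x_t  # synthetic factors near, but not identical to, x0
```

In this sketch, something like `x_aug = edit_samples(x_real, model, t_edit=200)` would produce conservative augmentations, while a larger `t_edit` trades fidelity to the target domain for diversity. This is one plausible reading of the controllable editing step the abstract describes; the paper itself should be consulted for the exact procedure and guidance scheme.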
