Combining Machine Learning Classifiers for Stock Trading with Effective Feature Extraction (2107.13148v3)
Abstract: The unpredictability and volatility of the stock market make it challenging to earn a substantial profit with any generalised scheme. Many previous studies have tried different techniques to build a machine learning model that can make a significant profit in the US stock market through live trading. However, very few studies have focused on the importance of finding the best features for a particular trading period. Our best approach used model performance to narrow the feature set from a total of 148 to about 30. Furthermore, the top 25 features were dynamically selected before each training run of our machine learning model. The model uses ensemble learning with four classifiers (Gaussian Naive Bayes, Decision Tree, Logistic Regression with L1 regularization, and Stochastic Gradient Descent) to decide whether to go long or short on a particular stock. Our best model traded daily between July 2011 and January 2019, generating a 54.35% profit. Finally, our work shows that mixtures of weighted classifiers perform better than any individual predictor at making trading decisions in the stock market.
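The weighted combination of classifiers described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen or how the votes are combined, so the weighted-sum rule, the function name `weighted_vote`, the classifier keys, and all numeric weights below are assumptions made purely for illustration, not the authors' method.

```python
from typing import Dict

LONG, SHORT = 1, -1  # encode the two trading decisions as signed votes


def weighted_vote(predictions: Dict[str, int], weights: Dict[str, float]) -> int:
    """Combine per-classifier long/short calls into a single decision.

    Each classifier's vote (+1 or -1) is scaled by its weight; the sign of
    the total decides the trade. Ties default to LONG (an arbitrary choice).
    """
    score = sum(weights[name] * vote for name, vote in predictions.items())
    return LONG if score >= 0 else SHORT


# Hypothetical weights, e.g. derived from each classifier's validation accuracy.
weights = {"gaussian_nb": 0.15, "decision_tree": 0.15, "logreg_l1": 0.55, "sgd": 0.15}

# Hypothetical predictions for one stock on one day.
predictions = {"gaussian_nb": SHORT, "decision_tree": SHORT, "logreg_l1": LONG, "sgd": SHORT}

decision = weighted_vote(predictions, weights)

# With equal weights the same votes reduce to a plain majority vote.
equal = {name: 0.25 for name in predictions}
majority = weighted_vote(predictions, equal)
```

In this made-up case the heavily weighted classifier outvotes the other three, so `decision` is long while the unweighted majority is short. That is the kind of difference a weighted mixture can introduce relative to simple voting.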