Detection of financial opportunities in micro-blogging data with a stacked classification system (2404.07224v1)
Abstract: Micro-blogging sources such as the Twitter social network provide valuable real-time data for market prediction models. Investors' opinions in this network follow the fluctuations of the stock markets and often include educated speculation on market opportunities that may influence the actions of other investors. In view of this, we propose a novel system to detect positive predictions in tweets, a type of financial emotion that we term "opportunities" and that is akin to "anticipation" in Plutchik's theory. Specifically, we seek high detection precision so that a financial operator can be presented with a substantial number of such tweets, differentiated from the rest of the financial emotions in our system. We achieve this with a three-layer stacked Machine Learning classification system with sophisticated features that result from applying Natural Language Processing techniques to extract valuable linguistic information. Experimental results on a dataset that has been manually annotated with financial emotion and ticker occurrence tags demonstrate that our system yields satisfactory and competitive performance in financial opportunity detection, with precision values up to 83%. This promising outcome supports the usability of our system as an aid to investors' decision making.