EmTract: Extracting Emotions from Social Media (2112.03868v3)
Abstract: We develop an open-source tool (EmTract) that extracts emotions from social media text and is tailored to financial contexts. To do so, we annotate ten thousand short messages from a financial social media platform (StockTwits) and combine them with open-source emotion data. We then take a pre-trained NLP model, DistilBERT, augment its embedding space with 4,861 additional tokens (emojis and emoticons), fit it first on the open-source emotion data, and then transfer it to our annotated financial social media data. Our model outperforms competing open-source state-of-the-art emotion classifiers, such as Emotion English DistilRoBERTa-base, on both human-annotated and ChatGPT-annotated data. Compared to dictionary-based methods, our methodology has three main advantages for research in finance. First, our model is tailored to financial social media text; second, it incorporates key aspects of social media data, such as non-standard phrases, emojis, and emoticons; and third, it operates by sequentially learning a latent representation that includes features such as word order, word usage, and local context. Using EmTract, we explore the relationship between investor emotions expressed on social media and asset prices. We show that firm-specific investor emotions are predictive of daily price movements. Our findings show that emotions and market dynamics are closely related, and we provide a tool to help study the role emotions play in financial markets.
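The vocabulary-augmentation step mentioned in the abstract can be illustrated with a minimal, self-contained sketch. This is not EmTract's code: the toy vocabulary, the three example emoji/emoticon tokens, the embedding dimension, the random initialization, and the helper names `augment_vocab` and `tokenize` are all assumptions made for illustration. The sketch only shows the general idea that each added token receives its own id and a fresh embedding row, so it is no longer collapsed into the unknown token.

```python
import random


def augment_vocab(vocab, embeddings, new_tokens, dim, seed=0):
    """Return copies of vocab and embeddings extended with new_tokens.

    vocab maps token -> row index; embeddings is a list of rows (lists of floats).
    New rows are randomly initialized here (an assumption); in a real model
    they would be learned during fine-tuning.
    """
    rng = random.Random(seed)
    vocab = dict(vocab)                          # do not mutate the caller's vocab
    embeddings = [row[:] for row in embeddings]  # copy each existing row
    for tok in new_tokens:
        if tok in vocab:
            continue                             # skip tokens already present
        vocab[tok] = len(embeddings)             # next free row index
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return vocab, embeddings


def tokenize(text, vocab, unk="[UNK]"):
    """Toy whitespace tokenizer: tokens missing from vocab map to the unk id."""
    return [vocab.get(tok, vocab[unk]) for tok in text.lower().split()]


DIM = 4  # hypothetical embedding size, chosen small for readability
base_vocab = {"[UNK]": 0, "to": 1, "the": 2, "moon": 3}
base_emb = [[0.0] * DIM for _ in base_vocab]

message = "to the moon 🚀 :)"
ids_before = tokenize(message, base_vocab)

new_vocab, new_emb = augment_vocab(base_vocab, base_emb, ["🚀", ":)", "😱"], DIM)
ids_after = tokenize(message, new_vocab)

print(ids_before)  # emoji and emoticon both fall back to [UNK]
print(ids_after)   # emoji and emoticon now have their own ids
```

With the Hugging Face `transformers` library, the analogous operations are `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; the abstract does not state whether EmTract uses exactly these calls.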