
Sentiment Analysis of Twitter Data for Predicting Stock Market Movements (1610.09225v1)

Published 28 Oct 2016 in cs.IR, cs.CL, and cs.SI

Abstract: Predicting stock market movements is a well-known problem of interest. Nowadays, social media perfectly represents public sentiment and opinion about current events. Twitter in particular has attracted a lot of attention from researchers studying public sentiment. Stock market prediction on the basis of public sentiment expressed on Twitter has been an intriguing field of research. Previous studies have concluded that the aggregate public mood collected from Twitter may well be correlated with the Dow Jones Industrial Average Index (DJIA). The thesis of this work is to observe how well the changes in the stock prices of a company, the rises and falls, are correlated with the public opinions being expressed in tweets about that company. Understanding an author's opinion from a piece of text is the objective of sentiment analysis. The present paper employs two different textual representations, Word2vec and N-gram, for analyzing the public sentiment in tweets. In this paper, we apply sentiment analysis and supervised machine learning principles to tweets extracted from Twitter and analyze the correlation between the stock market movements of a company and the sentiments in tweets. Put simply, positive news and tweets in social media about a company would definitely encourage people to invest in the stocks of that company, and as a result the stock price of that company would increase. At the end of the paper, it is shown that a strong correlation exists between the rises and falls in stock prices and the public sentiments in tweets.


The paper presents an empirical study of the correlation between public sentiment expressed on Twitter and fluctuations in stock market prices. It applies sentiment analysis and supervised machine learning to tweets about a specific company and compares them with that company's stock behavior, with Microsoft serving as the case study. The authors build a sentiment analyzer that uses both Word2Vec and N-gram textual representations to classify tweet sentiment as positive, negative, or neutral. The classifier, trained on a human-annotated dataset, achieved an accuracy comparable to the rate at which human annotators agree with one another on sentiment.

Methodology

The research uses a dataset of 250,000 tweets gathered over a year, collected with stock- and company-related keywords so that the tweets reflect public opinion about the company. The stock price data, sourced from Yahoo! Finance, were aligned with the sentiment data to evaluate correlations. Preprocessing involved tokenization, stopword removal, and regex matching, which strips elements that add noise rather than sentiment signal, such as URLs and emoticons.
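The preprocessing steps named above (regex matching, tokenization, stopword removal) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the stopword list, the regular expressions, and the function name `preprocess` are assumptions chosen for the example.

```python
import re
from typing import List

# Tiny illustrative stopword list (an assumption; the paper's list is not given here).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "it", "this"}

# Hypothetical patterns for the "regex matching" step.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet: str) -> List[str]:
    """Lowercase a tweet, strip noise with regexes, tokenize, and drop stopwords."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)        # remove URLs
    text = MENTION_RE.sub(" ", text)    # remove @user mentions
    text = NON_ALPHA_RE.sub(" ", text)  # remove '#', digits, punctuation, emoticon characters
    tokens = text.split()               # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]


if __name__ == "__main__":
    print(preprocess("Loving the new @Microsoft Surface! :) http://t.co/abc #MSFT"))
    # ['loving', 'new', 'surface', 'msft']
```

Stripping every non-letter character is a deliberately blunt choice for the sketch; a production pipeline might instead keep hashtags or map emoticons to sentiment tokens.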

The sentiment classification problem was approached using machine learning models, with features extracted via Word2Vec and N-gram methods. The Word2Vec approach was ultimately favored due to its robustness in preserving semantic relationships within text data. The correlation between public sentiment and stock price movements was analyzed by examining daily closing prices against sentiment scores, revealing a significant relationship warranting closer examination.
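To make the two textual representations concrete, here is a dependency-free sketch. The N-gram part counts unigrams and bigrams in a tokenized tweet. The Word2Vec part assumes per-word vectors are already available and averages them into one tweet vector; averaging is one common aggregation and is an assumption here, since this summary does not say how the paper combines word vectors. The function names and the toy embedding table are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def ngram_counts(tokens: Sequence[str], n_max: int = 2) -> Counter:
    """Count all n-grams of length 1..n_max; keys are tuples of tokens."""
    counts: Counter = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def average_vector(tokens: Sequence[str],
                   embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of known tokens; returns [] if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return []
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


if __name__ == "__main__":
    tokens = ["great", "quarter", "great"]
    print(ngram_counts(tokens))

    # Toy 2-dimensional "embeddings" standing in for trained Word2Vec vectors.
    toy = {"great": [1.0, 0.0], "quarter": [3.0, 2.0]}
    print(average_vector(["great", "quarter", "unknown"], toy))  # [2.0, 1.0]
```

Either representation yields a fixed feature vector per tweet (after mapping n-grams to vocabulary indices), which is the input a supervised classifier needs.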

Results

The sentiment classification model, trained on 3,216 manually annotated tweets, achieved an accuracy of over 70% with Word2Vec and slightly higher with N-gram. These accuracy rates are in line with typical agreement between human annotators on sentiment, which supports the model's reliability in this application domain. In the correlation analysis, sentiment patterns preceding stock price changes predicted the direction of the movement with an accuracy exceeding 69% using logistic regression, and LibSVM improved on that figure.
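The correlation step can be pictured as a binary classification problem: label each day by whether the closing price rose, then fit a classifier on sentiment features. The sketch below is illustrative only. The feature choice (daily fractions of positive and negative tweets), the toy data, and the from-scratch gradient-descent logistic regression are all assumptions made to keep the example self-contained; this summary does not specify the paper's exact features or windowing, and in practice a library implementation would be used.

```python
import math
from typing import List, Sequence, Tuple


def movement_labels(closes: Sequence[float]) -> List[int]:
    """1 if the close rose versus the previous day, else 0."""
    return [1 if today > prev else 0 for prev, today in zip(closes, closes[1:])]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 500) -> Tuple[List[float], float]:
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Predict 1 (price up) when the modeled probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    print(movement_labels([50.0, 51.0, 50.5, 50.5, 52.0]))  # [1, 0, 0, 1]

    # Toy features per day: [fraction of positive tweets, fraction of negative tweets].
    X = [[0.7, 0.1], [0.6, 0.2], [0.2, 0.6], [0.1, 0.7]]
    y = [1, 1, 0, 0]  # toy up/down labels, not real market data
    w, b = fit_logistic(X, y)
    print(predict(w, b, [0.8, 0.1]), predict(w, b, [0.1, 0.8]))  # 1 0
```

Accuracy figures like those reported above would come from evaluating such a classifier on held-out days, which the toy data here is far too small to demonstrate.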

Implications

This paper reinforces the potential of incorporating real-time social media sentiment analysis in financial market models. By providing an efficient sentiment analysis tool, the paper broadens the understanding of public opinion as an actionable factor in stock market predictions. Analysts and investors could leverage such insights to enhance decision-making processes, potentially increasing predictive accuracy over mere reliance on historical price data.

Future Work

Future efforts might include expanding sentiment data sources to other platforms like StockTwits and integrating conventional news sources to provide a more comprehensive measure of public sentiment. Expanding the manually annotated training dataset could further enhance model performance. These steps may refine sentiment analysis methodologies, making them even more valuable for financial forecasting.

In conclusion, the research provides a compelling case for the use of social media sentiment as an indicator of stock market movements, demonstrating significant progress in the integration of natural language processing techniques and machine learning in financial prediction models. This paper opens avenues for further exploration into more extensive datasets and additional data sources, laying groundwork for future advancements in stock market prediction models.

Authors
  1. Venkata Sasank Pagolu
  2. Kamal Nayan Reddy Challa
  3. Ganapati Panda
  4. Babita Majhi
Citations (337)