Sentiment Analysis of Twitter Data: A Survey of Techniques (1601.06971v3)

Published 26 Jan 2016 in cs.CL

Abstract: With the advancement and growth of web technology, a huge volume of data is available on the web for internet users, and a lot of data is generated too. The Internet has become a platform for online learning, exchanging ideas and sharing opinions. Social networking sites like Twitter, Facebook, Google+ are rapidly gaining popularity as they allow people to share and express their views about topics, have discussions with different communities, or post messages across the world. There has been a lot of work in the field of sentiment analysis of Twitter data. This survey focuses mainly on sentiment analysis of Twitter data, which is helpful for analyzing the information in tweets where opinions are highly unstructured, heterogeneous and are either positive or negative, or neutral in some cases. In this paper, we provide a survey and a comparative analysis of existing techniques for opinion mining like machine learning and lexicon-based approaches, together with evaluation metrics. Using various machine learning algorithms like Naive Bayes, Max Entropy, and Support Vector Machine, we provide research on Twitter data streams. General challenges and applications of Sentiment Analysis on Twitter are also discussed in this paper.


Authors

Vishal. A. Kharde
Prof. Sheetal. Sonawane

Citations (538)


Summary

Overview of Sentiment Analysis Techniques on Twitter Data

The paper "Sentiment Analysis of Twitter Data: A Survey of Techniques" by Kharde and Sonawane surveys methods for analyzing opinions expressed in tweets. The growing volume of sentiment-rich content on social media platforms like Twitter motivates the development and refinement of sentiment analysis (SA) techniques. Sentiment analysis classifies text as positive, negative, or neutral using natural language processing (NLP) tools and methods.

Sentiment Analysis Techniques

The paper categorizes sentiment analysis techniques into machine learning-based approaches and lexicon-based methods.

Machine Learning Approaches:
- Supervised Learning: Key techniques include Naive Bayes (NB), Support Vector Machines (SVM), and Maximum Entropy (MaxEnt). The authors describe applying these models to labeled datasets of tweets and indicate that features such as unigrams, bigrams, part-of-speech tags, and hashtags significantly influence classifier performance.
- Unsupervised Learning: This approach typically uses clustering to infer sentiment without labeled data, though it receives less emphasis in the paper.
Lexicon-Based Approaches:
- The lexicon-based approach depends on precompiled lists of polarity-assigned words, such as SentiWordNet. These methods are domain-independent but require comprehensive dictionaries to handle the vernacular and context-specific terms prevalent in Twitter data.
Hybrid Methods:
- Although less emphasized, combining machine learning with lexicon-based techniques could improve sentiment classification accuracy by addressing domain specificity and lexical variation.
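
The two main families described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `features` helper, the `NaiveBayes` class, the `TOY_LEXICON` dictionary, and the `lexicon_label` function are hypothetical names introduced here, and the tiny lexicon stands in for a real resource such as SentiWordNet. The classifier is a multinomial Naive Bayes with add-one smoothing over unigram and bigram counts; the lexicon scorer simply sums word polarities.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Return lowercased unigrams plus adjacent-word bigrams."""
    tokens = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # feature counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for feat in features(text):
                if feat in self.vocab:               # skip features unseen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy lexicon; a real system would load scores from a
# resource such as SentiWordNet instead.
TOY_LEXICON = {"good": 1.0, "love": 1.0, "great": 1.0,
               "bad": -1.0, "hate": -1.0, "awful": -1.0}


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Sum word polarities and map the total to a sentiment label."""
    score = sum(lexicon.get(tok, 0.0) for tok in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up training tweets, for illustration only.
train_texts = ["i love this phone", "great battery life",
               "i hate the screen", "awful customer service"]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the battery"), lexicon_label("the phone arrived"))
```

A hybrid system along the lines mentioned above could, for example, fall back to `lexicon_label` when the classifier sees no known features, though that design is an assumption rather than something the survey prescribes.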

Evaluation Metrics and Challenges

The paper stresses the need for robust evaluation using metrics such as accuracy, precision, recall, and F1-score. The authors note several challenges inherent in sentiment analysis, notably sarcasm, contextual ambiguity, and noisy data such as the misspellings and non-standard grammar typical of social media text.
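
The metrics named above have standard definitions that a short Python sketch can make concrete. The function name `binary_metrics`, its `positive` parameter, and the example labels are assumptions introduced here for illustration; precision, recall, and F1 are computed for one class treated as the positive class.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up gold labels and predictions, for illustration only.
gold = ["positive", "positive", "negative", "negative", "positive"]
pred = ["positive", "negative", "negative", "positive", "positive"]
print(binary_metrics(gold, pred))
```

For a three-way positive/negative/neutral task, one common choice is to compute these per class and average them, but which averaging scheme to use depends on the evaluation setup.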

Implications and Future Directions

The paper underscores the practical applications of sentiment analysis in domains like business intelligence, recommendation systems, and social media monitoring. The authors speculate on improving SA by integrating cross-lingual techniques, given the multilingual nature of Twitter users. They also identify adaptive methods suited to dynamic, real-time data streams as an avenue for future research.

In summary, the survey by Kharde and Sonawane provides a detailed exposition of current sentiment analysis methodologies applicable to Twitter data. It acknowledges existing challenges and advocates for advancements that blend different analytical techniques, aiming to enhance the accuracy and applicability of sentiment classification in diverse, real-world settings. The survey lays a foundation for future research, encouraging the exploration and integration of more nuanced features and sophisticated algorithms in sentiment analysis endeavors.
