User-level sentiment analysis incorporating social networks (1109.6018v1)

Published 27 Sep 2011 in cs.CL, cs.IR, physics.data-an, and physics.soc-ph

Abstract: We show that information about social relationships can be used to improve user-level sentiment analysis. The main motivation behind our approach is that users that are somehow "connected" may be more likely to hold similar opinions; therefore, relationship information can complement what we can extract about a user's viewpoints from their utterances. Employing Twitter as a source for our experimental data, and working within a semi-supervised framework, we propose models that are induced either from the Twitter follower/followee network or from the network in Twitter formed by users referring to each other using "@" mentions. Our transductive learning results reveal that incorporating social-network information can indeed lead to statistically significant sentiment-classification improvements over the performance of an approach based on Support Vector Machines having access only to textual features.

Authors (6)

Chenhao Tan (89 papers)
Lillian Lee (40 papers)
Jie Tang (302 papers)
Long Jiang (26 papers)
Ming Zhou (182 papers)
Ping Li (421 papers)

Citations (477)

Summary

User-Level Sentiment Analysis Incorporating Social Networks

The paper explores how social network information can be integrated into user-level sentiment analysis, using Twitter as its data source. By exploiting connections between users, namely follower/followee relationships and "@" mentions, it shows that sentiment classification improves over a text-only baseline built on Support Vector Machines.

Sentiment analysis has become central to interpreting the large volume of user-generated content online. Existing approaches have mostly worked at the document or tweet level and have often disregarded the relational data that social platforms provide. Tan et al. instead incorporate social links, guided by the principle of homophily: the tendency of connected users to hold similar opinions.

Methodology and Models

The paper employs a semi-supervised framework with transductive learning. Two social networks inform the model: the follower/followee graph and the "@-mention" network. Edges in these networks define sentiment dependencies between users. The model is a factor graph that combines textual and network data: user-tweet factors tie each user's sentiment to the tweets they wrote, and user-user factors tie together the sentiments of connected users.
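The idea of combining user-tweet and user-user factors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names, the two scalar weights `w_text` and `w_edge`, the unit-valued agreement factors, and the use of iterated conditional modes (a simple greedy update) in place of whatever inference procedure the paper actually uses. Labeled users act as fixed seeds, and each unlabeled user takes the label that maximizes the factors touching it.

```python
from collections import defaultdict

POS, NEG = 1, -1

def local_score(user, label, labels, tweet_votes, neighbors, w_text, w_edge):
    """Sum of the (hypothetical) factors touching `user` if it takes `label`."""
    # User-tweet factor: one unit per tweet whose polarity matches the label.
    text = sum(1 for vote in tweet_votes.get(user, []) if vote == label)
    # User-user factor: one unit per neighbor currently holding the same label.
    net = sum(1 for other in neighbors[user] if labels[other] == label)
    return w_text * text + w_edge * net

def infer_labels(users, tweet_votes, edges, seeds, w_text=1.0, w_edge=0.6, sweeps=10):
    """Greedy transductive labeling; assumes every edge endpoint is in `users`."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    # Seeds keep their known label; other users start from their tweet majority.
    labels = {}
    for u in users:
        if u in seeds:
            labels[u] = seeds[u]
        else:
            labels[u] = POS if sum(tweet_votes.get(u, [])) >= 0 else NEG
    for _ in range(sweeps):
        changed = False
        for u in users:
            if u in seeds:
                continue
            best = max((POS, NEG), key=lambda y: local_score(
                u, y, labels, tweet_votes, neighbors, w_text, w_edge))
            if best != labels[u]:
                labels[u] = best
                changed = True
        if not changed:
            break
    return labels

# Toy data: user "c" wrote one negative-looking tweet but is linked to two
# positive users, so the network term can outweigh the text term.
users = ["a", "b", "c"]
seeds = {"a": POS}
tweet_votes = {"b": [POS, POS], "c": [NEG]}
edges = [("a", "b"), ("a", "c"), ("b", "c")]

with_network = infer_labels(users, tweet_votes, edges, seeds)
text_only = infer_labels(users, tweet_votes, edges, seeds, w_edge=0.0)
```

In this toy setup, setting `w_edge` to zero reduces the sketch to a per-user text vote, which is the kind of text-only baseline the network-aware model is compared against.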

The model's parameters are estimated in two ways: directly from simple statistics, and with the more involved SampleRank procedure, which is suited to semi-supervised learning scenarios. Having both options lets the model weight sparse, noise-prone tweet data against the relatively richer user-relationship data.
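As a rough illustration of what estimating a parameter from simple statistics could look like, the sketch below counts how often labeled, connected users share a label and turns that rate into a weight. The function name, the add-one smoothing, and the log-odds mapping are all assumptions chosen for the example; the summary does not give the paper's actual estimator, and SampleRank is not shown here.

```python
import math

def edge_agreement_weight(edges, labels):
    """Hypothetical count-based estimate of a user-user agreement weight.

    Only edges whose two endpoints are both labeled contribute.
    """
    same = diff = 0
    for u, v in edges:
        if u in labels and v in labels:
            if labels[u] == labels[v]:
                same += 1
            else:
                diff += 1
    # Add-one smoothing keeps the estimate finite when a count is zero.
    p_same = (same + 1) / (same + diff + 2)
    # Log-odds: positive when connected users tend to agree, negative otherwise.
    return math.log(p_same / (1 - p_same))

# Toy labeled subset: user "e" is unlabeled, so edge ("a", "e") is ignored.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]
toy_labels = {"a": 1, "b": 1, "c": 1, "d": -1}
weight = edge_agreement_weight(toy_edges, toy_labels)
no_evidence = edge_agreement_weight(toy_edges, {})
```

With no labeled pairs the smoothed agreement rate is one half, so the weight is zero and the network term would carry no influence.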

Quantitative Insights

Experiments show that incorporating social network information can yield statistically significant improvements in sentiment classification over text-only approaches. The gains are most notable with the directed follower/followee graph, suggesting that the attention or approval expressed by a follow link is more predictive of shared sentiment than mutual or @-mention connections alone. Performance varied across topics, which highlights the importance of edge quality: for some topics, connected users agreed in sentiment more often than for others.
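To make the notion of a statistically significant improvement concrete, here is a small sketch of a paired comparison between a text-only system and a network-aware one. The summary does not say which significance test the authors used, so the exact two-sided sign test below is an assumption picked for simplicity, and the accuracy values are invented toy numbers, not results from the paper.

```python
from math import comb

def sign_test_p(baseline, system):
    """Two-sided exact sign test on paired scores; ties are dropped."""
    wins = sum(s > b for b, s in zip(baseline, system))
    losses = sum(s < b for b, s in zip(baseline, system))
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of k or fewer of the rarer outcome under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented paired accuracies over eight hypothetical evaluation runs.
text_only_acc = [0.60, 0.62, 0.58, 0.61, 0.59, 0.63, 0.60, 0.62]
network_acc = [0.64, 0.65, 0.61, 0.66, 0.60, 0.67, 0.63, 0.66]
p_value = sign_test_p(text_only_acc, network_acc)
```

A sign test only looks at which system wins each pair, so it makes few distributional assumptions but ignores the size of each gain.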

Implications and Future Directions

Practically, integrating social network data into sentiment analysis systems could benefit applications in marketing, political strategy, and social science research. The research underscores the value of relational data for improving the accuracy and interpretability of sentiment prediction models.

Theoretically, the results invite further exploration of how social behaviors and structures can inform machine learning tasks, treating sentiment as a property of both text and relationships. Future research might compare these relational models on platforms beyond Twitter, analyze larger and more diverse datasets, or develop more complex models that can handle denser networks.

In summary, this work shows how the interconnectivity intrinsic to social networks can be used to refine sentiment analysis techniques, and it opens further avenues for research in artificial intelligence and data science.
