Better Document-level Sentiment Analysis from RST Discourse Parsing (1509.01599v2)

Published 4 Sep 2015 in cs.CL and cs.AI

Abstract: Discourse structure is the hidden link between surface features and document-level properties, such as sentiment polarity. We show that the discourse analyses produced by Rhetorical Structure Theory (RST) parsers can improve document-level sentiment analysis, via composition of local information up the discourse tree. First, we show that reweighting discourse units according to their position in a dependency representation of the rhetorical structure can yield substantial improvements on lexicon-based sentiment analysis. Next, we present a recursive neural network over the RST structure, which offers significant improvements over classification-based methods.



The paper "Better Document-level Sentiment Analysis from RST Discourse Parsing" by Parminder Bhatia, Yangfeng Ji, and Jacob Eisenstein uses discourse structure, specifically Rhetorical Structure Theory (RST) discourse parsing, to improve document-level sentiment analysis. The authors argue that sentiment analysis benefits from looking beyond surface lexical features to how discourse units interact within a hierarchical structure. By feeding the output of existing discourse parsers into sentiment analysis, the paper reports improvements for both lexicon-based and classifier-based methods.

Methodology

The paper introduces two methods for incorporating discourse information into sentiment analysis:

Discourse Depth Reweighting: This method weights the contribution of each elementary discourse unit (EDU) according to its position in a dependency-like representation of the discourse structure. The authors use a simple linear function to assign the weights, which can improve sentiment analysis by emphasizing the most important discourse units.
Recursive Neural Network over RST Structure: This second approach propagates sentiment information up the RST tree. A sentiment score is composed at each level of the discourse hierarchy from the scores of the units below it, and the score at the root gives the sentiment of the document.
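
As a rough illustration of depth-based reweighting, the following Python sketch scores a document with a toy lexicon and down-weights EDUs that sit deeper in the discourse dependency tree. The function names, the slope of 1/6, the floor of 0.5, the convention that the root EDU has depth 0, and the two-word lexicon are all assumptions chosen for illustration; the summary only states that the weights come from a simple linear function of position.

```python
from typing import Dict, List


def depth_weights(depths: List[int], slope: float = 1 / 6, floor: float = 0.5) -> List[float]:
    """Linearly decaying weight per EDU, clipped from below.

    `slope` and `floor` are illustrative assumptions, not values taken from the text.
    """
    return [max(floor, 1.0 - slope * d) for d in depths]


def document_score(edus: List[str], depths: List[int], lexicon: Dict[str, float]) -> float:
    """Sum the lexicon scores of each EDU, scaled by its depth-based weight."""
    total = 0.0
    for weight, edu in zip(depth_weights(depths), edus):
        # Unknown words contribute nothing to the EDU score.
        edu_score = sum(lexicon.get(token, 0.0) for token in edu.lower().split())
        total += weight * edu_score
    return total


if __name__ == "__main__":
    # Hypothetical toy lexicon and a two-EDU review.
    toy_lexicon = {"dull": -1.0, "superb": 1.0}
    edus = ["the plot is dull", "but the acting is superb"]
    depths = [1, 0]  # assumed: the second EDU is the root of the dependency tree
    print(document_score(edus, depths, toy_lexicon))
```

In this toy example the unweighted lexicon sum is zero, while the reweighted score is positive because the negative EDU sits one level below the root and is down-weighted.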

Both methods were evaluated on two movie review datasets. Discourse depth reweighting considerably improves sentiment prediction, especially for lexicon-based methods, which suggests that discourse structure is useful in this setting. The recursive neural network, though more complex, outperforms bag-of-words classifiers, providing evidence for compositional sentiment analysis over RST structures.
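
The idea of composing sentiment up an RST tree can be sketched in a few lines of Python. This is a simplified stand-in and not the authors' model: it assumes a binary tree whose internal nodes have a nucleus and a satellite child, scalar sentiment scores at the leaves, and one pair of hand-set weights per relation type combined through tanh. In a trained recursive network such weights would be learned from data; the `Node` class, the `compose` function, and the "contrast" weights below are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    score: float = 0.0                  # sentiment of a leaf EDU (ignored for internal nodes)
    relation: str = ""                  # discourse relation joining the two children
    nucleus: Optional["Node"] = None
    satellite: Optional["Node"] = None


def compose(node: Node, weights: Dict[str, Tuple[float, float]]) -> float:
    """Return the sentiment score at `node`, composed bottom-up from its children."""
    if node.nucleus is None or node.satellite is None:
        return node.score  # leaf: use the EDU's own score
    # Assumed default: weight both children equally when the relation is unknown.
    k_nucleus, k_satellite = weights.get(node.relation, (1.0, 1.0))
    combined = k_nucleus * compose(node.nucleus, weights) + k_satellite * compose(node.satellite, weights)
    return math.tanh(combined)


if __name__ == "__main__":
    # Hypothetical weights: in a contrast relation the satellite counts half as much.
    relation_weights = {"contrast": (1.0, 0.5)}
    tree = Node(relation="contrast", nucleus=Node(score=1.0), satellite=Node(score=-1.0))
    print(compose(tree, relation_weights))
```

With these assumed weights the positive nucleus outweighs the negative satellite, so the score at the root is positive even though the two leaf scores cancel in a plain sum.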

Key Results and Implications

The reported results show significant improvements over conventional sentiment analysis approaches. For example, discourse depth reweighting yielded a raw accuracy improvement of 4-5% for lexicon-based sentiment analysis.

The recursive neural network architecture showed an accuracy increase of 3% over the baseline on one dataset. This suggests that recursive methods that incorporate document-level discourse structure are beneficial for sentiment classification.

Implications and Future Directions

These findings demonstrate that incorporating RST discourse structure into sentiment analysis can improve accuracy, and they suggest further lines of inquiry and potential applications. Practically, sentiment analysis tools could be augmented with discourse parsing to provide more nuanced sentiment detection in domains such as social media analysis, customer reviews, and opinion mining.

Theoretically, this research opens the door to more sophisticated models that integrate other linguistic dimensions, including semantic and pragmatic elements, into sentiment analysis. As discourse parsers continue to improve, it is plausible that such models could capture not only sentiment polarity but also complex emotional states and other nuanced linguistic features.

In conclusion, this paper extends the methodological toolkit for sentiment analysis by demonstrating the utility of RST and recursive methods. Future research may explore further improvements to discourse parsing systems, the application of these methods in multilingual contexts, and the coupling of sentiment analysis with other document-level characteristics such as politeness or emotion detection.


Authors

Parminder Bhatia
Yangfeng Ji
Jacob Eisenstein
