- The paper introduces the CD algorithm that dissects LSTM outputs into interpretable components, quantifying phrase-level contributions akin to logistic regression coefficients.
- It leverages LSTM gating mechanisms to isolate contributions from target phrases and contextual interactions, enhancing transparency in deep learning models.
- Empirical evaluations on the Stanford Sentiment Treebank (SST) and Yelp Polarity datasets validate CD's ability to uncover compositional semantics such as negation, yielding clearer insight into sentiment predictions.
Contextual Decomposition: An Interpretative Approach for LSTMs
This paper introduces a new interpretative method, Contextual Decomposition (CD), designed to elucidate the decision processes of Long Short-Term Memory (LSTM) networks. LSTMs, pivotal in NLP, achieve strong performance by capturing non-linear relationships among features, but that same capability renders them opaque, 'black-box' systems. CD offers a transparent lens for examining the roles of particular words or phrases in LSTM predictions without altering the model's architecture.
Contribution and Methodology
The primary contribution of this paper is the CD algorithm, which dissects an LSTM's output into interpretable components reflecting the contributions of specific words or phrases. Unlike previous efforts that focus solely on word-level importance, the method provides insight into the interplay among variables within the LSTM, leveraging the recurrent gating mechanisms to parse interactions.
The paper elucidates the mechanics of CD by decomposing the cell and hidden state vectors, central to LSTM functioning, into two components: (i) contributions arising exclusively from a target phrase, and (ii) contributions formed by interactions with the surrounding context. This breakdown enables quantification of an individual phrase's influence on the predictive outcome, analogous to logistic regression coefficients, thus adding a layer of interpretability to LSTMs. A simplified sketch of this decomposition step appears below.
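To make the mechanics concrete, here is a minimal NumPy sketch of one piece of the idea: given a cell state already split into a phrase part (beta) and a context part (gamma), a Shapley-style averaging of tanh's marginal effects splits the hidden state the same way. This is an illustrative assumption-laden sketch, not the paper's reference implementation; the full CD algorithm applies this kind of linearization recursively to every gate and time step, and the `linearize_tanh` helper and array values here are invented for illustration.

```python
import numpy as np

def linearize_tanh(beta, gamma):
    """Split tanh(beta + gamma) into a part credited to beta and one to gamma.

    Averages each term's marginal effect over both orderings (a Shapley-style
    split); because tanh(0) = 0, the two parts sum exactly to tanh(beta + gamma).
    """
    part_beta = 0.5 * ((np.tanh(beta) - np.tanh(np.zeros_like(beta)))
                       + (np.tanh(beta + gamma) - np.tanh(gamma)))
    part_gamma = 0.5 * ((np.tanh(gamma) - np.tanh(np.zeros_like(gamma)))
                        + (np.tanh(beta + gamma) - np.tanh(beta)))
    return part_beta, part_gamma

# Toy cell state already decomposed into phrase (beta) and context (gamma) parts.
rng = np.random.default_rng(0)
beta_c, gamma_c = rng.normal(size=5), rng.normal(size=5)
o_t = 1.0 / (1.0 + np.exp(-rng.normal(size=5)))   # output gate, kept whole here

beta_h, gamma_h = linearize_tanh(beta_c, gamma_c)
h_t = o_t * np.tanh(beta_c + gamma_c)             # ordinary LSTM hidden state

# The decomposed parts reconstruct h_t exactly.
assert np.allclose(o_t * beta_h + o_t * gamma_h, h_t)
print("phrase contribution to h_t:", o_t * beta_h)
```

The key design point illustrated is that the decomposition is exact: the phrase and context parts always add back up to the ordinary LSTM hidden state, so interpretability comes at no cost to the prediction itself.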
Empirical Evaluation
The efficacy of CD is demonstrated on sentiment analysis tasks using the Stanford Sentiment Treebank (SST) and Yelp Polarity datasets. In these experiments, CD distinguishes words and phrases with opposing sentiments and correctly identifies both positive and negative negation, something prior interpretation techniques fail to capture.
Quantitatively, CD's word-level scores exhibit strong correlation with logistic regression coefficients, providing a measure of validation against simpler, interpretable models. Moreover, CD uncovers semantic dynamics within the data, such as the compositional negation of sentiment, underscoring its advantage over existing techniques like Integrated Gradients and Leave-One-Out.
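As a sketch of how such a validation can be set up: assuming per-word CD scores (each word scored as a one-word phrase) and the coefficients of a separately trained bag-of-words logistic regression have been aligned into two arrays, the agreement reduces to a simple correlation. The values below are placeholders for illustration, not results from the paper.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical aligned arrays: for each vocabulary word, its CD score when the
# word is treated as a one-word phrase, and the corresponding coefficient from
# a bag-of-words logistic regression sentiment model. Values are placeholders.
cd_scores = np.array([1.8, -2.1, 0.3, 2.5, -1.7])
lr_coefs  = np.array([1.5, -1.9, 0.1, 2.2, -2.0])

r, p_value = pearsonr(cd_scores, lr_coefs)
print(f"Pearson correlation between CD word scores and LR coefficients: {r:.2f}")
```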
Implications and Future Directions
CD significantly advances the interpretability of LSTMs, offering a tool not only for academic inquiry but also for practitioners seeking to improve model transparency and trustworthiness. Its methodology can shed light on whether the context-dependent behavior of neural models aligns with human intuition and expectations, a crucial step in the responsible development of AI systems.
Looking forward, the principles behind CD could extend to neural architectures beyond LSTMs, helping to decipher the complex decision processes characteristic of deep learning. Furthermore, CD can help close the gap between model interpretability and real-world applications, particularly in systems where human oversight and understanding of AI decisions are paramount.
In conclusion, by enhancing our understanding of the nuanced interactions within LSTM models, CD stands as a valuable addition to the growing set of tools focused on AI transparency and interpretability, with promising avenues for extension and application across diverse AI paradigms.