Interpretable Machine Learning in Text Classification: The Role of Relevance
The paper "What is Relevant in a Text Document?: An Interpretable Machine Learning Approach" introduces an interpretable approach to understanding machine learning decisions in text classification. The work primarily focuses on applying the Layer-wise Relevance Propagation (LRP) technique to decompose predictions of text classification models onto individual words, providing insights into what makes a model decide as it does.
Key Contributions and Findings
- Application of LRP in NLP: The authors adapt LRP, a technique originally developed for image classification, to NLP. The adaptation traces the predictions of both a convolutional neural network (CNN) and a bag-of-words support vector machine (BoW/SVM) back to individual words, quantifying how much each word contributes to the classification decision (a minimal sketch of the core propagation rule follows this list).
- Neural Network vs. SVM: The paper compares the CNN and the BoW/SVM both in classification performance and in the quality of the explanations derived from them. The two models achieve similar classification accuracy, but the CNN scores higher on the paper's novel measure of model explanatory power, indicating that its relevance scores capture more of the semantics underlying the classification task.
- Document Representation: The researchers generate vector-based document representations (document summary vectors) by combining the word-wise relevance scores obtained from LRP with word embeddings. These representations capture semantic information and exhibit semantic regularities that make them well suited to tasks such as K-nearest-neighbor classification (see the document-vector sketch after this list).
- Intrinsic and Extrinsic Validation: The paper validates the word relevances identified by LRP in two ways. Intrinsic validation deletes words in order of decreasing relevance and tracks the resulting drop in classification accuracy; informative relevance scores should make accuracy fall quickly (see the deletion-curve sketch after this list). For extrinsic validation, the authors introduce an explanatory power index (EPI) based on K-nearest-neighbor classification of the document summary vectors, reflecting how much task-relevant semantics the relevance scores extract.
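The propagation step at the heart of LRP can be illustrated with the widely used epsilon rule for a fully connected layer. The sketch below is not the authors' exact implementation (the paper also handles convolution and pooling layers); the shapes and variable names are illustrative assumptions.

```python
import numpy as np

def lrp_epsilon_dense(x, W, b, R_out, eps=0.01):
    """Redistribute the relevance R_out of a dense layer (y = x @ W + b)
    back onto its inputs with the LRP epsilon rule.

    x:     (d_in,)        layer input activations
    W:     (d_in, d_out)  weight matrix
    b:     (d_out,)       bias vector
    R_out: (d_out,)       relevance assigned to the layer outputs
    """
    z = x[:, None] * W                                  # z[i, j]: contribution of input i to output j
    z_j = z.sum(axis=0) + b                             # pre-activation of each output
    stabilizer = eps * np.where(z_j >= 0, 1.0, -1.0)    # avoids division by values near zero
    return (z / (z_j + stabilizer)[None, :]) @ R_out    # R_in[i] = sum_j z[i, j] / (z_j + eps) * R_out[j]

# Tiny demo with random numbers: use the layer's own output scores as the relevance to redistribute.
rng = np.random.default_rng(0)
x, W, b = rng.normal(size=5), rng.normal(size=(5, 3)), rng.normal(size=3)
R_in = lrp_epsilon_dense(x, W, b, R_out=np.maximum(x @ W + b, 0.0))
```

Applying such rules layer by layer from the classifier output down to the word-embedding inputs, and summing the relevance over each word's embedding dimensions, yields one relevance score per word.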
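The document summary vectors can be sketched as a relevance-weighted combination of the word embeddings. The exact weighting used in the paper may differ in detail; the function below, including the restriction to positive relevance, is a simplified, hypothetical variant.

```python
import numpy as np

def document_summary_vector(embeddings, relevance):
    """Combine per-word embeddings and per-word LRP relevance scores
    into a single document vector.

    embeddings: (n_words, d) embedding of each token in the document
    relevance:  (n_words,)   LRP relevance of each token
    """
    weights = np.maximum(relevance, 0.0)   # keep only positively relevant words
    return weights @ embeddings            # relevance-weighted sum, shape (d,)
```

The extrinsic EPI evaluation then amounts to building such vectors for a labeled corpus, training a K-nearest-neighbor classifier on them, and reporting its held-out accuracy.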
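The intrinsic word-deletion check can be sketched as follows. Here `predict_proba` stands in for whatever trained classifier is being explained, and "deleting" a word by zeroing its embedding row is a simplifying assumption rather than the paper's exact deletion scheme.

```python
import numpy as np

def deletion_curve(embeddings, relevance, true_label, predict_proba, k_max=None):
    """Track the model's confidence in the true class as the most relevant
    words are removed one at a time (highest relevance first).

    embeddings:    (n_words, d) input representation of one document
    relevance:     (n_words,)   per-word LRP relevance
    predict_proba: callable mapping an (n_words, d) array to class probabilities
    """
    order = np.argsort(relevance)[::-1]            # most relevant words first
    k_max = len(order) if k_max is None else k_max
    x = embeddings.copy()
    curve = [predict_proba(x)[true_label]]
    for idx in order[:k_max]:
        x[idx] = 0.0                               # "delete" the word
        curve.append(predict_proba(x)[true_label])
    return np.array(curve)                         # a steep drop indicates informative relevance
```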
Implications and Future Directions
This research has both theoretical and practical implications for NLP and machine learning.
- Improving Model Interpretability: The ability to trace classification outcomes to specific input words enhances the interpretability of complex NLP models. This advancement is likely to aid in building trust in automated systems by providing transparent decision-making processes, especially in sensitive applications like healthcare or legal systems.
- Advancements in Representation Learning: The creation of document summary vectors presents new opportunities in representation learning by proposing a pipeline that bridges the gap between word-level embeddings and document-level semantics. This approach encourages further exploration into refining semantic vector spaces for more nuanced understanding and retrieval of text data.
- Guiding Model Design: Insight into which input features a model actually relies on can guide targeted model adjustments and feature engineering, for instance pruning uninformative inputs, leading to leaner architectures that maintain performance.
Moving forward, future research might apply LRP to other model architectures such as transformer-based models, which have since become dominant in many NLP tasks. Extending these interpretability techniques to multilingual settings or to other data modalities could also broaden their reach beyond text processing, offering interpretability solutions for a wider range of AI systems.