"SPINE: SParse Interpretable Neural Embeddings" addresses a critical issue in the domain of neural networks: the interpretability of neural representations. While neural models have achieved considerable success due to their ability to learn complex, dense, and expressive representations, these representations are often opaque and difficult to interpret. This lack of interpretability limits the utility of such models, especially in applications where understanding the underlying reasoning behind predictions is crucial.
The authors propose a novel variant of denoising k-sparse autoencoders to produce efficient and interpretable distributed word representations (word embeddings). The key contribution is that the resulting embeddings are sparse and interpretable without sacrificing performance on downstream tasks.
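To make the general idea concrete, the sketch below shows one way a denoising k-sparse autoencoder could be set up in PyTorch. It is a simplified illustration rather than the authors' exact architecture: the layer sizes, noise level, and hard top-k constraint are assumptions chosen for exposition, and the published variant may enforce sparsity differently.

```python
import torch
import torch.nn as nn

class DenoisingKSparseAutoencoder(nn.Module):
    """Illustrative denoising k-sparse autoencoder (not the authors' exact variant).

    A dense input embedding is corrupted with noise, projected into a larger
    hidden space, and all but the k largest hidden activations are zeroed out.
    The decoder must reconstruct the clean embedding from this sparse code.
    """

    def __init__(self, input_dim=300, hidden_dim=1000, k=50, noise_std=0.2):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, input_dim)
        self.k = k
        self.noise_std = noise_std

    def encode(self, x):
        h = torch.relu(self.encoder(x))          # non-negative hidden code
        topk = torch.topk(h, self.k, dim=-1)     # keep only the k largest activations
        return torch.zeros_like(h).scatter_(-1, topk.indices, topk.values)

    def forward(self, x):
        noisy_x = x + self.noise_std * torch.randn_like(x)  # denoising corruption
        sparse_h = self.encode(noisy_x)
        return self.decoder(sparse_h), sparse_h
```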
The method starts from existing pre-trained word embeddings produced by state-of-the-art methods such as GloVe and word2vec. Passing these through the denoising k-sparse autoencoder yields new embeddings that retain the informational content of the originals while being substantially easier to interpret.
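Continuing the sketch above, the snippet below illustrates how such an autoencoder might be applied to pre-trained vectors: the dense embeddings are loaded, the model is trained to reconstruct them from corrupted inputs, and the sparse hidden codes are then taken as the new embeddings. The file path, batch size, and training schedule are placeholders, not values from the paper.

```python
import numpy as np
import torch
import torch.nn.functional as F

def load_glove(path="glove.6B.300d.txt"):
    """Load GloVe vectors from a plain-text file (hypothetical local path)."""
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            words.append(parts[0])
            vecs.append(np.asarray(parts[1:], dtype=np.float32))
    return words, torch.tensor(np.stack(vecs))

words, X = load_glove()
model = DenoisingKSparseAutoencoder(input_dim=X.shape[1])  # class from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):                                   # illustrative schedule
    for batch in torch.split(X[torch.randperm(len(X))], 512):
        recon, _ = model(batch)
        loss = F.mse_loss(recon, batch)                    # reconstruct the clean input
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# The sparse hidden codes serve as the new, more interpretable embeddings.
with torch.no_grad():
    sparse_embeddings = model.encode(X)
```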
Notably, the paper places human judgment at the center of its evaluation. Through large-scale human evaluations, the authors show that the embeddings produced by their method are markedly more interpretable than the original GloVe and word2vec embeddings: evaluators found it much easier to identify the coherent concepts captured by individual dimensions of the sparse representations.
Beyond interpretability, the sparse embeddings matched or even surpassed the performance of the original embeddings on a diverse suite of benchmark downstream tasks. This finding suggests that enhancing interpretability need not come at the cost of performance, pointing to a promising direction for future research on neural representations.
The implications of this work are substantial for many applications of machine learning and natural language processing. Interpretable word embeddings can lead to more transparent models, which is invaluable in fields such as healthcare and finance, and in any domain where understanding a model's decisions is as important as the decisions themselves.
In summary, "SPINE" contributes a significant advancement in creating interpretable neural embeddings, providing a balanced solution that does not compromise on effectiveness while enhancing the understandability of the learned representations.