Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding

Published 22 Aug 2018 in cs.LG, cs.CL, and stat.ML | (1808.07383v1)

Abstract: In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying the dynamic routing of capsule networks (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods on the Stanford Natural Language Inference (SNLI) dataset with the fewest parameters, while showing comparable results on the Stanford Sentiment Treebank (SST) dataset.
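
The idea of attending to words with a weight vector that is refined iteratively, rather than computed once, can be illustrated with a small sketch. This is not the paper's implementation: it assumes a routing-by-agreement loop loosely modeled on dynamic routing (Sabour et al., 2017), with a tanh squashing step and a dot-product agreement update chosen here for illustration. The names `softmax`, `dynamic_self_attention`, and `iterations` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_self_attention(words, iterations=2):
    """Illustrative routing-style attention (an assumption, not the paper's code).

    words: list of n word vectors, each a list of d floats.
    Returns (sentence_vector, attention_weights), where the weights are the
    ones used to build the returned sentence vector.
    """
    n, d = len(words), len(words[0])
    q = [0.0] * n  # one routing logit per word, starting uniform
    for _ in range(iterations):
        # Attention weights over words from the current logits.
        a = softmax(q)
        # Weighted sum of word vectors, then squash with tanh (assumed).
        s = [sum(a[i] * words[i][k] for i in range(n)) for k in range(d)]
        z = [math.tanh(v) for v in s]
        # Raise the logit of each word in proportion to its agreement
        # (dot product) with the current sentence vector.
        q = [q[i] + sum(words[i][k] * z[k] for k in range(d)) for i in range(n)]
    return z, a


if __name__ == "__main__":
    toy_words = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
    sentence, weights = dynamic_self_attention(toy_words, iterations=2)
    print(sentence, weights)
```

In this toy input, the two words that agree with each other end up with larger weights than the third after the second iteration, which is the sense in which the weight vector is "dynamic": it depends on the sentence vector computed in the previous pass.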