Exposing the Functionalities of Neurons for Gated Recurrent Unit Based Sequence-to-Sequence Model (2303.15072v1)

Published 27 Mar 2023 in cs.NE, cs.AI, and cs.LG

Abstract: The goal of this paper is to report scientific discoveries about a Seq2Seq model. Analyzing the behavior of RNN-based models at the neuron level is considered more challenging than analyzing DNN or CNN models because of their inherently recursive mechanism. This paper provides a neuron-level analysis to explain why a vanilla GRU-based Seq2Seq model without attention can achieve token positioning. We found four types of neurons (storing, counting, triggering, and outputting) and further uncover the mechanism by which these neurons work together to produce the right token in the right position.