Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition (2106.09236v1)

Published 17 Jun 2021 in cs.SD and eess.AS

Abstract: End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, Transformer and Conformer have achieved state-of-the-art recognition accuracy, in which self-attention plays a vital role in capturing important global information. However, the time and memory complexity of self-attention grows quadratically with the sequence length. In this paper, a prob-sparse self-attention mechanism is introduced into Conformer to sparsify the computation of self-attention, in order to accelerate inference and reduce memory consumption. Specifically, we adopt a Kullback-Leibler divergence based sparsity measurement for each query to decide whether to compute the attention function on that query. Using the prob-sparse attention mechanism, we achieve an 8% to 45% inference speed-up and a 15% to 45% memory usage reduction in the self-attention module of the Conformer Transducer while maintaining the same level of error rate.
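For illustration, the sketch below shows the query-selection idea behind prob-sparse attention. It assumes the KL-based sparsity measurement takes the max-mean approximation used in Informer-style prob-sparse attention; the function names and the top-u parameter are illustrative, and the sketch computes the full score matrix for clarity, so it does not reproduce the paper's actual speed or memory savings.

```python
import numpy as np

def prob_sparse_query_selection(Q, K, top_u):
    """Rank queries by the max-mean sparsity measurement, an approximation of the
    KL divergence between a query's attention distribution and the uniform one."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)              # (L_q, L_k) scaled dot products
    # M(q_i, K) = max_j s_ij - mean_j s_ij: large M means a "peaked" (informative) query
    M = scores.max(axis=-1) - scores.mean(axis=-1)
    active = np.argsort(-M)[:top_u]            # indices of the top-u active queries
    return active, scores

def prob_sparse_attention(Q, K, V, top_u):
    """Compute full softmax attention only for the selected queries; the remaining
    queries fall back to the mean of V (the output under uniform attention)."""
    active, scores = prob_sparse_query_selection(Q, K, top_u)
    out = np.tile(V.mean(axis=0), (Q.shape[0], 1))   # default output: mean of values
    s = scores[active]
    attn = np.exp(s - s.max(axis=-1, keepdims=True)) # numerically stable softmax
    attn /= attn.sum(axis=-1, keepdims=True)
    out[active] = attn @ V
    return out

# Example: 100 positions, 64-dim head, attention computed for only 25 active queries
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((100, 64)) for _ in range(3))
print(prob_sparse_attention(Q, K, V, top_u=25).shape)   # (100, 64)
```

In an efficient implementation the measurement itself is estimated from a sampled subset of keys, so that only the selected queries ever form full attention rows; that sampling step is omitted here for brevity.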

Authors (4)
  1. Xiong Wang (52 papers)
  2. Sining Sun (17 papers)
  3. Lei Xie (337 papers)
  4. Long Ma (116 papers)
Citations (15)
