Papers
Topics
Authors
Recent
2000 character limit reached

Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers

Published 9 Jun 2020 in eess.AS, cs.CL, and cs.SD | (2006.05174v2)

Abstract: In this paper, we seek solutions for reducing the computation complexity of transformer-based models for speech representation learning. We evaluate 10 attention algorithms; then, we pre-train the transformer-based model with those attention algorithms in a self-supervised fashion and treat them as feature extractors on downstream tasks, including phoneme classification and speaker classification. With the assistance of t-SNE, PCA and some observation, the attention weights in self-supervised audio transformers can be categorized into four general cases. Based on these cases and some analyses, we are able to use a specific set of attention weights to initialize the model. Our approach shows comparable performance to the typical self-attention yet requires 20% less time in both training and inference.

Citations (1)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.