Transformers in Speech Processing: A Survey (2303.11607v1)

Published 21 Mar 2023 in cs.CL, cs.SD, and eess.AS

Abstract: The remarkable success of transformers in the field of natural language processing has sparked the interest of the speech-processing community, leading to an exploration of their potential for modeling long-range dependencies within speech sequences. Recently, transformers have gained prominence across various speech-related domains, including automatic speech recognition, speech synthesis, speech translation, speech para-linguistics, speech enhancement, spoken dialogue systems, and numerous multimodal applications. In this paper, we present a comprehensive survey that aims to bridge research studies from diverse subfields within speech technology. By consolidating findings from across the speech technology landscape, we provide a valuable resource for researchers interested in harnessing the power of transformers to advance the field. We identify the challenges encountered by transformers in speech processing while also offering insights into potential solutions to address these issues.

Authors (6)
  1. Siddique Latif (38 papers)
  2. Aun Zaidi (3 papers)
  3. Fahad Shamshad (21 papers)
  4. Moazzam Shoukat (2 papers)
  5. Junaid Qadir (110 papers)
  6. Heriberto Cuayahuitl (15 papers)
Citations (41)

