
On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding (2210.05291v1)

Published 11 Oct 2022 in cs.CL, cs.SD, and eess.AS

Abstract: In this paper we examine the use of semantically-aligned speech representations for end-to-end spoken language understanding (SLU). We employ the recently introduced SAMU-XLSR model, which is designed to generate a single embedding that captures utterance-level semantics and is semantically aligned across different languages. This model combines the acoustic frame-level speech representation learning model (XLS-R) with the Language Agnostic BERT Sentence Embedding (LaBSE) model. We show that using the SAMU-XLSR model instead of the initial XLS-R model significantly improves performance in the end-to-end SLU framework. Finally, we present the benefits of this model for language portability in SLU.
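The core idea the abstract describes is distilling frame-level speech features into a single utterance embedding aligned with a multilingual text embedding: XLS-R frame outputs are pooled and projected, then trained to match the LaBSE sentence embedding of the transcript. A minimal sketch of that objective, using random NumPy arrays as stand-ins for the real model outputs (the dimensions, mean pooling, and `cosine_distillation_loss` name are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np

def pool_and_project(frames, W):
    """Mean-pool frame-level speech features and linearly project them
    into the sentence-embedding space (illustrative SAMU-XLSR-style head;
    the paper's pooling and projection details may differ)."""
    pooled = frames.mean(axis=0)   # (d_in,) utterance-level vector
    return W @ pooled              # (d_out,) projected embedding

def cosine_distillation_loss(speech_emb, text_emb):
    """1 - cosine similarity: zero when the speech embedding points
    in the same direction as the target text embedding."""
    s = speech_emb / np.linalg.norm(speech_emb)
    t = text_emb / np.linalg.norm(text_emb)
    return 1.0 - float(s @ t)

rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 1024))      # stand-in for XLS-R frame outputs
W = rng.standard_normal((768, 1024)) * 0.01   # stand-in projection matrix
text_emb = rng.standard_normal(768)           # stand-in for a LaBSE embedding

emb = pool_and_project(frames, W)
loss = cosine_distillation_loss(emb, text_emb)
```

Because the loss depends only on direction, a speech embedding perfectly aligned with the text embedding yields a loss of zero, which is what makes the resulting space usable across languages that LaBSE already aligns.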

Authors (5)
  1. Valentin Pelloin (5 papers)
  2. Themos Stafylakis (35 papers)
  3. Yannick Estève (45 papers)
  4. Gaëlle Laperrière (4 papers)
  5. Mickaël Rouvier (1 paper)
Citations (9)
