
You don't understand me!: Comparing ASR results for L1 and L2 speakers of Swedish (2405.13379v1)

Published 22 May 2024 in cs.CL, cs.SD, and eess.AS

Abstract: The performance of Automatic Speech Recognition (ASR) systems has steadily improved with state-of-the-art development. However, performance tends to decrease considerably in more challenging conditions (e.g., background noise, multi-speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers or people with speech disorders), which indicates that general improvements do not necessarily transfer to applications that rely on ASR, e.g., educational software for younger students or language learners. In this study, we focus on the gap in performance between recognition results for native and non-native, read and spontaneous, Swedish utterances transcribed by different ASR services. We compare the recognition results using Word Error Rate and analyze the linguistic factors that may generate the observed transcription errors.
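
For readers unfamiliar with the metric named in the abstract, Word Error Rate is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) between an ASR hypothesis and a reference transcript, divided by the number of reference words. The following is a minimal sketch under that standard definition, not the authors' evaluation code; the function name `word_error_rate`, the lowercase-and-split normalization, and the Swedish example sentences are assumptions made here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: text is normalized only by lowercasing and whitespace
    splitting; the paper's actual normalization is not stated in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example sentences, not taken from the paper's data.
print(word_error_rate("jag förstår inte dig", "jag förstår dig"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.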

Authors (4)
  1. Ronald Cumbal (2 papers)
  2. Olof Engwall (1 paper)
  3. Birger Moell (10 papers)
  4. Jose Lopes (1 paper)
Citations (20)
