Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
60 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Leveraging Large Language Models for Exploiting ASR Uncertainty (2309.04842v2)

Published 9 Sep 2023 in cs.CL, cs.HC, cs.SD, and eess.AS

Abstract: While LLMs excel in a variety of NLP tasks, to perform well on spoken language understanding (SLU) tasks, they must either rely on off-the-shelf automatic speech recognition (ASR) systems for transcription, or be equipped with an in-built speech modality. This work focuses on the former scenario, where LLM's accuracy on SLU tasks is constrained by the accuracy of a fixed ASR system on the spoken input. Specifically, we tackle speech-intent classification task, where a high word-error-rate can limit the LLM's ability to understand the spoken intent. Instead of chasing a high accuracy by designing complex or specialized architectures regardless of deployment costs, we seek to answer how far we can go without substantially changing the underlying ASR and LLM, which can potentially be shared by multiple unrelated tasks. To this end, we propose prompting the LLM with an n-best list of ASR hypotheses instead of only the error-prone 1-best hypothesis. We explore prompt-engineering to explain the concept of n-best lists to the LLM; followed by the finetuning of Low-Rank Adapters on the downstream tasks. Our approach using n-best lists proves to be effective on a device-directed speech detection task as well as on a keyword spotting task, where systems using n-best list prompts outperform those using 1-best ASR hypothesis; thus paving the way for an efficient method to exploit ASR uncertainty via LLMs for speech-based applications.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (7)
  1. Pranay Dighe (14 papers)
  2. Yi Su (70 papers)
  3. Shangshang Zheng (2 papers)
  4. Yunshu Liu (5 papers)
  5. Vineet Garg (11 papers)
  6. Xiaochuan Niu (5 papers)
  7. Ahmed Tewfik (25 papers)
Citations (11)
X Twitter Logo Streamline Icon: https://streamlinehq.com