Large-scale Language Model Rescoring on Long-form Data (2306.08133v2)

Published 13 Jun 2023 in eess.AS and cs.CL

Abstract: In this work, we study the impact of large-scale language models (LLMs) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR. We demonstrate up to 8% relative reduction in Word Error Rate (WER) on US English (en-us) and code-switched Indian English (en-in) long-form ASR test sets, and a reduction of up to 30% relative in Salient Term Error Rate (STER), over a strong first-pass baseline that uses a maximum-entropy based language model. Improved lattice processing that results in a lattice with a proper (non-tree) digraph topology, together with carrying context from the 1-best hypothesis of the previous segment(s), results in significant wins in rescoring with LLMs. We also find that the gains in performance from combining LLMs trained on vast quantities of available data (such as C4) with conventional neural LMs are additive, and that the combination significantly outperforms a strong first-pass baseline with a maximum entropy LM. Copyright 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Authors (11)
  1. Tongzhou Chen (7 papers)
  2. Cyril Allauzen (13 papers)
  3. Yinghui Huang (13 papers)
  4. Daniel Park (10 papers)
  5. David Rybach (19 papers)
  6. W. Ronny Huang (25 papers)
  7. Rodrigo Cabrera (3 papers)
  8. Kartik Audhkhasi (22 papers)
  9. Bhuvana Ramabhadran (47 papers)
  10. Pedro J. Moreno (8 papers)
  11. Michael Riley (16 papers)
Citations (13)