
ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks (2207.01893v1)

Published 5 Jul 2022 in cs.CL

Abstract: We aim at improving spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR on 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or by training an LM from scratch. The new models (FlauBERT-Oral) are shared with the community and evaluated on 3 downstream tasks: spoken language understanding, classification of TV shows and speech syntactic parsing. Results show that FlauBERT-Oral can be beneficial compared to its initial FlauBERT version, demonstrating that, despite its inherent noisy nature, ASR-generated text can be used to build spoken language models.
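
For readers who want a concrete picture of the adaptation recipe the abstract describes, here is a minimal sketch: continued masked-LM pre-training of FlauBERT on a corpus of ASR transcripts, written with Hugging Face transformers. The flaubert/flaubert_base_cased checkpoint name is a real Hub model, but the transcript file path, sequence length, and training hyperparameters below are illustrative assumptions, not the paper's actual settings.

```python
# Sketch: adapt FlauBERT to spoken language by continuing masked-LM
# pre-training on ASR-generated text (one transcribed segment per line).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("flaubert/flaubert_base_cased")
model = AutoModelForMaskedLM.from_pretrained("flaubert/flaubert_base_cased")

# Hypothetical file of ASR transcripts; in the paper this would be the
# 19GB of text obtained from the INA collection.
dataset = load_dataset("text", data_files={"train": "asr_transcripts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective: randomly mask 15% of tokens and predict them.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="flaubert-oral-sketch",
        per_device_train_batch_size=16,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

The resulting checkpoint can then be fine-tuned on the downstream tasks evaluated in the paper (spoken language understanding, TV show classification, speech syntactic parsing) in the usual way; training an LM from scratch on the same data differs only in starting from a randomly initialized model rather than the pre-trained checkpoint.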

Authors (7)
  1. Valentin Pelloin (5 papers)
  2. Franck Dary (2 papers)
  3. Nicolas Herve (26 papers)
  4. Benoit Favre (9 papers)
  5. Nathalie Camelin (4 papers)
  6. Antoine Laurent (22 papers)
  7. Laurent Besacier (76 papers)
Citations (5)
