
Less Peaky and More Accurate CTC Forced Alignment by Label Priors (2406.02560v3)

Published 22 Apr 2024 in eess.AS, cs.AI, cs.CL, and cs.LG

Abstract: Connectionist temporal classification (CTC) models are known to have peaky output distributions. Such behavior is not a problem for automatic speech recognition (ASR), but it can cause inaccurate forced alignments (FA), especially at finer granularity, e.g., phoneme level. This paper aims to alleviate the peaky behavior of CTC and improve its suitability for forced alignment generation by leveraging label priors, so that the scores of alignment paths containing fewer blanks are boosted and maximized during training. As a result, our CTC model produces less peaky posteriors and is able to more accurately predict the offsets of tokens in addition to their onsets. It outperforms the standard CTC model and a heuristics-based approach for obtaining CTC's token offset timestamps by 12-40% in phoneme and word boundary errors (PBE and WBE) measured on the Buckeye and TIMIT data. Compared with the most widely used FA toolkit, Montreal Forced Aligner (MFA), our method performs similarly on PBE/WBE on Buckeye, yet falls behind MFA on TIMIT. Nevertheless, our method has a much simpler training pipeline and better runtime efficiency. Our training recipe and pretrained model are released in TorchAudio.
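
The abstract does not spell out how label priors enter the computation. The sketch below illustrates one plausible reading, under stated assumptions: priors are estimated by averaging per-frame posteriors over time, and a scaled log prior is subtracted from each frame's log posteriors before they would be passed to a CTC loss. The function names, the scaling factor `alpha`, and the toy posteriors are all hypothetical and are not taken from the paper or from TorchAudio.

```python
import math


def estimate_log_priors(log_posteriors):
    """Average per-frame posteriors over time; return one log prior per label.

    Assumption: priors come from averaged model outputs (illustrative only).
    """
    num_frames = len(log_posteriors)
    num_labels = len(log_posteriors[0])
    priors = [
        sum(math.exp(frame[k]) for frame in log_posteriors) / num_frames
        for k in range(num_labels)
    ]
    return [math.log(p) for p in priors]


def apply_label_priors(log_posteriors, log_priors, alpha=0.3):
    """Subtract alpha * log prior from every frame's scores.

    The result is not renormalized; alpha is a hypothetical scaling factor.
    """
    return [
        [lp - alpha * prior for lp, prior in zip(frame, log_priors)]
        for frame in log_posteriors
    ]


# Toy "peaky" output: label 0 is the blank and dominates half of the frames.
posteriors = [
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.90, 0.05, 0.05],
    [0.20, 0.10, 0.70],
]
log_posteriors = [[math.log(p) for p in frame] for frame in posteriors]

log_priors = estimate_log_priors(log_posteriors)
scaled = apply_label_priors(log_posteriors, log_priors, alpha=0.3)

# Margin of label 1 over the blank, before and after prior scaling.
for t, (before, after) in enumerate(zip(log_posteriors, scaled)):
    print(t, round(before[1] - before[0], 3), round(after[1] - after[0], 3))
```

Because the blank has the largest estimated prior in this toy example, subtracting the scaled log prior penalizes it more than the other labels, so every non-blank label gains ground on the blank in every frame. That is the sense in which paths with fewer blanks would be favored; the actual training recipe may differ in how the priors are estimated and applied.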

Authors (12)
  1. Ruizhe Huang (5 papers)
  2. Xiaohui Zhang (105 papers)
  3. Zhaoheng Ni (32 papers)
  4. Li Sun (135 papers)
  5. Moto Hira (6 papers)
  6. Jeff Hwang (5 papers)
  7. Vimal Manohar (15 papers)
  8. Vineel Pratap (18 papers)
  9. Matthew Wiesner (32 papers)
  10. Shinji Watanabe (416 papers)
  11. Daniel Povey (45 papers)
  12. Sanjeev Khudanpur (74 papers)
Citations (3)