Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model (1910.10697v1)
Abstract: In this work, we introduce a simple yet efficient post-processing model for automatic speech recognition (ASR). Our model has a Transformer-based encoder-decoder architecture which "translates" ASR model output into grammatically and semantically correct text. We investigate different strategies for regularizing and optimizing the model and show that extensive data augmentation and initialization with pre-trained weights are required to achieve good performance. On the LibriSpeech benchmark, our method demonstrates significant improvement in word error rate over the baseline acoustic model with greedy decoding, especially on the much noisier dev-other and test-other portions of the evaluation dataset. Our model also outperforms the baseline with 6-gram language model re-scoring and approaches the performance of re-scoring with a Transformer-XL neural language model.
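The abstract describes a standard encoder-decoder Transformer trained to map a noisy ASR hypothesis to corrected text, in the style of machine translation. Below is a minimal sketch of that setup in PyTorch; the class name, vocabulary size, sequence length, and hyperparameters are illustrative assumptions, not the paper's configuration, and the sketch omits the data augmentation and pre-trained-weight initialization the paper reports as essential.

```python
import torch
import torch.nn as nn

class ASRCorrector(nn.Module):
    """Hypothetical sketch: a seq2seq Transformer that encodes a noisy
    ASR hypothesis and decodes grammatically corrected text."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6,
                 max_len=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions (assumption)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def _embed(self, tokens):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        return self.embed(tokens) + self.pos(positions)

    def forward(self, src_tokens, tgt_tokens):
        # Causal mask so the decoder attends only to earlier target tokens.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(
            tgt_tokens.size(1)).to(src_tokens.device)
        hidden = self.transformer(
            self._embed(src_tokens), self._embed(tgt_tokens),
            tgt_mask=tgt_mask)
        return self.out(hidden)  # (batch, tgt_len, vocab_size) logits

# Usage sketch: teacher-forced training on (ASR hypothesis, reference) pairs.
model = ASRCorrector(vocab_size=10000)
src = torch.randint(0, 10000, (2, 20))  # tokenized ASR output (dummy data)
tgt = torch.randint(0, 10000, (2, 20))  # shifted reference text (dummy data)
logits = model(src, tgt)                # -> shape (2, 20, 10000)
```

In practice such a corrector is trained with cross-entropy against the reference transcript and applied after the acoustic model's greedy decode, which is how it can substitute for n-gram or neural LM re-scoring.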