R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation (2401.05700v1)

Published 11 Jan 2024 in cs.CL and cs.AI

Abstract: Incremental Decoding is an effective framework that enables the use of an offline model in a simultaneous setting without modifying the original model, making it suitable for Low-Latency Simultaneous Speech Translation. However, this framework may introduce errors when the system produces output from incomplete input. To reduce these output errors, several strategies such as Hold-$n$, LA-$n$, and SP-$n$ can be employed, but the hyper-parameter $n$ needs to be carefully selected for optimal performance. Moreover, these strategies are more suitable for end-to-end systems than for cascade systems. In our paper, we propose a new adaptable and efficient policy named "Regularized Batched Inputs". Our method stands out by enhancing input diversity to mitigate output errors. We suggest particular regularization techniques for both end-to-end and cascade systems. We conducted experiments on IWSLT Simultaneous Speech Translation (SimulST) tasks, which demonstrate that our approach achieves low latency while losing no more than 2 BLEU points compared to offline systems. Furthermore, our SimulST systems attained several new state-of-the-art results in various language directions.
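
For readers unfamiliar with the baseline strategies the abstract names, the following is a minimal sketch of Hold-$n$ and LA-$n$ as they are commonly described in the incremental decoding literature. It is not the R-BI policy and is not taken from the paper's code. The function names `hold_n`, `la_n`, and `common_prefix` are hypothetical, hypotheses are assumed to be plain token lists, and SP-$n$ is omitted. It illustrates why $n$ matters: a larger $n$ withholds or demands agreement on more tokens, trading latency for stability.

```python
from typing import List, Sequence


def hold_n(hypothesis: Sequence[str], n: int) -> List[str]:
    """Hold-n (as assumed here): withhold the last n tokens of the
    current best hypothesis and emit only the remaining prefix."""
    if n <= 0:
        return list(hypothesis)
    return list(hypothesis[:-n])


def common_prefix(hypotheses: Sequence[Sequence[str]]) -> List[str]:
    """Longest token prefix shared by every hypothesis in the list."""
    prefix: List[str] = []
    for tokens in zip(*hypotheses):
        if all(t == tokens[0] for t in tokens):
            prefix.append(tokens[0])
        else:
            break
    return prefix


def la_n(history: Sequence[Sequence[str]], n: int) -> List[str]:
    """LA-n (local agreement, as assumed here): emit the prefix on which
    the hypotheses from the last n input chunks all agree."""
    if n <= 0 or len(history) < n:
        return []  # not enough chunks seen yet, so emit nothing
    return common_prefix(history[-n:])


if __name__ == "__main__":
    # Hypotheses produced after two successive audio chunks (made-up data).
    chunk_hyps = [["the", "cat", "sat"], ["the", "cat", "is", "on"]]
    print(hold_n(chunk_hyps[-1], 2))
    print(la_n(chunk_hyps, 2))
```

In both functions the emitted prefix is what the system would commit to before more input arrives; everything after it remains open to revision.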

Authors (9)

Jiaxin Guo
Zhanglin Wu
Zongyao Li
Hengchao Shang
Daimeng Wei
Xiaoyu Chen
Zhiqiang Rao
Shaojun Li
Hao Yang
