
Recent Advances in End-to-End Simultaneous Speech Translation (2406.00497v2)

Published 1 Jun 2024 in cs.SD, cs.AI, cs.CL, and eess.AS

Abstract: Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles. Secondly, satisfying real-time requirements presents inherent difficulties due to the need for immediate translation output. Thirdly, striking a balance between translation quality and latency constraints remains a critical challenge. Finally, the scarcity of annotated data adds another layer of complexity to the task. Through our exploration of these challenges and the proposed solutions, we aim to provide valuable insights into the current landscape of SimulST research and suggest promising directions for future exploration.

Authors (8)
  1. Xiaoqian Liu (24 papers)
  2. Guoqiang Hu (47 papers)
  3. Yangfan Du (2 papers)
  4. Erfeng He (2 papers)
  5. Chen Xu (186 papers)
  6. Tong Xiao (119 papers)
  7. Jingbo Zhu (79 papers)
  8. Yingfeng Luo (9 papers)
Citations (1)