
A Survey on Recent Deep Learning-driven Singing Voice Synthesis Systems (2110.02511v1)

Published 6 Oct 2021 in eess.AS

Abstract: Singing voice synthesis (SVS) is a task that aims to generate audio signals according to musical scores and lyrics. Because the task spans both music and language, producing singing voices indistinguishable from those of human singers has long remained an unfulfilled pursuit. Nonetheless, advances in deep learning techniques have brought about a substantial leap in the quality and naturalness of synthesized singing voices. This paper reviews some of the state-of-the-art deep learning-driven SVS systems. We summarize their model architectures and identify the strengths and limitations of each of the introduced systems. We thereby trace the recent trajectory of the field and outline the challenges left to be resolved in both commercial applications and academic research.

Authors (6)
  1. Yin-Ping Cho (2 papers)
  2. Fu-Rong Yang (2 papers)
  3. Yung-Chuan Chang (2 papers)
  4. Ching-Ting Cheng (1 paper)
  5. Xiao-Han Wang (3 papers)
  6. Yi-Wen Liu (29 papers)
Citations (17)
