
The AS-NU System for the M2VoC Challenge (2104.03009v1)

Published 7 Apr 2021 in eess.AS, cs.LG, and cs.SD

Abstract: This paper describes the AS-NU systems for the two tracks of the Multi-Speaker Multi-Style Voice Cloning Challenge (M2VoC). The first track focuses on voice cloning with 100 target utterances, while the second track focuses on voice cloning with only 5 target utterances. Because data are severely limited in the second track, we selected the speaker most similar to the target speaker from the training data of the TTS system, and used that speaker's utterances together with the given 5 target utterances to fine-tune our model. The evaluation results show that our systems for the two tracks perform similarly in terms of quality, but there is still a clear gap between the similarity scores of the second track and those of the first track.
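
The speaker-selection step for the second track can be illustrated with a minimal sketch. The abstract does not say how similarity is measured, so everything here is an assumption: each speaker is represented by a fixed-length embedding vector, the target speaker is summarized by the mean of its 5 utterance embeddings, and cosine similarity picks the closest training speaker. The function names, speaker labels, and toy vectors are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_embedding(embeddings):
    """Average a list of utterance-level embeddings dimension by dimension."""
    n = len(embeddings)
    return [sum(dim) / n for dim in zip(*embeddings)]


def most_similar_speaker(target_utterance_embeddings, training_speaker_embeddings):
    """Return the training speaker whose embedding is closest to the target.

    Assumption: similarity is cosine similarity between embeddings; the
    paper's abstract does not specify the actual criterion.
    """
    target = mean_embedding(target_utterance_embeddings)
    return max(
        training_speaker_embeddings,
        key=lambda spk: cosine_similarity(target, training_speaker_embeddings[spk]),
    )


# Hypothetical toy data: 5 target utterance embeddings (as in track 2)
# and three training speakers with made-up 3-dimensional embeddings.
target_utts = [
    [0.9, 0.1, 0.0],
    [1.0, 0.0, 0.1],
    [0.8, 0.2, 0.0],
    [0.9, 0.0, 0.1],
    [1.0, 0.1, 0.0],
]
train_speakers = {
    "spk_a": [0.0, 1.0, 0.0],
    "spk_b": [1.0, 0.1, 0.05],
    "spk_c": [0.0, 0.0, 1.0],
}

best = most_similar_speaker(target_utts, train_speakers)
```

Under these assumptions, the utterances of the selected speaker would then be pooled with the 5 target utterances to form the fine-tuning set, as the abstract describes.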

Authors (9)
  1. Cheng-Hung Hu (6 papers)
  2. Yi-Chiao Wu (42 papers)
  3. Wen-Chin Huang (53 papers)
  4. Yu-Huai Peng (13 papers)
  5. Yu-Wen Chen (16 papers)
  6. Pin-Jui Ku (7 papers)
  7. Tomoki Toda (106 papers)
  8. Yu Tsao (200 papers)
  9. Hsin-Min Wang (97 papers)
Citations (1)