Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion (1811.11078v2)

Published 27 Nov 2018 in eess.AS, cs.CL, and cs.SD

Abstract: This paper presents a refinement framework of WaveNet vocoders for variational autoencoder (VAE) based voice conversion (VC), which reduces the quality distortion caused by the mismatch between the training data and testing data. Conventional WaveNet vocoders are trained with natural acoustic features but conditioned on the converted features in the conversion stage for VC, and such a mismatch often causes significant quality and similarity degradation. In this work, we take advantage of the particular structure of VAEs to refine WaveNet vocoders with the self-reconstructed features generated by the VAE, which share similar characteristics with the converted features while having the same temporal structure as the target natural features. We analyze these features and show that the self-reconstructed features are similar to the converted features. Objective and subjective experimental results demonstrate the effectiveness of our proposed framework.
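The refinement idea described in the abstract can be illustrated with a minimal sketch: instead of fine-tuning the vocoder on natural acoustic features, the vocoder is conditioned on features the VAE self-reconstructs from the target speaker's own utterances, which are time-aligned with the natural waveform yet statistically closer to the converted features seen at conversion time. The code below is a toy illustration in PyTorch, not the authors' implementation: the VAE, the vocoder, the loss, and all dimensions are placeholder assumptions standing in for the paper's actual WaveNet vocoder and VAE-based VC model.

```python
# Minimal sketch of vocoder refinement with VAE self-reconstructed features.
# Assumptions: PyTorch; ToyVAE and ToyVocoder are placeholders for the paper's
# frame-wise VAE and conditional WaveNet vocoder; the L1 loss stands in for the
# WaveNet sample-level likelihood; feature/latent dimensions are illustrative.
import torch
import torch.nn as nn

FEAT_DIM, LATENT_DIM, HOP = 40, 16, 80  # illustrative sizes, not the paper's settings

class ToyVAE(nn.Module):
    """Placeholder frame-wise VAE over acoustic features."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(FEAT_DIM, 2 * LATENT_DIM)
        self.dec = nn.Linear(LATENT_DIM, FEAT_DIM)

    def reconstruct(self, feats):                    # feats: (frames, FEAT_DIM)
        mu, logvar = self.enc(feats).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z)                           # same frame count as the input

class ToyVocoder(nn.Module):
    """Placeholder for a conditional WaveNet vocoder (here a simple frame-to-sample map)."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(FEAT_DIM, HOP)

    def forward(self, feats):                        # (frames, FEAT_DIM) -> (frames * HOP,)
        return self.proj(feats).reshape(-1)

def refine_vocoder(vae, vocoder, natural_feats, natural_wave, steps=100, lr=1e-4):
    """Fine-tune the vocoder conditioned on VAE self-reconstructed features,
    keeping the natural target-speaker waveform as the training target."""
    opt = torch.optim.Adam(vocoder.parameters(), lr=lr)
    for _ in range(steps):
        with torch.no_grad():
            cond = vae.reconstruct(natural_feats)    # self-reconstructed conditioning features
        pred = vocoder(cond)
        loss = nn.functional.l1_loss(pred, natural_wave)  # toy surrogate loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vocoder

if __name__ == "__main__":
    frames = 200
    feats = torch.randn(frames, FEAT_DIM)            # stand-in for extracted target-speaker features
    wave = torch.randn(frames * HOP)                 # stand-in for the aligned natural waveform
    refine_vocoder(ToyVAE(), ToyVocoder(), feats, wave, steps=5)
```

Because the self-reconstructed features pass through the same VAE bottleneck used during conversion, the vocoder sees conditioning inputs during refinement that resemble what it will receive at test time, which is the mismatch the paper sets out to reduce.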

Authors (9)
  1. Wen-Chin Huang (53 papers)
  2. Yi-Chiao Wu (42 papers)
  3. Hsin-Te Hwang (9 papers)
  4. Patrick Lumban Tobing (20 papers)
  5. Tomoki Hayashi (42 papers)
  6. Kazuhiro Kobayashi (19 papers)
  7. Tomoki Toda (106 papers)
  8. Yu Tsao (200 papers)
  9. Hsin-Min Wang (97 papers)
Citations (20)
