Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization (2409.12717v1)

Published 19 Sep 2024 in eess.AS and cs.SD

Abstract: Built upon vector quantization (VQ), discrete audio codec models have achieved great success in audio compression and auto-regressive audio generation. However, existing models face substantial challenges in perceptual quality and signal distortion, especially when operating in extremely low bandwidth, rooted in the sensitivity of the VQ codebook to noise. This degradation poses significant challenges for several downstream tasks, such as codec-based speech synthesis. To address this issue, we propose a novel VQ method, Normal Distribution-based Vector Quantization (NDVQ), by introducing an explicit margin between the VQ codes via learning a variance. Specifically, our approach involves mapping the waveform to a latent space and quantizing it by selecting the most likely normal distribution, with each codebook entry representing a unique normal distribution defined by its mean and variance. Using these distribution-based VQ codec codes, a decoder reconstructs the input waveform. NDVQ is trained with additional distribution-related losses, alongside reconstruction and discrimination losses. Experiments demonstrate that NDVQ outperforms existing audio compression baselines, such as EnCodec, in terms of audio quality and zero-shot TTS, particularly in very low bandwidth scenarios.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (6)
  1. Zhikang Niu (11 papers)
  2. Sanyuan Chen (28 papers)
  3. Long Zhou (57 papers)
  4. Ziyang Ma (73 papers)
  5. Xie Chen (166 papers)
  6. Shujie Liu (101 papers)
Citations (1)

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets