APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding (2402.10533v2)

Published 16 Feb 2024 in cs.SD and eess.AS

Abstract: This paper introduces APCodec, a novel neural audio codec that targets high waveform sampling rates and low bitrates and seamlessly integrates the strengths of parametric and waveform codecs. Like a parametric codec, APCodec encodes and decodes audio by handling the amplitude and phase spectra concurrently as parametric characteristics. It is composed of an encoder and a decoder built on a modified ConvNeXt v2 backbone, connected by a quantizer based on the residual vector quantization (RVQ) mechanism. The encoder compresses the amplitude and phase spectra in parallel, merging them into a continuous latent code at a reduced temporal resolution; this code is then quantized. Finally, the decoder reconstructs the amplitude and phase spectra in parallel, and the decoded waveform is obtained by inverse short-time Fourier transform (ISTFT). To ensure the fidelity of the decoded audio, as in waveform codecs, a spectral-level loss, a quantization loss, and a generative adversarial network (GAN) based loss are jointly employed to train APCodec. To support low-latency streamable inference, APCodec uses feed-forward layers and causal deconvolutional layers, together with a knowledge distillation training strategy that enhances the quality of the decoded audio. Experimental results confirm that APCodec can encode 48 kHz audio at a bitrate of just 6 kbps with no significant degradation in decoded audio quality. At the same bitrate, APCodec also delivers better decoded audio quality and faster generation than well-known codecs such as Encodec, AudioDec, and DAC.
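To make the pipeline concrete, below is a minimal PyTorch sketch of the two mechanisms the abstract describes: representing audio as parallel amplitude and phase spectra via the STFT (with iSTFT synthesis), and residual vector quantization of a latent code. This is not the authors' implementation; the FFT size, hop length, codebook count and size, and the stand-in "latent" are illustrative assumptions, and the paper's actual encoder and decoder are modified ConvNeXt v2 networks trained with spectral, quantization, and GAN losses.

```python
# Hedged sketch of APCodec's building blocks; hyperparameters are assumptions.
import torch

def to_amp_phase(wav, n_fft=1024, hop=320):
    """Waveform -> (log-amplitude, phase) spectra, the parametric view of audio."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(wav, n_fft, hop_length=hop, window=window,
                      return_complex=True)            # shape: (freq, frames)
    log_amp = torch.log(spec.abs() + 1e-5)
    phase = torch.angle(spec)
    return log_amp, phase

def from_amp_phase(log_amp, phase, n_fft=1024, hop=320):
    """(log-amplitude, phase) -> waveform via inverse STFT."""
    window = torch.hann_window(n_fft)
    spec = torch.polar(torch.exp(log_amp), phase)     # rebuild complex spectrum
    return torch.istft(spec, n_fft, hop_length=hop, window=window)

def rvq(latent, codebooks):
    """Residual vector quantization: each stage quantizes the previous stage's
    residual, so the summed codewords approximate the latent ever more closely."""
    residual = latent
    quantized = torch.zeros_like(latent)
    indices = []
    for cb in codebooks:                              # cb: (num_codes, dim)
        d = torch.cdist(residual, cb)                 # distances to all codewords
        idx = d.argmin(dim=-1)                        # nearest codeword per frame
        chosen = cb[idx]
        quantized = quantized + chosen
        residual = residual - chosen
        indices.append(idx)                           # transmitted bitstream symbols
    return quantized, indices

# Toy round trip: analyze, quantize a stand-in latent, resynthesize.
wav = torch.randn(48000)                              # 1 s of 48 kHz noise
log_amp, phase = to_amp_phase(wav)
latent = torch.cat([log_amp, phase], dim=0).T         # (frames, 2*freq) stand-in latent
codebooks = [torch.randn(256, latent.shape[-1]) for _ in range(4)]
quantized, idx = rvq(latent, codebooks)
recon = from_amp_phase(log_amp, phase)                # unquantized reference path
```

In the real codec the latent would come from the learned encoder rather than a spectral concatenation, and the decoder would predict amplitude and phase from the quantized code before the iSTFT step.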

Authors (5)
  1. Yang Ai (41 papers)
  2. Xiao-Hang Jiang (6 papers)
  3. Ye-Xin Lu (17 papers)
  4. Hui-Peng Du (15 papers)
  5. Zhen-Hua Ling (114 papers)
Citations (12)