
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech (2201.13093v1)

Published 31 Jan 2022 in eess.AS and cs.SD

Abstract: The quality of speech coded by transform coding is affected by various artefacts, especially when the bitrates available to quantize the frequency components become too low. To mitigate these coding artefacts and enhance the quality of coded speech, a post-processor that relies on a priori information transmitted from the encoder is traditionally employed at the decoder side. In recent years, several data-driven post-processors have been proposed and shown to outperform traditional approaches. In this paper, we propose PostGAN, a GAN-based neural post-processor that operates in the sub-band domain and relies on the U-Net architecture and a learned affine transform. It has been tested on the recently standardized low-complexity, low-delay Bluetooth codec (LC3) for wideband speech at the lowest bitrate (16 kbit/s). Subjective evaluations and objective scores show that the newly introduced post-processor surpasses previously published methods and can improve the quality of coded speech by around 20 MUSHRA points.

Authors (4)
  1. Srikanth Korse (7 papers)
  2. Nicola Pia (15 papers)
  3. Kishan Gupta (8 papers)
  4. Guillaume Fuchs (11 papers)
Citations (13)
