DRED: Deep REDundancy Coding of Speech Using a Rate-Distortion-Optimized Variational Autoencoder (2212.04453v3)

Published 8 Dec 2022 in eess.AS

Abstract: Despite recent advancements in packet loss concealment (PLC) using deep learning techniques, packet loss remains a significant challenge in real-time speech communication. Redundancy has been used in the past to recover the missing information during losses. However, conventional redundancy techniques are limited in the maximum loss duration they can cover and are often unsuitable for burst packet loss. We propose a new approach based on a rate-distortion-optimized variational autoencoder (RDO-VAE), allowing us to optimize a deep speech compression algorithm for the task of encoding large amounts of redundancy at very low bitrate. The proposed Deep REDundancy (DRED) algorithm can transmit up to 50x redundancy using less than 32 kb/s. Results show that DRED outperforms the existing Opus codec redundancy. We also demonstrate its benefits when operating in the context of WebRTC.
