Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
84 tokens/sec
Gemini 2.5 Pro Premium
49 tokens/sec
GPT-5 Medium
16 tokens/sec
GPT-5 High Premium
19 tokens/sec
GPT-4o
97 tokens/sec
DeepSeek R1 via Azure Premium
77 tokens/sec
GPT OSS 120B via Groq Premium
476 tokens/sec
Kimi K2 via Groq Premium
234 tokens/sec
2000 character limit reached

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec (2502.16240v1)

Published 22 Feb 2025 in eess.AS, cs.AI, cs.LG, and cs.SD

Abstract: Recent advancements in Neural Audio Codec (NAC) models have inspired their use in various speech processing tasks, including speech enhancement (SE). In this work, we propose a novel, efficient SE approach by leveraging the pre-quantization output of a pretrained NAC encoder. Unlike prior NAC-based SE methods, which process discrete speech tokens using LLMs (LMs), we perform SE within the continuous embedding space of the pretrained NAC, which is highly compressed along the time dimension for efficient representation. Our lightweight SE model, optimized through an embedding-level loss, delivers results comparable to SE baselines trained on larger datasets, with a significantly lower real-time factor of 0.005. Additionally, our method achieves a low GMAC of 3.94, reducing complexity 18-fold compared to Sepformer in a simulated cloud-based audio transmission environment. This work highlights a new, efficient NAC-based SE solution, particularly suitable for cloud applications where NAC is used to compress audio before transmission. Copyright 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.