Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
84 tokens/sec
Gemini 2.5 Pro Premium
49 tokens/sec
GPT-5 Medium
16 tokens/sec
GPT-5 High Premium
19 tokens/sec
GPT-4o
97 tokens/sec
DeepSeek R1 via Azure Premium
77 tokens/sec
GPT OSS 120B via Groq Premium
476 tokens/sec
Kimi K2 via Groq Premium
234 tokens/sec
2000 character limit reached

SpectroStream: A Versatile Neural Codec for General Audio (2508.05207v1)

Published 7 Aug 2025 in cs.SD, cs.AI, and eess.AS

Abstract: We propose SpectroStream, a full-band multi-channel neural audio codec. Successor to the well-established SoundStream, SpectroStream extends its capability beyond 24 kHz monophonic audio and enables high-quality reconstruction of 48 kHz stereo music at bit rates of 4--16 kbps. This is accomplished with a new neural architecture that leverages audio representation in the time-frequency domain, which leads to better audio quality especially at higher sample rate. The model also uses a delayed-fusion strategy to handle multi-channel audio, which is crucial in balancing per-channel acoustic quality and cross-channel phase consistency.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube