VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance (2309.06934v1)

Published 13 Sep 2023 in eess.AS and cs.SD

Abstract: Restoring degraded music signals is essential to enhance audio quality for downstream music manipulation. Recent diffusion-based music restoration methods have demonstrated impressive performance, and among them, diffusion posterior sampling (DPS) stands out for its intrinsic properties, which make it versatile across various restoration tasks. In this paper, we identify potential issues that degrade the performance of current DPS-based methods and introduce ways to mitigate them, inspired by diverse diffusion guidance techniques including the RePaint (RP) strategy and Pseudoinverse-Guided Diffusion Models ($\Pi$GDM). We demonstrate our methods on the vocal declipping and bandwidth extension tasks under various levels of distortion and cutoff frequency, respectively. In both tasks, our methods outperform the current DPS-based music restoration benchmarks. See http://carlosholivan.github.io/demos/audio-restoration-2023.html for examples of the restored audio samples.
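
The two restoration tasks named in the abstract each invert a specific degradation: declipping undoes amplitude clipping, and bandwidth extension recovers content above a cutoff frequency. The following is a minimal sketch of those two degradations, not the paper's implementation. It assumes hard clipping at a fixed threshold and an idealized brick-wall low-pass filter applied in the FFT domain; the function names `hard_clip` and `brickwall_lowpass`, the test tones, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def hard_clip(x, threshold):
    """Limit every sample to [-threshold, threshold] (assumed clipping model)."""
    return np.clip(x, -threshold, threshold)


def brickwall_lowpass(x, sample_rate, cutoff_hz):
    """Zero all frequency bins above cutoff_hz (idealized filter, an assumption)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


# Hypothetical one-second test signal: a 220 Hz tone plus a 5 kHz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = 0.6 * np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 5000 * t)

# Degraded observations that a restoration method would take as input.
clipped = hard_clip(signal, threshold=0.5)
band_limited = brickwall_lowpass(signal, sample_rate, cutoff_hz=1000.0)
```

Lowering `threshold` or `cutoff_hz` makes the degradation more severe, which corresponds to the "various levels of distortion and cutoff frequency" the abstract refers to.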

Authors (7)
  1. Carlos Hernandez-Olivan (10 papers)
  2. Koichi Saito (33 papers)
  3. Naoki Murata (29 papers)
  4. Chieh-Hsin Lai (32 papers)
  5. Wei-Hsiang Liao (33 papers)
  6. Yuki Mitsufuji (127 papers)
  7. Marco A. Martínez-Ramirez (1 paper)
Citations (7)
