JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (2308.04729v1)

Published 9 Aug 2023 in cs.SD, cs.AI, cs.LG, cs.MM, and eess.AS

Abstract: Music generation has attracted growing interest with the advancement of deep generative models. However, generating music conditioned on textual descriptions, known as text-to-music, remains challenging due to the complexity of musical structures and high sampling rate requirements. Despite the task's significance, prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization. This paper introduces JEN-1, a universal high-fidelity model for text-to-music generation. JEN-1 is a diffusion model incorporating both autoregressive and non-autoregressive training. Through in-context learning, JEN-1 performs various generation tasks including text-guided music generation, music inpainting, and continuation. Evaluations demonstrate JEN-1's superior performance over state-of-the-art methods in text-music alignment and music quality while maintaining computational efficiency. Our demos are available at http://futureverse.com/research/jen/demos/jen1

Authors (6)
  1. Peike Li (13 papers)
  2. Boyu Chen (30 papers)
  3. Yao Yao (235 papers)
  4. Yikai Wang (78 papers)
  5. Allen Wang (7 papers)
  6. Alex Wang (32 papers)
Citations (29)