Large Language Models: From Notes to Musical Form

Published 18 Apr 2024 in cs.SD and eess.AS | (2404.11976v1)

Abstract: While many aspects of learning-based automated music generation are under active research, musical form remains under-researched. In particular, recent methods based on deep learning models generate music that lacks any structure at the largest time scale. In practice, music longer than one minute generated by such models is either unpleasantly repetitive or directionless. Adapting a recent music generation model, this paper proposes a novel method to generate music with form. The experimental results show that the proposed method can generate 2.5-minute-long music that is considered as pleasant as the music used to train the model. The paper first reviews a recent music generation method based on LLMs (transformer architecture). We then discuss why it is infeasible for such models to learn musical form, and finally present the proposed method and the experiments.