Structure-informed Positional Encoding for Music Generation
Abstract: Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants encoding absolute, relative, and non-stationary positional information, and comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. For comparison, we select multiple baselines from the literature and demonstrate the merits of our methods using several musically motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
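To make the idea concrete, below is a minimal sketch of one plausible reading of the *absolute* variant: instead of indexing sinusoidal encodings by flat token position, positions are derived from the musical hierarchy (here, a bar index and a within-bar beat index), so tokens sharing a structural location receive related encodings. The abstract does not specify the paper's exact formulation; the function names, the bar/beat coordinate scheme, and the half/half embedding split are illustrative assumptions, not the authors' method.

```python
import math
import torch

def sinusoid(positions: torch.Tensor, dim: int) -> torch.Tensor:
    """Standard sinusoidal table, evaluated at arbitrary (here, structural) positions."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    angles = positions[:, None].float() * freqs[None, :]  # (seq_len, half)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

def structure_informed_pe(bar_idx: torch.Tensor, beat_idx: torch.Tensor, dim: int) -> torch.Tensor:
    """Hypothetical structure-informed encoding: half the embedding encodes the
    bar index, the other half the within-bar beat, rather than flat token position."""
    assert dim % 2 == 0
    return torch.cat([sinusoid(bar_idx, dim // 2), sinusoid(beat_idx, dim // 2)], dim=-1)

# Usage: a 6-token sequence spanning two 4-beat bars.
bars = torch.tensor([0, 0, 0, 0, 1, 1])
beats = torch.tensor([0, 1, 2, 3, 0, 1])
pe = structure_informed_pe(bars, beats, dim=64)  # (6, 64), added to token embeddings
```

Under this reading, the downbeat of every bar maps to the same beat encoding, which is one way structural information could regularize long-range organization; the relative and non-stationary variants would instead inject such coordinates into the attention scores.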