Structure-informed Positional Encoding for Music Generation (2402.13301v2)

Published 20 Feb 2024 in cs.SD, cs.AI, and eess.AS

Abstract: Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants in terms of absolute, relative and non-stationary positional information. We comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. As a comparison, we choose multiple baselines from the literature and demonstrate the merits of our methods using several musically-motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
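
The abstract does not spell out how the structural information enters the positional encoding, so the snippet below is only a minimal sketch of what an absolute, structure-informed variant could look like. It assumes the structure is available as per-token bar and beat indices and that their sinusoidal encodings are simply summed with the token-level encoding; the function names and the combination scheme are illustrative assumptions, not the authors' implementation.

```python
import math
import torch

def sinusoidal_encoding(positions: torch.Tensor, dim: int) -> torch.Tensor:
    """Standard sinusoidal encoding (Vaswani et al., 2017) applied to an
    arbitrary integer position sequence of shape (seq_len,). `dim` is assumed even."""
    positions = positions.float().unsqueeze(1)                          # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, dim, 2).float() * (-math.log(10000.0) / dim)
    )                                                                   # (dim/2,)
    pe = torch.zeros(positions.size(0), dim)
    pe[:, 0::2] = torch.sin(positions * div_term)
    pe[:, 1::2] = torch.cos(positions * div_term)
    return pe

def structure_informed_encoding(token_idx, bar_idx, beat_idx, dim):
    """Hypothetical absolute variant: sum the token-level encoding with
    encodings of coarser structural positions (bar and beat), so tokens that
    share a bar or beat receive related positional signals."""
    return (
        sinusoidal_encoding(token_idx, dim)
        + sinusoidal_encoding(bar_idx, dim)
        + sinusoidal_encoding(beat_idx, dim)
    )

# Toy usage: 8 tokens spanning 2 bars of 4 beats each.
token_idx = torch.arange(8)
bar_idx = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
beat_idx = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3])
pe = structure_informed_encoding(token_idx, bar_idx, beat_idx, dim=64)
print(pe.shape)  # torch.Size([8, 64])
```

Summing coarse (bar) and fine (beat, token) position signals is just one way to expose multi-scale structure to the attention layers; the paper's relative and non-stationary variants would instead modify how position differences enter the attention scores.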
