A Survey of Music Generation in the Context of Interaction (2402.15294v1)

Published 23 Feb 2024 in cs.SD, cs.AI, cs.LG, and eess.AS

Abstract: In recent years, machine learning, and in particular generative adversarial neural networks (GANs) and attention-based neural networks (transformers), have been successfully used to compose and generate music, both melodies and polyphonic pieces. Current research focuses primarily on style replication (e.g., generating a Bach-style chorale) or style transfer (e.g., classical to jazz) based on large amounts of recorded or transcribed music, which in turn also allows for fairly straightforward "performance" evaluation. However, most of these models are not suitable for human-machine co-creation through live interaction, nor is it clear how such models and the resulting creations would be evaluated. This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction.
