Unsupervised Lead Sheet Generation via Semantic Compression (2310.10772v1)

Published 16 Oct 2023 in cs.SD, cs.LG, cs.MM, and eess.AS

Abstract: Lead sheets have become commonplace in generative music research, being used as an initial compressed representation for downstream tasks like multitrack music generation and automatic arrangement. Despite this, researchers have often fallen back on deterministic reduction methods (such as the skyline algorithm) to generate lead sheets when seeking paired lead sheets and full scores, with little attention being paid toward the quality of the lead sheets themselves and how they accurately reflect their orchestrated counterparts. To address these issues, we propose the problem of conditional lead sheet generation (i.e. generating a lead sheet given its full score version), and show that this task can be formulated as an unsupervised music compression task, where the lead sheet represents a compressed latent version of the score. We introduce a novel model, called Lead-AE, that models the lead sheets as a discrete subselection of the original sequence, using a differentiable top-k operator to allow for controllable local sparsity constraints. Across both automatic proxy tasks and direct human evaluations, we find that our method improves upon the established deterministic baseline and produces coherent reductions of large multitrack scores.
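The abstract does not spell out how the differentiable top-k operator works. As an illustration only (not the paper's actual Lead-AE implementation), a common continuous relaxation of top-k subset selection applies softmax k times, suppressing already-selected positions in log-space, yielding a soft k-hot mask through which gradients can flow; the function name and parameters below are assumptions:

```python
import torch

def relaxed_topk(scores: torch.Tensor, k: int, temperature: float = 1.0) -> torch.Tensor:
    """Continuous relaxation of top-k selection over the last dimension.

    Returns a soft k-hot mask that sums to exactly k; as temperature -> 0
    the mask approaches a hard indicator of the k highest-scoring entries.
    """
    khot = torch.zeros_like(scores)
    logits = scores.clone()
    for _ in range(k):
        # Soft "pick one" step: a temperature-controlled softmax.
        onehot_approx = torch.softmax(logits / temperature, dim=-1)
        khot = khot + onehot_approx
        # Suppress the mass just selected so the next draw picks a new position
        # (sampling without replacement, relaxed into log-space masking).
        logits = logits + torch.log(torch.clamp(1.0 - onehot_approx, min=1e-20))
    return khot
```

In a setup like the one the abstract describes, such a mask could be applied per local window of note tokens, so that each window keeps roughly k notes — a sketch of how a "controllable local sparsity constraint" might be imposed, under the assumptions above.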

