DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models (2107.14653v1)

Published 30 Jul 2021 in cs.SD, cs.LG, and eess.AS

Abstract: Originating in the Renaissance and burgeoning in the digital era, tablatures are a commonly used music notation system which provides explicit representations of instrument fingerings rather than pitches. GuitarPro has established itself as a widely used tablature format and software enabling musicians to edit and share songs for musical practice, learning, and composition. In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer. The tokenized format is inspired by event-based MIDI encodings, often used in symbolic music generation models. The dataset is released with an encoder/decoder which converts GuitarPro files to tokens and back. We present results of a use case in which DadaGP is used to train a Transformer-based model to generate new songs in GuitarPro format. We discuss other relevant use cases for the dataset (guitar-bass transcription, music style transfer and artist/genre classification) as well as ethical implications. DadaGP opens up the possibility to train GuitarPro score generators, fine-tune models on custom data, create new styles of music, AI-powered songwriting apps, and human-AI improvisation.
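
To make the idea of an event-based token encoding with a matching encoder/decoder more concrete, here is a minimal sketch in Python. It is not the DadaGP encoder and does not reproduce its vocabulary: the token strings (`note:s<string>:f<fret>`, `wait:<ticks>`), the `Note` tuple, and the `encode`/`decode` functions are all hypothetical, and the sketch assumes a single monophonic guitar track in which every note is followed by exactly one wait event.

```python
from typing import List, Optional, Tuple

# Hypothetical note representation: (string number, fret number, duration in ticks).
Note = Tuple[int, int, int]


def encode(notes: List[Note]) -> List[str]:
    """Turn tablature notes into a flat list of event tokens (illustrative format)."""
    tokens: List[str] = []
    for string, fret, ticks in notes:
        tokens.append(f"note:s{string}:f{fret}")  # which string and fret to play
        tokens.append(f"wait:{ticks}")            # how long until the next event
    return tokens


def decode(tokens: List[str]) -> List[Note]:
    """Invert encode(): pair each note token with the wait token that follows it."""
    notes: List[Note] = []
    pending: Optional[Tuple[int, int]] = None
    for token in tokens:
        kind, _, rest = token.partition(":")
        if kind == "note":
            string_part, fret_part = rest.split(":")
            pending = (int(string_part[1:]), int(fret_part[1:]))
        elif kind == "wait" and pending is not None:
            notes.append((pending[0], pending[1], int(rest)))
            pending = None
    return notes


# Two notes: open low E string, then second fret on the A string.
riff: List[Note] = [(6, 0, 480), (5, 2, 240)]
tokens = encode(riff)
restored = decode(tokens)
```

Because the tokens carry fingerings (string and fret) rather than pitches, decoding recovers the original tablature exactly, which is the property a token-to-score round trip needs.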

Authors (6)
  1. Pedro Sarmento
  2. Adarsh Kumar
  3. CJ Carr
  4. Zack Zukowski
  5. Mathieu Barthet
  6. Yi-Hsuan Yang
Citations (25)
