Self-Supervised Beat Tracking in Musical Signals with Polyphonic Contrastive Learning (2201.01771v2)

Published 5 Jan 2022 in cs.SD, cs.LG, and eess.AS

Abstract: Annotating musical beats is a very long and tedious process. In order to combat this problem, we present a new self-supervised learning pretext task for beat tracking and downbeat estimation. This task makes use of Spleeter, an audio source separation model, to separate a song's drums from the rest of its signal. The drum signals are used as positives, and by extension negatives, for contrastive learning pre-training, while the drum-less signals are used as anchors. When pre-training a fully-convolutional and recurrent model using this pretext task, an onset function is learned. In some cases, this function is found to map to periodic elements in a song. We find that pre-trained models outperform randomly initialized models when a beat tracking training set is extremely small (less than 10 examples). When this is not the case, pre-training leads to a learning speed-up that causes the model to overfit to the training set. More generally, this work defines new perspectives in the realm of musical self-supervised learning. It is notably one of the first works to use audio source separation as a fundamental component of self-supervision.
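To make the pretext task concrete, below is a minimal sketch of the kind of contrastive objective the abstract describes: embeddings of drum-less signals act as anchors, embeddings of the matching drum stems act as positives, and drum stems from other songs in the batch act as negatives. This assumes the stems have already been separated (e.g., with Spleeter) and passed through some encoder; the InfoNCE-style loss, the temperature value, and the function names here are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def drum_contrastive_loss(anchor_emb, drum_emb, temperature=0.1):
    """InfoNCE-style loss for the drum-separation pretext task (illustrative).

    anchor_emb: (B, D) embeddings of drum-less signals (anchors).
    drum_emb:   (B, D) embeddings of the corresponding drum stems;
                drum_emb[i] is the positive for anchor_emb[i], and the
                other rows in the batch serve as negatives.
    """
    anchors = F.normalize(anchor_emb, dim=-1)
    drums = F.normalize(drum_emb, dim=-1)
    # (B, B) cosine-similarity matrix: diagonal entries are positive pairs,
    # off-diagonal entries are in-batch negatives.
    logits = anchors @ drums.t() / temperature
    targets = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, targets)

# Toy usage with random tensors standing in for encoder outputs.
if __name__ == "__main__":
    batch_size, embed_dim = 8, 128
    loss = drum_contrastive_loss(torch.randn(batch_size, embed_dim),
                                 torch.randn(batch_size, embed_dim))
    print(float(loss))
```

Using in-batch negatives this way means each drum-less anchor must pick out its own song's drum stem among the other songs' drums, which is one plausible reading of how the separated signals could drive an onset-like representation to emerge.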

Citations (2)
