BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer (2312.17156v3)
Abstract: Many deep learning models have achieved dominant performance on the offline beat tracking task. However, online beat tracking, in which only past and present input features are available, remains challenging. In this paper, we propose BEAt tracking Streaming Transformer (BEAST), an online joint beat and downbeat tracking system based on the streaming Transformer. To handle online scenarios, BEAST applies contextual block processing in the Transformer encoder. Moreover, we adopt relative positional encoding in the attention layers of the streaming Transformer encoder to capture the relative timing of events, which is critical information in music. Carrying out beat and downbeat experiments on benchmark datasets in a low-latency scenario (maximum latency under 50 ms), BEAST achieves an F1-measure of 80.04% on beat and 46.78% on downbeat, a substantial improvement of about 5 percentage points over the state-of-the-art online beat tracking model.
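As a rough illustration of the two ingredients the abstract names, the sketch below implements single-head self-attention with a causal (online) mask and relative positional encoding in the style of Shaw et al. (2018). This is not the paper's implementation; all names, shapes, and the choice of NumPy are assumptions for exposition.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_rel_attention(X, Wq, Wk, Wv, rel_emb, max_dist):
    """Single-head causal self-attention with relative positional
    encoding (Shaw et al., 2018 style) -- an illustrative sketch,
    not BEAST's actual architecture.

    X:        (T, d) input frames (e.g. spectrogram features)
    rel_emb:  (2*max_dist + 1, d) learned embeddings for clipped
              relative offsets j - i in [-max_dist, max_dist]
    """
    T, d = X.shape
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Content score plus relative-position score a_ij = q_i . r_{clip(j-i)},
    # so the model sees *relative* timing rather than absolute positions.
    offsets = np.arange(T)[None, :] - np.arange(T)[:, None]
    idx = np.clip(offsets, -max_dist, max_dist) + max_dist
    scores = Q @ K.T / np.sqrt(d) + np.einsum('id,ijd->ij', Q, rel_emb[idx])
    # Online constraint: frame i may only attend to frames j <= i.
    scores = np.where(offsets > 0, -1e9, scores)
    return softmax(scores) @ V
```

Because of the causal mask, perturbing a future frame leaves all earlier outputs unchanged, which is the defining property of the online setting described above.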