Barwise Music Structure Analysis with the Correlation Block-Matching Segmentation Algorithm (2311.18604v1)

Published 30 Nov 2023 in cs.SD, cs.IR, and eess.AS

Abstract: Music Structure Analysis (MSA) is a Music Information Retrieval task consisting of representing a song in a simplified, organized manner by breaking it down into sections typically corresponding to "chorus", "verse", "solo", etc. In this work, we extend an MSA algorithm called the Correlation Block-Matching (CBM) algorithm introduced by (Marmoret et al., 2020, 2022b). The CBM algorithm is a dynamic programming algorithm that segments self-similarity matrices, which are a standard description used in MSA and in numerous other applications. In this work, self-similarity matrices are computed from the feature representation of an audio signal and time is sampled at the bar-scale. This study examines three different standard similarity functions for the computation of self-similarity matrices. Results show that, in optimal conditions, the proposed algorithm achieves a level of performance which is competitive with supervised state-of-the-art methods while only requiring knowledge of bar positions. In addition, the algorithm is made open-source and is highly customizable.
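
As a rough illustration of the barwise pipeline the abstract describes (bar-level features, a self-similarity matrix, then a dynamic-programming segmentation), the sketch below builds a cosine self-similarity matrix and segments it with a simple dynamic program. It is a minimal sketch under stated assumptions, not the paper's method: the homogeneity score and fixed per-segment penalty are illustrative stand-ins for the CBM block-matching score and its regularization, and the toy feature matrix replaces real bar-synchronized audio features. The authors' actual implementation is the open-source as_seg module (reference 15).

```python
import numpy as np

def self_similarity(barwise_features: np.ndarray) -> np.ndarray:
    """Cosine self-similarity between bar-level feature vectors.

    barwise_features: array of shape (n_bars, n_features), one row per bar.
    """
    norms = np.linalg.norm(barwise_features, axis=1, keepdims=True)
    normed = barwise_features / np.maximum(norms, 1e-12)
    return normed @ normed.T  # shape (n_bars, n_bars)

def segment_ssm(ssm: np.ndarray, max_len: int = 16, penalty: float = 0.5) -> list:
    """Dynamic-programming segmentation of a self-similarity matrix.

    Each candidate segment [i, j) is scored by the mean similarity inside its
    diagonal block, scaled by its length, minus a fixed per-segment penalty.
    This is a simplified stand-in for the CBM block score, not the real one.
    Returns segment boundaries expressed in bars.
    """
    n = ssm.shape[0]
    best = np.full(n + 1, -np.inf)   # best[j] = best score for segmenting bars [0, j)
    best[0] = 0.0
    prev = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            block = ssm[i:j, i:j]
            score = best[i] + (j - i) * block.mean() - penalty
            if score > best[j]:
                best[j], prev[j] = score, i
    # Backtrack from the last bar to recover the chosen boundaries.
    bounds, j = [n], n
    while j > 0:
        j = prev[j]
        bounds.append(j)
    return sorted(bounds)

if __name__ == "__main__":
    # Toy "song": three 8-bar sections, each built from its own template vector.
    rng = np.random.default_rng(0)
    templates = np.eye(3).repeat(4, axis=1)  # three orthogonal 12-dim templates
    feats = np.vstack([templates[k] + 0.1 * rng.normal(size=(8, 12)) for k in range(3)])
    ssm = self_similarity(feats)
    print(segment_ssm(ssm))  # expected to recover boundaries near [0, 8, 16, 24]
```

In a real setting, the toy features would be replaced by bar-synchronized audio descriptors (features pooled per bar using estimated bar positions, as the abstract assumes), and the cosine similarity here is only one of the three standard similarity functions the paper compares.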

References (36)
  1. Bellman, R. (1952). On the theory of dynamic programming. Proceedings of the national Academy of Sciences, 38(8):716–719.
  2. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, 35(8):1798–1828.
  3. Deconstruct, analyse, reconstruct: How to improve tempo, beat, and downbeat estimation. In International Society for Music Information Retrieval Conference (ISMIR), pages 574–582.
  4. Multi-task learning of tempo and beat: Learning one to improve the other. In International Society for Music Information Retrieval Conference (ISMIR), pages 486–493.
  5. Madmom: A new python audio and music signal processing library. In Proceedings of the 24th ACM international conference on Multimedia, pages 1174–1178.
  6. Joint beat and downbeat tracking with recurrent neural networks. In International Society for Music Information Retrieval Conference (ISMIR), pages 255–261.
  7. Introduction to algorithms. MIT press, 3rd edition.
  8. Unveiling the hierarchical structure of music by multi-resolution community detection. Transactions of the International Society for Music Information Retrieval, 3(1):82–97.
  9. Foote, J. (2000). Automatic audio segmentation using a measure of audio novelty. In 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings Latest Advances in the Fast Changing World of Multimedia, pages 452–455. IEEE.
  10. A music structure informed downbeat tracking system using skip-chain conditional random fields and deep learning. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 481–485. IEEE.
  11. RWC Music Database: Popular, Classical and Jazz Music Databases. In International Society for Music Information Retrieval Conference (ISMIR), pages 287–288.
  12. Music boundary detection using neural networks on combined features and two-level annotations. In International Society for Music Information Retrieval Conference (ISMIR), pages 531–537.
  13. Modeling beats and downbeats with a time-frequency transformer. In 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 401–405. IEEE.
  14. Jensen, K. (2006). Multiple scale music segmentation using rhythm, timbre, and harmony. EURASIP Journal on Advances in Signal Processing, 2007:1–11.
  15. as_seg: module for computing and segmenting autosimilarity matrices.
  16. Uncovering audio patterns in music with nonnegative Tucker decomposition for structural segmentation. In International Society for Music Information Retrieval Conference (ISMIR), pages 788–794.
  17. Barwise compression schemes for audio-based music structure analysis. In 19th Sound and Music Computing Conference, SMC 2022. Sound and music Computing network.
  18. Using musical structure to enhance automatic chord transcription. In International Society for Music Information Retrieval Conference (ISMIR), pages 231–236.
  19. McCallum, M. C. (2019). Unsupervised learning of deep features for music segmentation. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 346–350. IEEE.
  20. Analyzing song structure with spectral clustering. In International Society for Music Information Retrieval Conference (ISMIR), pages 405–410.
  21. Learning to segment songs with ordinal linear discriminant analysis. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5197–5201. IEEE.
  22. Evaluating hierarchical structure in music annotations. Frontiers in psychology, 8:1337.
  23. Systematic exploration of computational music structure research. In International Society for Music Information Retrieval Conference (ISMIR), pages 547–553.
  24. Audio-based music structure analysis: Current trends, open challenges, and applications. Transactions of the International Society for Music Information Retrieval, 3(1).
  25. Semantic segmentation of music audio. In Proceedings of the 2005 International Computer Music Conference, page 61. Computer Music Association.
  26. Phase-aware joint beat and downbeat estimation based on periodicity of metrical structure. In International Society for Music Information Retrieval Conference (ISMIR), pages 493–499.
  27. State of the art report: Audio-based music structure analysis. In International Society for Music Information Retrieval Conference (ISMIR), pages 625–636.
  28. Raffel, C. et al. (2014). mir_eval: A transparent implementation of common MIR metrics. In International Society for Music Information Retrieval Conference (ISMIR), pages 367–372.
  29. Deep embeddings and section fusion improve music segmentation. IEEE Signal Processing Letters, 24(3):279–283.
  30. Estimating the structural segmentation of popular music pieces under regularity constraints. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(2):344–358.
  31. Unsupervised music structure annotation by time series structure features and segment similarity. IEEE Transactions on Multimedia, 16(5):1229–1240.
  32. Similarity matrix processing for music structure analysis. In Proceedings of the 1st ACM workshop on Audio and music computing multimedia, pages 69–76.
  33. Design and creation of a large-scale database of structural annotations. In International Society for Music Information Retrieval Conference (ISMIR), pages 555–560.
  34. A supervised approach for detecting boundaries in music using difference features and boosting. In International Society for Music Information Retrieval Conference (ISMIR), pages 51–54.
  35. Boundary detection in music structure analysis using convolutional neural networks. In International Society for Music Information Retrieval Conference (ISMIR), pages 417–422.
  36. Supervised metric learning for music structure feature. In International Society for Music Information Retrieval Conference (ISMIR), pages 730–737.