Perception-Inspired Graph Convolution for Music Understanding Tasks (2405.09224v1)
Abstract: We propose a new graph convolutional block, called MusGConv, specifically designed for the efficient processing of musical score data and motivated by general perceptual principles. It focuses on two fundamental dimensions of music, pitch and rhythm, and considers both relative and absolute representations of these components. We evaluate our approach on four different music understanding problems: monophonic voice separation, harmonic analysis, cadence detection, and composer identification. In abstract terms, these translate to different graph learning problems, namely node classification, link prediction, and graph classification. Our experiments demonstrate that MusGConv improves performance on three of these four tasks while remaining conceptually simple and efficient. We interpret this as evidence that it is beneficial to include perception-informed processing of fundamental musical concepts when developing graph network applications on musical score data.
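To make the distinction between absolute and relative representations concrete, here is a minimal sketch. It is not the authors' implementation: the `Note` layout, the function names, and the choice of pitch interval and onset difference as edge features are assumptions made for illustration. It shows that relative features computed along graph edges stay the same when the whole score is transposed, while the absolute pitches do not.

```python
from typing import List, Tuple

# Hypothetical note layout for this sketch: (MIDI pitch, onset in beats).
Note = Tuple[int, float]
Edge = Tuple[int, int]  # directed edge (source note index, target note index)


def relative_edge_features(notes: List[Note], edges: List[Edge]) -> List[Tuple[int, float]]:
    """Return (pitch interval, onset difference) for each directed edge."""
    feats = []
    for u, v in edges:
        pitch_u, onset_u = notes[u]
        pitch_v, onset_v = notes[v]
        feats.append((pitch_v - pitch_u, onset_v - onset_u))
    return feats


def transpose(notes: List[Note], semitones: int) -> List[Note]:
    """Shift every absolute pitch by the same number of semitones."""
    return [(pitch + semitones, onset) for pitch, onset in notes]


# A C major arpeggio: C4, E4, G4 on consecutive beats.
notes = [(60, 0.0), (64, 1.0), (67, 2.0)]
edges = [(0, 1), (1, 2)]

original = relative_edge_features(notes, edges)
shifted = relative_edge_features(transpose(notes, 5), edges)

# Absolute pitches change under transposition; relative features do not.
print(original == shifted)
```

In a graph network, features like these could be attached to edges and combined with the absolute node features during message passing; how MusGConv actually combines them is specified in the paper, not in this sketch.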