Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
169 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Listen to Your Favorite Melodies with img2Mxml, Producing MusicXML from Sheet Music Image by Measure-based Multimodal Deep Learning-driven Assembly (2106.12037v1)

Published 16 Jun 2021 in cs.CV

Abstract: Deep learning has recently been applied to optical music recognition (OMR). However, currently OMR processing from various sheet music images still lacks precision to be widely applicable. Here, we present an MMdA (Measure-based Multimodal deep learning (DL)-driven Assembly) method allowing for end-to-end OMR processing from various images including inclined photo images. Using this method, measures are extracted by a deep learning model, aligned, and resized to be used for inference of given musical symbol components by using multiple deep learning models in sequence or in parallel. Use of each standardized measure enables efficient training of the models and accurate adjustment of five staff lines in each measure. Multiple musical symbol component category models with a small number of feature types can represent a diverse set of notes and other musical symbols including chords. This MMdA method provides a solution to end-to-end OMR processing with precision.

Citations (1)

Summary

We haven't generated a summary for this paper yet.