Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
125 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Non-Local Musical Statistics as Guides for Audio-to-Score Piano Transcription (2008.12710v3)

Published 28 Aug 2020 in cs.SD and eess.AS

Abstract: We present an automatic piano transcription system that converts polyphonic audio recordings into musical scores. This has been a long-standing problem of music information processing, and recent studies have made remarkable progress in the two main component techniques: multipitch detection and rhythm quantization. Given this situation, we study a method integrating deep-neural-network-based multipitch detection and statistical-model-based rhythm quantization. In the first part, we conducted systematic evaluations and found that while the present method achieved high transcription accuracies at the note level, some global characteristics of music, such as tempo scale, metre (time signature), and bar line positions, were often incorrectly estimated. In the second part, we formulated non-local statistics of pitch and rhythmic contents that are derived from musical knowledge and studied their effects in inferring those global characteristics. We found that these statistics are markedly effective for improving the transcription results and that their optimal combination includes statistics obtained from separated hand parts. The integrated method had an overall transcription error rate of 7.1% and a downbeat F-measure of 85.6% on a dataset of popular piano music, and the generated transcriptions can be partially used for music performance and assisting human transcribers, thus demonstrating the potential for practical applications.

Citations (24)

Summary

We haven't generated a summary for this paper yet.