Generalizing MIDI Velocity Estimation Across Diverse Pianos

Determine whether automatic music transcription systems that accurately estimate MIDI velocity from Yamaha Disklavier performances can carry that capability over reliably to diverse acoustic pianos, so that accuracy does not depend on instrument-specific timbre and touch characteristics.

Background

In computational music analysis, MIDI velocity is often used as a proxy for perceived dynamics. However, it is influenced by instrument-specific timbre and touch, which makes generalization beyond the training domain difficult. Existing automatic music transcription systems can estimate MIDI velocity accurately for Yamaha Disklavier recordings, a controlled instrument setup commonly used in research.

He et al. (2025) highlight that extending this capability to diverse acoustic pianos remains unresolved. This limitation undermines pipelines that depend on MIDI velocity to infer dynamics, and it motivates end-to-end approaches that estimate dynamics directly from audio without using MIDI velocity as an intermediary.
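One way to make the generalization question concrete is to compare velocity error on the training instrument with error on other pianos. The sketch below is illustrative only and is not taken from the cited paper. It assumes that note-level reference velocities aligned to the audio are available for every piano, which is itself hard to obtain for ordinary acoustic instruments. The function names, piano labels, and velocity values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def velocity_mae_by_piano(notes):
    """Mean absolute velocity error per piano.

    notes: iterable of (piano_id, reference_velocity, estimated_velocity),
    with velocities on the MIDI scale (0-127).
    """
    errors = defaultdict(list)
    for piano_id, ref_vel, est_vel in notes:
        errors[piano_id].append(abs(ref_vel - est_vel))
    return {piano: mean(errs) for piano, errs in errors.items()}


def generalization_gap(mae_by_piano, source_piano):
    """Average MAE over the other pianos minus MAE on the source piano."""
    others = [m for piano, m in mae_by_piano.items() if piano != source_piano]
    if not others:
        raise ValueError("need at least one piano other than the source")
    return mean(others) - mae_by_piano[source_piano]


# Made-up values for illustration; these are not measurements.
notes = [
    ("disklavier", 64, 66),
    ("disklavier", 90, 88),
    ("upright_a", 64, 74),
    ("upright_a", 90, 76),
    ("grand_b", 50, 58),
    ("grand_b", 100, 92),
]

mae = velocity_mae_by_piano(notes)
gap = generalization_gap(mae, source_piano="disklavier")
```

A gap near zero would suggest that the estimator transfers across instruments, while a large positive gap would indicate dependence on the source piano. Mean absolute error is only one possible choice of metric here.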

References

While automatic music transcription (AMT) systems can accurately estimate MIDI velocity from Yamaha Disklavier piano performances, generalizing this capability across diverse pianos remains unsolved [edwards2024general].

Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network (2510.18190 - He et al., 21 Oct 2025) in Section 1 (Introduction)