Evaluation metrics for flexible but accurate music–multimodal alignment
Develop objective evaluation metrics for cross-modal alignment between music and other modalities (such as video). Such metrics should accept multiple valid pairings for a given input while still quantifying alignment quality accurately, balancing flexibility in artistic pairing against rigorous alignment assessment.
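To make the "multiple valid pairings" requirement concrete, here is a minimal sketch of a retrieval-style score that counts a query as correct if any acceptable track appears in the top k results. It is an illustration only, not a metric proposed by the cited survey. The function name `multi_positive_recall_at_k`, the video and track identifiers, and the assumption that a set of acceptable tracks per video is available are all hypothetical.

```python
from typing import Dict, List, Set


def multi_positive_recall_at_k(
    ranked: Dict[str, List[str]],
    valid: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of queries whose top-k results contain at least one acceptable item.

    ranked: maps each video id to its retrieved track ids, best first.
    valid:  maps each video id to the set of tracks judged acceptable
            (assumed to come from human annotation; hypothetical here).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not ranked:
        return 0.0
    hits = 0
    for video_id, tracks in ranked.items():
        acceptable = valid.get(video_id, set())
        # Any acceptable track in the top k counts as a hit, so the score
        # does not penalize a system for choosing a different valid pairing.
        if any(track in acceptable for track in tracks[:k]):
            hits += 1
    return hits / len(ranked)


# Toy data, invented for illustration.
ranked = {
    "video_a": ["track_3", "track_1", "track_7"],
    "video_b": ["track_5", "track_2", "track_9"],
}
valid = {
    "video_a": {"track_1", "track_4"},  # two acceptable pairings
    "video_b": {"track_8"},
}

score = multi_positive_recall_at_k(ranked, valid, k=2)
print(score)  # 0.5: video_a is a hit via track_1, video_b is a miss
```

A score of this kind covers only the tolerance side of the problem. It treats every acceptable track as equally good and says nothing about how well a chosen track aligns with the video (for example in timing or mood), which is the part the open challenge asks a metric to capture as well.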
References
For example, a single video may be effectively paired with multiple music tracks, each creating a distinct but coherent audiovisual experience. This raises an open challenge: how to design evaluation metrics that balance tolerance for flexible pairings with the need for alignment accuracy.
— A Survey on Cross-Modal Interaction Between Music and Multimodal Data
(2504.12796 - Li et al., 17 Apr 2025) in Subsection "Evaluation", Section 6 (Dataset and Evaluation)