
MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music (2402.09871v4)

Published 15 Feb 2024 in cs.SD, cs.AI, cs.MM, and eess.AS

Abstract: The rapidly evolving multimodal LLMs urgently require new benchmarks to uniformly evaluate their performance on understanding and textually describing music. However, due to semantic gaps between Music Information Retrieval (MIR) algorithms and human understanding, discrepancies between professionals and the public, and low precision of annotations, existing music description datasets cannot serve as benchmarks. To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music. We established the Caichong Music Annotation Platform (CaiMAP) that employs an innovative multi-person, multi-stage assurance method, and recruited both amateurs and professionals to ensure the precision of annotations and alignment with popular semantics. Utilizing this method, we built a dataset with multi-dimensional, high-precision music annotations, the Caichong Music Dataset (CaiMD), and carefully selected 1,000 high-quality entries to serve as the test set for MuChin. Based on MuChin, we analyzed the discrepancies between professionals and amateurs in terms of music description, and empirically demonstrated the effectiveness of annotated data for fine-tuning LLMs. Ultimately, we employed MuChin to evaluate existing music understanding models on their ability to provide colloquial descriptions of music. All data related to the benchmark, along with the scoring code and detailed appendices, have been open-sourced (https://github.com/CarlWangChina/MuChin/).

Authors (9)
  1. Zihao Wang
  2. Shuyu Li
  3. Tao Zhang
  4. Pengfei Yu
  5. Jinyang Luo
  6. Yan Liu
  7. Ming Xi
  8. Kejun Zhang
  9. Qi Wang
Citations (3)
