MERGE -- A Bimodal Dataset for Static Music Emotion Recognition (2407.06060v1)

Published 8 Jul 2024 in cs.SD, cs.IR, cs.LG, cs.MM, and eess.AS

Abstract: The Music Emotion Recognition (MER) field has seen steady developments in recent years, with contributions from feature engineering, machine learning, and deep learning. The landscape has also shifted from audio-centric systems to bimodal ensembles that combine audio and lyrics. However, a severe lack of public and sizeable bimodal databases has hampered the development and improvement of bimodal audio-lyrics systems. This article proposes three new audio, lyrics, and bimodal MER research datasets, collectively called MERGE, created using a semi-automatic approach. To comprehensively assess the proposed datasets and establish a baseline for benchmarking, we conducted several experiments for each modality, using feature engineering, machine learning, and deep learning methodologies. In addition, we propose and validate fixed train-validate-test splits. The obtained results confirm the viability of the proposed datasets, achieving the best overall result of 79.21% F1-score for bimodal classification using a deep neural network.

Authors (6)
  1. Pedro Lima Louro (1 paper)
  2. Hugo Redinho (1 paper)
  3. Ricardo Santos (7 papers)
  4. Ricardo Malheiro (1 paper)
  5. Renato Panda (1 paper)
  6. Rui Pedro Paiva (5 papers)
Citations (2)
