
FMA: A Dataset For Music Analysis (1612.01840v3)

Published 6 Dec 2016 in cs.SD and cs.IR

Abstract: We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma

Authors (4)
  1. Michaël Defferrard (13 papers)
  2. Kirell Benzi (5 papers)
  3. Pierre Vandergheynst (72 papers)
  4. Xavier Bresson (40 papers)
Citations (390)

Summary

Analysis of the "FMA: A Dataset For Music Analysis" Paper

The paper "FMA: A Dataset For Music Analysis" presents an extensive dataset aimed at addressing the paucity of large-scale, openly accessible audio datasets in the field of Music Information Retrieval (MIR). With the increasing engagement in feature learning and end-to-end learning in MIR, a robust dataset like FMA is essential for advancing research.

Dataset Overview

The Free Music Archive (FMA) dataset comprises 917 GiB of audio: 106,574 full-length tracks from 16,341 artists and 14,854 albums, all released under Creative Commons licenses that ensure open accessibility. The tracks are organized into a hierarchical taxonomy of 161 genres, making the dataset a versatile resource for a wide range of MIR tasks. It also includes rich metadata such as artist information, track tags, user data, and free-form text like biographies, alongside pre-computed audio features that support the development and evaluation of MIR algorithms.
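The repository accompanying the paper distributes the track- and genre-level metadata as CSV files. The snippet below is a minimal sketch of loading them with pandas; the `fma_metadata` directory name, file names, and multi-row header layout are assumptions based on the repository's documented format rather than details stated in this summary.

```python
# Minimal sketch: load the FMA metadata with pandas.
# Assumes fma_metadata.zip from https://github.com/mdeff/fma has been
# extracted into ./fma_metadata/ (file names and header layout assumed).
import pandas as pd

# tracks.csv uses a two-row column header, e.g. ('track', 'genre_top').
tracks = pd.read_csv('fma_metadata/tracks.csv', index_col=0, header=[0, 1])
genres = pd.read_csv('fma_metadata/genres.csv', index_col=0)

print(tracks.shape)                                  # one row per track
print(tracks['track', 'genre_top'].value_counts().head())
print(genres.head())                                 # hierarchical genre taxonomy
```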

Comparative Analysis

The FMA dataset is benchmarked against existing audio datasets, such as GTZAN and the Million Song Dataset (MSD), and stands out by providing audio that is both high-quality and permissively licensed. Unlike other large datasets that restrict access to audio or provide only precomputed features, FMA offers full-length, high-quality tracks, facilitating in-depth feature extraction and exploration of novel end-to-end learning architectures.

Subsets and Splits

To accommodate different levels of computational resources, the dataset offers several subsets: Small (8,000 30-second clips across 8 balanced genres), Medium (25,000 30-second clips across 16 unbalanced genres), Large (30-second clips of all 106,574 tracks), and Full (the complete untrimmed audio). Furthermore, the paper proposes a standardized training, validation, and test split, ensuring the reproducibility of experiments and enabling robust benchmarking.
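With the metadata loaded as above, a subset and the proposed split can be selected directly. This is a hedged sketch: the `('set', 'subset')` and `('set', 'split')` column names and the split labels are assumed from the repository's metadata layout.

```python
# Sketch: select the small subset and its proposed training/validation/test
# split (column names and split labels assumed from the FMA repository).
import pandas as pd

tracks = pd.read_csv('fma_metadata/tracks.csv', index_col=0, header=[0, 1])

small = tracks['set', 'subset'] == 'small'
train_ids = tracks.index[small & (tracks['set', 'split'] == 'training')]
val_ids   = tracks.index[small & (tracks['set', 'split'] == 'validation')]
test_ids  = tracks.index[small & (tracks['set', 'split'] == 'test')]
print(len(train_ids), len(val_ids), len(test_ids))
```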

Potential MIR Applications

The paper outlines several MIR applications for the FMA dataset, including music classification, annotation, and genre recognition. For genre recognition, varying levels of challenge are introduced, from single-label prediction in a balanced subset to multi-label and multi-genre predictions on the full dataset. Baseline performance metrics highlight the dataset's utility but emphasize that improvements are attainable using advanced techniques.
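As an illustration of the simplest setting, single-label genre recognition on a balanced subset, the following sketch trains a logistic-regression baseline on the pre-computed features. It is not the paper's exact baseline; the three-row header of `features.csv` and the `genre_top` label column are assumptions based on the repository's metadata layout.

```python
# Sketch: single-label genre recognition from pre-computed features
# (file layout assumed to match the fma_metadata archive).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

tracks = pd.read_csv('fma_metadata/tracks.csv', index_col=0, header=[0, 1])
features = pd.read_csv('fma_metadata/features.csv', index_col=0, header=[0, 1, 2])

small = tracks['set', 'subset'] == 'small'
train = small & (tracks['set', 'split'] == 'training')
test = small & (tracks['set', 'split'] == 'test')

X_train, y_train = features[train], tracks.loc[train, ('track', 'genre_top')]
X_test, y_test = features[test], tracks.loc[test, ('track', 'genre_top')]

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print('test accuracy:', clf.score(X_test, y_test))
```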

Methodological Implications

The availability of high-quality audio allows for the exploration of deep learning techniques directly on waveforms, bypassing traditional feature extraction bottlenecks. This is particularly pertinent given the stagnation in certain MIR tasks, as highlighted by MIREX evaluations. The FMA dataset thus opens avenues for the development of sophisticated models capable of processing raw audio data effectively.
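To make this concrete, here is a small, purely illustrative PyTorch model (not an architecture from the paper) that maps a raw mono waveform directly to genre logits, the kind of end-to-end model that full-length, high-quality audio makes possible.

```python
# Illustrative end-to-end model on raw audio (not the paper's architecture):
# a 1-D CNN that maps a mono waveform to genre logits.
import torch
import torch.nn as nn

class RawAudioCNN(nn.Module):
    def __init__(self, n_genres: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=64, stride=16), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=8, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),            # collapse the time axis
        )
        self.classifier = nn.Linear(64, n_genres)

    def forward(self, waveform):                # waveform: (batch, 1, samples)
        h = self.features(waveform).squeeze(-1)
        return self.classifier(h)

model = RawAudioCNN(n_genres=8)
logits = model(torch.randn(2, 1, 44100 * 30))   # two 30-second clips at 44.1 kHz
print(logits.shape)                             # torch.Size([2, 8])
```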

Conclusion and Future Directions

By offering a comprehensive, openly accessible resource, the FMA dataset fills a critical gap in MIR research. It enables the development and evaluation of algorithms under real-world conditions, providing a testbed for future research in genre recognition, recommendation systems, and beyond. Future work should focus on validating the dataset's annotations and enhancing it with additional metadata from crowd-sourced or external resources. This dataset is poised to play a pivotal role in broadening the scope and capabilities of MIR studies.

Overall, the introduction of the FMA dataset represents a substantial contribution to the field of MIR, offering a valuable tool for advancing both theoretical understanding and practical applications in music analysis.
