MoisesDB Multitrack Music Dataset
- MoisesDB is a comprehensive multitrack dataset with hierarchically organized raw audio tracks, supporting granular music source separation.
- The dataset employs a two-level taxonomy to group audio tracks into meaningful stems, facilitating both aggregate and fine-grained source separation experiments.
- MoisesDB offers a Python API and benchmark evaluations using oracle methods and neural models, enabling detailed performance analysis via SDR and related metrics.
MoisesDB is a large-scale, publicly available multitrack dataset developed to advance music source separation research beyond the traditional four-stem paradigm. Comprising 240 stereo songs from 47 distinct artists and spanning 12 high-level genres, MoisesDB addresses the data scarcity that historically limited separation systems to vocals, drums, bass, and "other" stems. Each song includes individually recorded raw audio tracks organized into a two-level hierarchical stem taxonomy, facilitating fine-grained, configurable separation strategies and enabling research at increased stem granularities (Pereira et al., 2023).
1. Dataset Composition
MoisesDB contains 240 stereo songs with a total duration of 14 hours, 24 minutes, and 46 seconds. The dataset encompasses 47 distinct artists across 12 genres including, but not limited to, Rock, Pop, Jazz, Electronic, and Folk. The genre distribution exhibits a power-law characteristic, with a small number of genres accounting for the majority of tracks.
Each song comprises "raw" audio tracks (e.g., snare drum, acoustic guitar, cello), which are semantically grouped into stems per the designated taxonomy. The number of stems per track ranges from three to ten, reflecting the diversity of instrumentation and production styles. "Vocals," "drums," and "bass" stems are present in nearly all songs. In contrast, stems such as "wind" and "other plucked" are comparatively rare. This natural imbalance reproduces real-world catalog characteristics, providing a realistic and challenging environment for source separation systems.
2. Hierarchical Stem Taxonomy
MoisesDB implements a two-level hierarchical taxonomy to group its raw audio tracks into musically and operationally meaningful stems. There are 11 top-level stems, each further subdivided into specific sub-stems. This structure mirrors the workflow of practical mixing sessions and enables both granular and aggregate source separation experiments.
| Top-Level Stem | Selected Sub-Stems |
|---|---|
| Bass | Bass Guitar, Bass Synthesizer, Contrabass |
| Bowed Strings | Cello, Cello Section, String Section, Viola Solo |
| Drums | Cymbals, Drum Machine, Kick Drum, Snare Drum, Toms |
| Guitar | Acoustic Guitar, Clean Electric, Distorted Electric |
| Other | Fx |
| Other Keys | Organ, Electric Organ, Synth Lead, Synth Pad |
| Other Plucked | Banjo, Mandolin, Ukulele, Harp |
| Percussion | Pitched Percussion, A-Tonal Percussion |
| Piano | Electric Piano, Grand Piano |
| Vocals | Lead Female Singer, Lead Male Singer, Background |
| Wind | Brass, Flutes, Reeds, Other Wind |
The taxonomy's granularity supports separation models capable of distinguishing, for example, between "guitar" and "other plucked" sources, and offers a realistic testbed for error analysis across related classes.
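To illustrate, raw tracks or top-level stems can be regrouped into user-defined targets with a simple mapping; the grouping below is a hypothetical example for sketching the idea, not part of the MoisesDB package:

```python
import numpy as np

# Hypothetical user-defined grouping of top-level stems into coarser
# targets; group names and membership are illustrative only.
STEM_GROUPS = {
    "rhythm_section": ["drums", "percussion", "bass"],
    "harmony": ["guitar", "piano", "other_keys", "other_plucked"],
    "vocals": ["vocals"],
}

def build_custom_stems(stems, groups=STEM_GROUPS):
    """Sum (2, samples) stem arrays into user-defined groups,
    skipping stems that are absent from a given song."""
    out = {}
    for target, members in groups.items():
        present = [stems[m] for m in members if m in stems]
        if not present:
            continue
        # Trim to a common length before summing
        min_len = min(a.shape[-1] for a in present)
        out[target] = np.sum([a[..., :min_len] for a in present], axis=0)
    return out
```

Because some stems are absent from many songs, any such grouping should tolerate missing keys, as the sketch does.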
3. Data Access and Usage
MoisesDB is distributed with an accompanying Python package available from the Python Package Index (PyPI), which handles metadata parsing, stem construction, mixing, and I/O. Installation is performed via:
```bash
pip install moisesdb
```
The API enables downloading, inspecting, and processing tracks:
```python
from moisesdb.dataset import MoisesDB

db = MoisesDB(data_path='./moises-db-data', download=True)
print(f"Total tracks available: {len(db)}")

track0 = db[0]
mix = track0.audio    # numpy array: (2, samples)
stems = track0.stems  # e.g. {'vocals': array, 'drums': array, ...}
track0.save_stems('my_output/track_000')
```
Preprocessing for machine learning pipelines typically involves stacking stem sources into tensors or computing short-time Fourier transforms (STFTs) on the fly. The dataset structure supports rapid prototyping for a variety of separation tasks.
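A minimal preprocessing sketch along these lines, assuming a stems dict as returned by `track.stems` and using SciPy (not part of the MoisesDB package) for the STFT:

```python
import numpy as np
from scipy.signal import stft

def stems_to_stft_tensor(stems, n_fft=2048, hop=512):
    """Stack a stems dict into a single complex STFT tensor.

    stems maps name -> (2, samples) arrays, as returned by
    track.stems; output shape is (n_stems, 2, n_fft // 2 + 1, frames).
    """
    # Align all stems to a common length before stacking
    min_len = min(a.shape[-1] for a in stems.values())
    stacked = np.stack([a[..., :min_len] for a in stems.values()])
    _, _, spec = stft(stacked, nperseg=n_fft, noverlap=n_fft - hop, axis=-1)
    return spec
```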
4. Baseline Performance and Evaluation Metrics
MoisesDB includes benchmark performance results using three oracle methods—Ideal Binary Mask (IBM), Ideal Ratio Mask (IRM), and Multichannel Wiener Filter (MWF)—as well as two open-source neural architectures: HT-Demucs and Spleeter. The primary evaluation metric is Source-to-Distortion Ratio (SDR):
$$\mathrm{SDR}(s, \hat{s}) = 10 \log_{10} \frac{\sum_{n} s(n)^2 + \epsilon}{\sum_{n} \bigl(s(n) - \hat{s}(n)\bigr)^2 + \epsilon}$$

where $s$ is the reference signal, $\hat{s}$ the estimate, and $\epsilon$ a small constant; the related Source-to-Interference Ratio (SIR) and Source-to-Artifact Ratio (SAR) metrics are also cited for comprehensive evaluation.
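As a minimal NumPy sketch of this formula (independent of evaluation toolkits such as museval):

```python
import numpy as np

def sdr(reference, estimate, eps=1e-8):
    """Source-to-Distortion Ratio in dB, per the formula above.

    reference and estimate are arrays of identical shape,
    e.g. (2, samples); eps guards against division by zero on silence.
    """
    num = np.sum(reference ** 2) + eps
    den = np.sum((reference - estimate) ** 2) + eps
    return 10.0 * np.log10(num / den)
```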
Representative SDR values (mean ± standard deviation, with median in parentheses) for the primary stem configurations; N is the number of songs evaluated:

| Stems | Model/Oracle | SDR (dB) | N |
|---|---|---|---|
| 4 (vocals, drums, bass, other) | HT-Demucs | 9.91 ± 3.27 (9.69) | 235 |
| | Spleeter | 6.29 ± 2.47 (6.24) | |
| | IBM | 7.14 ± 2.28 (6.99) | |
| | IRM | 8.97 ± 2.16 (8.81) | |
| | MWF | 9.08 ± 2.15 (8.87) | |
| 5 (+ piano) | Spleeter | 4.66 ± 3.20 (5.02) | 104 |
| | IBM | 5.12 ± 2.81 (4.87) | |
| | IRM | 7.65 ± 2.66 (7.60) | |
| | MWF | 7.81 ± 2.66 (7.83) | |
| 6 (+ guitar) | HT-Demucs | 6.24 ± 5.17 (6.05) | 88 |
| | IBM | 5.12 ± 2.81 (4.87) | |
| | IRM | 6.91 ± 2.70 (6.69) | |
| | MWF | 7.06 ± 2.73 (6.89) | |
Notably, HT-Demucs outperforms some oracle-based upper bounds (e.g., IBM) on bass and drums in both the 4- and 6-stem configurations, underscoring how far neural separation models have advanced.
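For context, the oracle baselines compute masks from the ground-truth stems. A minimal sketch of a magnitude-domain Ideal Ratio Mask follows; the exact mask variant used in the benchmarks (e.g., power-based) may differ:

```python
import numpy as np

def ideal_ratio_mask_estimate(source_specs, target, eps=1e-8):
    """Oracle IRM estimate of one source in the STFT domain.

    source_specs: dict of complex STFTs (identical shapes) for all
    ground-truth sources; target: key of the source to isolate.
    """
    mags = {name: np.abs(spec) for name, spec in source_specs.items()}
    mask = mags[target] / (sum(mags.values()) + eps)
    mixture = sum(source_specs.values())  # linear mix in the STFT domain
    return mask * mixture
```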
5. Dataset Analysis and Characteristics
MoisesDB mirrors real-world catalog imbalances: a subset of genres (e.g., pop/rock) and artists are overrepresented. Almost all tracks include "vocals," "drums," and "bass," with underrepresented classes such as "wind" and "other plucked" providing opportunities to evaluate rare-class separation.
Track durations average 3 minutes 36 seconds (standard deviation 66 seconds). The dataset consists of unmastered mixes, which exhibit lower loudness (approximately –15 LUFS) and higher dynamic range than typical commercial releases. Models trained on MoisesDB may therefore need adaptation strategies for deployment on mastered audio.
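For example, integrated loudness can be measured with the third-party pyloudnorm package (an illustrative choice, not part of the MoisesDB tooling); note that it expects audio shaped (samples, channels), while the API returns (channels, samples):

```python
import pyloudnorm as pyln

# mix: (2, samples) array from track.audio, assumed to be 44.1 kHz
meter = pyln.Meter(44100)                    # ITU-R BS.1770 loudness meter
loudness = meter.integrated_loudness(mix.T)  # transpose to (samples, channels)
print(f"Integrated loudness: {loudness:.1f} LUFS")
```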
6. Research and Application Scenarios
MoisesDB enables diverse research directions:
- Fine-grained source separation: Models can be trained and evaluated on granularities ranging from three to ten stems, moving beyond canonical four-stem paradigms.
- On-the-fly data augmentation: By grouping raw tracks into user-defined stems, novel mixtures and training examples can be synthesized programmatically (see the sketch after this list).
- Error analysis: The hierarchical taxonomy allows for targeted confusion analysis between close instrument families (e.g., "guitar" vs. "other plucked").
- Cross-task integration: Separated stems can be used for downstream music information retrieval (MIR) tasks including chord estimation, melody extraction, and educational applications such as karaoke or play-along track creation.
- Domain adaptation: Because the mixes are unmastered, the dataset facilitates investigations into the impact of mastering, as well as hybrid training with mastered datasets.
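A minimal sketch of such stem-level augmentation, assuming a stems dict of (2, samples) arrays as returned by the API; the dropout probability and gain range are illustrative choices, not prescribed by the dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_mixture(stems, drop_prob=0.2, gain_db=(-6.0, 6.0)):
    """Synthesize a randomized training mixture from a stems dict.

    Each stem is dropped with probability drop_prob and otherwise
    scaled by a random gain; returns (mixture, per-stem targets).
    """
    targets = {}
    for name, audio in stems.items():
        if rng.random() < drop_prob:
            continue
        gain = 10.0 ** (rng.uniform(*gain_db) / 20.0)
        targets[name] = gain * audio
    if not targets:  # keep at least one stem so the mixture is non-silent
        name, audio = next(iter(stems.items()))
        targets[name] = audio
    # Trim to a common length, then sum into the mixture
    min_len = min(a.shape[-1] for a in targets.values())
    targets = {n: a[..., :min_len] for n, a in targets.items()}
    mixture = np.sum(list(targets.values()), axis=0)
    return mixture, targets
```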
MoisesDB's combination of scale, taxonomic detail, and public availability positions it as a valuable foundation for next-generation source separation system development and evaluation (Pereira et al., 2023).