Monado SLAM Dataset: VR & Robotics Benchmark
- The Monado SLAM Dataset is a comprehensive real-world dataset that captures the head-mounted VIO/SLAM challenges of VR, MR, and robotics applications.
- It provides synchronized, multi-modal sensor data including high-resolution imagery, IMU, and magnetometer readings with precise 6-DoF ground-truth.
- The dataset rigorously benchmarks system performance under high-intensity motion, dynamic occlusions, extreme lighting, and textureless conditions.
The Monado SLAM Dataset is a publicly released, real-world dataset designed to probe and expose the limitations of current visual–inertial odometry (VIO) and simultaneous localization and mapping (SLAM) systems in egocentric, head-mounted settings. Specifically, it focuses on the unique and often underrepresented challenges found in mixed reality (MR), virtual reality (VR), and humanoid robotics applications, including high-intensity head motion, dynamic occlusions, adverse lighting, and sensor saturation in a variety of realistic scenarios (Mayo et al., 31 Jul 2025).
1. Composition and Structure
The Monado SLAM Dataset comprises 64 non-calibration sequences spanning over 5 hours and 15 minutes, sourced from consumer VR headsets:
- Valve Index, with dual high-frequency, high-resolution, forward-facing cameras plus an IMU.
- HP Reverb G2, with four portrait-oriented cameras (providing overlapping/non-overlapping views) and IMU.
- Samsung Odyssey+, with two luma cameras, IMU, and 50 Hz magnetometer data.
Data for each headset is segmented into “Calibration” sequences (for stereo and IMU extrinsics/intrinsics) and challenge-oriented sequences, with a distinct “Playing” category for the Valve Index covering gameplay (e.g., Beat Saber, Pistol Whip) that systematically induces high-intensity motion and recurrent rapid occlusions.
All recordings are provided in a format closely following the EuRoC ASL conventions. Each file includes synchronized timestamps, USB packet arrival data, and dense 6-DoF pose ground-truth, established via the Lighthouse external tracking system with ~1 cm accuracy.
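Because the recordings follow the EuRoC ASL layout, they can be parsed with a few lines of standard Python. The sketch below assumes the conventional EuRoC directory structure (`mav0/imu0/data.csv`, `mav0/cam0/data.csv`) and column order (timestamp in nanoseconds, then gyroscope and accelerometer readings); the exact paths in a given release may differ.

```python
import csv
from pathlib import Path

def load_imu(seq_dir):
    """Load a EuRoC-style IMU stream: timestamp [ns], gyro xyz [rad/s], accel xyz [m/s^2]."""
    rows = []
    with open(Path(seq_dir) / "mav0" / "imu0" / "data.csv") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line
        for r in reader:
            t_ns = int(r[0])
            gyro = tuple(float(v) for v in r[1:4])
            accel = tuple(float(v) for v in r[4:7])
            rows.append((t_ns, gyro, accel))
    return rows

def load_image_index(seq_dir, cam="cam0"):
    """Return (timestamp_ns, image_path) pairs for one camera stream."""
    base = Path(seq_dir) / "mav0" / cam
    pairs = []
    with open(base / "data.csv") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line
        for r in reader:
            pairs.append((int(r[0]), base / "data" / r[1]))
    return pairs
```

Tools such as the provided conversion utilities, or existing EuRoC loaders in SLAM frameworks, follow the same conventions.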
2. Sensor Modalities, Calibration, and Data Integrity
Each headset configuration represents typical VR/AR hardware, providing high-precision IMU data (accelerometer, gyroscope, and—where available—magnetometer) synchronized with camera streams. The dataset includes the following features:
- Synchronized high-resolution forward-view imagery.
- Accurate per-frame pose ground-truth.
- Dense timestamping to facilitate real-time replay and causality experiments.
The calibration pipeline involves:
- Stereo camera calibration to determine full intrinsics and relative camera poses, adhering to equidistant and radial-tangential distortion models.
- Visual–IMU calibration, using dynamic motion segments to precisely estimate relative pose, misalignment, and scale between each camera and the IMU (adopting calibration routines as exemplified by Basalt).
- Accurate estimation of sensor timestamp offsets (through gyroscope and visual odometry alignment), followed by re-timestamping and alignment of all sensor data.
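The last step above, estimating the camera-to-IMU time offset from gyroscope and visual-odometry alignment, is commonly done by cross-correlating the two angular-rate signals. The function below is a minimal sketch of that idea (not the dataset's actual calibration code); signal names and the search window are illustrative assumptions.

```python
import numpy as np

def estimate_time_offset(t_imu, gyro_mag, t_cam, vo_rot_rate, dt=1e-3, max_shift_s=0.1):
    """Estimate the camera-to-IMU time offset by cross-correlating the
    gyroscope angular-rate magnitude against the rotation rate recovered
    by visual odometry. Returns the delay of the camera stream relative
    to the IMU, in seconds (positive: camera timestamps lag).
    Timestamps are in seconds; a sketch under simplifying assumptions."""
    # Resample both signals onto a common uniform grid.
    t0 = max(t_imu[0], t_cam[0])
    t1 = min(t_imu[-1], t_cam[-1])
    grid = np.arange(t0, t1, dt)
    a = np.interp(grid, t_imu, gyro_mag)
    b = np.interp(grid, t_cam, vo_rot_rate)
    # Normalize so the correlation score is scale-invariant.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    max_shift = int(max_shift_s / dt)
    shifts = np.arange(-max_shift, max_shift + 1)
    scores = [np.dot(a[max(0, s):len(a) + min(0, s)],
                     b[max(0, -s):len(b) + min(0, -s)]) for s in shifts]
    # Peak correlation at shift s pairs a(t + s*dt) with b(t); negate to
    # report the camera delay d in b(t) = a(t - d).
    return -shifts[int(np.argmax(scores))] * dt
```

After the offset is estimated, all sensor streams can be re-timestamped onto a common clock, as the pipeline describes.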
Researchers are provided with not only the raw recordings but also pre-calibrated data versions and conversion utilities for compatibility with middlewares such as ROS.
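For reference, the equidistant distortion model mentioned in the stereo calibration step is the Kannala-Brandt fisheye model, in which the incidence angle is distorted by an odd polynomial before projection. A minimal sketch (parameter names follow common convention; actual values are per-sequence calibration outputs):

```python
import math

def project_equidistant(p, fx, fy, cx, cy, k1, k2, k3, k4):
    """Project a 3D point in camera coordinates to pixel coordinates
    using the equidistant (Kannala-Brandt) fisheye model."""
    x, y, z = p
    r = math.hypot(x, y)
    theta = math.atan2(r, z)  # angle of incidence from the optical axis
    # Odd-polynomial distortion of the incidence angle.
    theta_d = theta * (1 + k1 * theta**2 + k2 * theta**4
                         + k3 * theta**6 + k4 * theta**8)
    # On-axis points (r ~ 0) map to the principal point.
    scale = theta_d / r if r > 1e-12 else 1.0
    return (fx * scale * x + cx, fy * scale * y + cy)
```

The radial-tangential model used by the other cameras follows the standard Brown-Conrady formulation instead.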
3. Coverage of Challenging Scenarios
The dataset prioritizes scenarios known to degrade or defeat state-of-the-art VIO/SLAM algorithms:
- Erratic and high-acceleration motion (as seen in VR gameplay), stressing compensatory mechanisms and latency.
- Hand occlusions and rapid scene changes, breaking feature correspondences and potentially causing tracking failure.
- Lighting extremes: overexposure, low light, and abrupt illumination changes introduce sensor noise and saturation.
- Textureless environments, which reduce the efficacy of purely visual feature tracking methods.
- Sensor pathologies, including IMU saturation from sudden head jerks.
Sequences are drawn directly from naturalistic VR/AR device use, ensuring practical relevance for both XR headsets and mobile robotics research.
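IMU saturation of the kind listed above is straightforward to flag in recorded data: samples pinned at the sensor's full-scale range indicate clipping, which corrupts downstream integration. A small sketch (the ±16 g full-scale value is a common MEMS setting assumed here, not a dataset specification):

```python
import numpy as np

def find_saturated_samples(accel, full_scale=16 * 9.81, tol=0.995):
    """Return a boolean mask over (N, 3) accelerometer samples [m/s^2],
    flagging any sample with a component at (or within `tol` of) the
    assumed full-scale range, where the reading has likely clipped."""
    return (np.abs(accel) >= tol * full_scale).any(axis=1)
```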
4. Benchmarking Methodology and Evaluation Metrics
The Monado SLAM Dataset is intended as a rigorous testbed for end-to-end benchmarking and algorithmic evaluation. It adopts well-established pose evaluation conventions:
- Absolute Trajectory Error (ATE):

  $$\mathrm{ATE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left\lVert \operatorname{trans}\!\left(T_{\mathrm{gt},i}^{-1}\, T_{\mathrm{est},i}\right) \right\rVert^{2}}$$

  where $T_{\mathrm{gt},i}$ and $T_{\mathrm{est},i}$ denote the ground-truth and estimated poses, both in SE(3), after trajectory alignment.
- Relative Trajectory Error (RTE):

  $$\mathrm{RTE} = \sqrt{\frac{1}{N-\Delta}\sum_{i=1}^{N-\Delta} \left\lVert \operatorname{trans}(E_i) \right\rVert^{2}}$$

  where $E_i = \left(T_{\mathrm{gt},i}^{-1} T_{\mathrm{gt},i+\Delta}\right)^{-1}\left(T_{\mathrm{est},i}^{-1} T_{\mathrm{est},i+\Delta}\right)$ with evaluation interval $\Delta$ frames.
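A minimal numerical sketch of these metrics is given below: translational ATE RMSE after a rigid SE(3) alignment (Umeyama/Kabsch, no scale), and a translation-only simplification of RTE over a fixed frame interval. This illustrates the conventions, not the paper's exact evaluation code.

```python
import numpy as np

def ate_rmse(gt_xyz, est_xyz):
    """RMSE of translational ATE after rigidly aligning the estimate to
    the ground truth. Inputs: time-associated (N, 3) position arrays."""
    mu_g, mu_e = gt_xyz.mean(0), est_xyz.mean(0)
    # Kabsch: rotation minimizing ||gt - R @ est|| over centered points.
    H = (est_xyz - mu_e).T @ (gt_xyz - mu_g)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_g - R @ mu_e
    err = gt_xyz - (est_xyz @ R.T + t)
    return float(np.sqrt((err ** 2).sum(axis=1).mean()))

def rte_rmse(gt_xyz, est_xyz, delta=30):
    """RMSE of a translation-only RTE over a `delta`-frame interval;
    comparing displacements cancels global drift."""
    dg = gt_xyz[delta:] - gt_xyz[:-delta]
    de = est_xyz[delta:] - est_xyz[:-delta]
    err = np.linalg.norm(dg, axis=1) - np.linalg.norm(de, axis=1)
    return float(np.sqrt((err ** 2).mean()))
```

An estimate differing from ground truth only by a global rigid transform scores zero on both metrics, which is the intended invariance.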
Benchmarks are provided for several systems, including Basalt, OKVIS2, ORB-SLAM3, DM-VIO, and SnakeSLAM. The dataset exposes cases where even causal, state-of-the-art pipelines yield median ATE above 5 cm under realistic egocentric conditions, demonstrating current limitations in real-time head-mounted tracking.
5. Application Domains and Use Cases
The dataset targets multiple high-impact research and deployment settings:
- XR tracking: Directly relevant to AR, VR, and MR headsets, the Monado SLAM Dataset enables systematic evaluation of tracking fidelity, drift, and robustness during naturalistic user interaction.
- Humanoid and mobile robotics: Provides representative trajectories and sensor fusion configurations for real-world, dynamic navigation in human-centric environments.
- Multi-sensor fusion research: Diverse camera and IMU arrangements (2–4 cameras, IMU, magnetometer) support the development and stress-testing of advanced multi-modal SLAM backends.
- Algorithm comparison and validation: The dataset’s organization and calibration fidelity enable reproducible, head-to-head algorithm comparisons—including in the ROS ecosystem.
6. Accessibility, Licensing, and Research Impact
The Monado SLAM Dataset is released under the permissive Creative Commons Attribution 4.0 (CC BY 4.0) license, enabling unrestricted use, modification, and redistribution (subject to attribution). The repository includes raw and calibrated data, documentation, and conversion scripts, and is hosted on Hugging Face (https://huggingface.co/datasets/collabora/monado-slam-datasets).
This licensing and infrastructure explicitly facilitate both academic and industrial adoption. The dataset directly addresses a gap in head-mounted XR device benchmarking, where prior datasets focused on AR or generic robotics contexts, by providing the first real-world, multi-device VR sequences with challenging conditions and precise ground-truth.
7. Role in Advancing VIO/SLAM Research
The Monado SLAM Dataset advances the research frontier by:
- Enabling robust benchmarking for egocentric, head-mounted, and XR-relevant VIO/SLAM.
- Highlighting failure cases not covered in existing datasets, guiding algorithmic development.
- Supporting evaluation of calibration methods, synchronization, and real-time robustness.
- Offering tools and documentation that improve reproducibility and community engagement.
Previous research notes that realistic, difficult egocentric scenarios such as those in the Monado SLAM Dataset (MSD) are underrepresented, leaving VIO/SLAM algorithms overfitted to academic benchmarks with less adversarial conditions. The Monado SLAM Dataset thus fills a critical deficiency and is poised to shape the development of resilient, accurate trackers for next-generation mixed reality and robotics applications (Mayo et al., 31 Jul 2025).