DeepSense-6G Scenario-33: Multimodal V2I Dataset
- DeepSense-6G Scenario-33 is a comprehensive multimodal collection that synchronizes mmWave, radar, LiDAR, camera, and GPS data for V2I applications.
- The dataset is designed to benchmark beam prediction, with multimodal fusion techniques reaching Top-3 accuracy above 70%.
- Researchers benefit from its precise sensor synchronization, complete calibration protocols, and clear training, validation, and testing splits for 6G ISAC studies.
The DeepSense-6G Scenario-33 Dataset is a multi-modal vehicle-to-infrastructure (V2I) sensing and communication database, specifically designed to enable and benchmark beam prediction and multimodal fusion in millimeter-wave (mmWave) wireless networks. Scenario-33 captures synchronized measurements from a roadside base station observing a passing vehicle, aggregating sensor streams including 60 GHz mmWave phased-array power vectors, FMCW radar, RGB stereo camera, 3D LiDAR, and GPS-RTK, with exhaustive time-matched annotations of optimal beam selection at the infrastructure receiver. This dataset supports research in 6G integrated sensing and communication (ISAC), low-latency beamforming, multimodal deep learning, and cross-sensor fusion (Demirhan et al., 2021, Alkhateeb et al., 2022, Orimogunje et al., 3 Jan 2026).
1. Physical Environment and Deployment
Scenario-33 is centered on a V2I road segment in an urban setting and was measured at night to capture distinctive ambient-illumination and occlusion conditions (Alkhateeb et al., 2022). The base station (BS, Unit 1) is positioned at lamp-post height (approximately 3 m), adjacent to a two-lane straight asphalt road. The mmWave receiver array, radar, and other sensors are co-located on a single mast with tight geometric separation (≲10 cm). The Cartesian reference frame has its origin at the radar phase center; the x-axis aligns with the array boresight and the y-axis points rightward. The mobile transmitter (Unit 2) is mounted in a passenger vehicle executing lateral passes at distances R ∈ [5 m, 25 m], followed by road turns. Base-station coverage spans ±45° azimuth and ±15° elevation; line-of-sight prevails for most samples, though transient blockages (vehicles/pedestrians) are present.
2. Sensor Modalities and Hardware Specifications
All sensors are hardware-synchronized via GNSS PPS and FPGA reference clocks, permitting millisecond-scale temporal alignment. Below is a summary of sensing modalities and critical parameters:
| Modality | Hardware/Specs | Sampling Rate |
|---|---|---|
| mmWave Rx array | 60 GHz, 16-element ULA, 64 beams | 10–20 Hz |
| RGB stereo camera | ZED2, 1080×720/540×960, 30 fps | 30 Hz |
| 3D LiDAR | Ouster OS1-32, 32×1024, 120 m | 20 Hz |
| FMCW radar | TI AWR1843, 76–81 GHz, 4Rx, 4Tx | 20 Hz |
| GPS-RTK | SparkFun; centimeter horizontal | 10 Hz |
- mmWave receiver: phased array covers ±45° azimuth sector using 64 DFT-style analog-combined beams, with approximately 1 GHz RF bandwidth.
- Camera: stereo RGB with hardware timestamps and global shutter, 110° horizontal FoV.
- LiDAR: rotating 32-channel unit producing 32 × 1024 = 32,768 points per scan (consistent with the 32×1024 resolution above).
- Radar: 4 GHz sweep bandwidth, 256 chirps per frame, range resolution ≈ 0.2 m.
- GPS-RTK: provides ground-truth trajectory data, used both for scenario filtering and precision annotation.
3. Data Collection, Synchronization, and Structure
Each measurement instance logs synchronized sensor outputs indexed to a master UTC timestamp. Synchronization is enforced via hardware triggers issued after each mmWave beam sweep:
- mmWave: exhaustive sweep across the 64-element codebook; received power vector stored.
- Camera: triggered exposure aligns with mmWave frame; rectified images.
- LiDAR: azimuthal scan aligned and registered using T_LiDAR→BS calibration.
- Radar: raw I/Q samples (A×S×C), processed to range–angle (RA) and range–velocity (RV) magnitude maps via 2D FFTs.
- GPS: WGS84 log, extrinsically transformed to BS-aligned Cartesian (consistent with T_bs→world).
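The radar step above (raw I/Q cube to RA/RV magnitude maps via 2D FFTs) can be sketched as follows; array sizes and axis ordering are illustrative assumptions, not the dataset's exact shapes:

```python
import numpy as np

def radar_maps(iq, n_range=256, n_angle=64, n_doppler=64):
    """Convert a raw radar cube (antennas x samples x chirps) to
    range-angle (RA) and range-velocity (RV) magnitude maps."""
    # Range FFT along the fast-time (sample) axis
    rng = np.fft.fft(iq, n=n_range, axis=1)
    # RA map: FFT across the antenna axis, magnitude averaged over chirps
    ra = np.abs(np.fft.fftshift(np.fft.fft(rng, n=n_angle, axis=0), axes=0))
    ra_map = ra.mean(axis=2)            # shape (n_angle, n_range)
    # RV map: FFT across the chirp (slow-time) axis, averaged over antennas
    rv = np.abs(np.fft.fftshift(np.fft.fft(rng, n=n_doppler, axis=2), axes=2))
    rv_map = rv.mean(axis=0)            # shape (n_range, n_doppler)
    return ra_map, rv_map

# Synthetic cube: 4 Rx antennas, 256 fast-time samples, 256 chirps
iq = np.random.randn(4, 256, 256) + 1j * np.random.randn(4, 256, 256)
ra_map, rv_map = radar_maps(iq)
```

The same pattern (range FFT first, then an angle or Doppler FFT on the shared intermediate) matches the standard FMCW processing chain the dataset describes.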
Directory structure follows strict modality separation, with raw and processed subtrees and millisecond-resolved file naming:
```text
Scenario33/
  raw/
    mmwave/
    camera/
    lidar/
    radar/
    gps/
  processed/
    aligned_lidar/
    aligned_radar/
    beams/
  sync_index.csv
  metadata.json
```
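A minimal loader for the per-sample synchronization index might look like the sketch below; the column names and file paths are assumptions for illustration, not the dataset's documented schema:

```python
import csv
import io

# Hypothetical sync_index.csv content: one row per synchronized instance,
# mapping the master timestamp to each modality's file (layout assumed).
SAMPLE_INDEX = """timestamp_ms,mmwave,camera,lidar,radar,gps
1650000000123,raw/mmwave/000001.npy,raw/camera/000001.jpg,raw/lidar/000001.pcd,raw/radar/000001.npy,raw/gps/000001.json
1650000000223,raw/mmwave/000002.npy,raw/camera/000002.jpg,raw/lidar/000002.pcd,raw/radar/000002.npy,raw/gps/000002.json
"""

def load_index(text):
    """Parse the index into {timestamp_ms: row}, grouping the modality
    files that belong to one synchronized multimodal instance."""
    return {int(row["timestamp_ms"]): row
            for row in csv.DictReader(io.StringIO(text))}

index = load_index(SAMPLE_INDEX)
sample = index[1650000000123]   # all modalities for one instance
```

In practice one would read the CSV from `Scenario33/sync_index.csv` and resolve the relative paths against the archive root.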
4. Signal Models, Calibration, and Feature Extraction
Beam indices are annotated using an oracle rule:

$$b^{\star} = \arg\max_{b \in \{1, \dots, 64\}} P_b$$

where $P_b = \lvert \mathbf{f}_b^{H}\mathbf{y} \rvert^2$ is the beamformed receive power for the $b$-th codebook vector $\mathbf{f}_b$ applied to the received signal $\mathbf{y}$.
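The oracle labeling rule reduces to an argmax over the measured power vector; a minimal sketch, using random stand-in data in place of a real 64-beam measurement:

```python
import numpy as np

# Oracle beam labeling sketch: the ground-truth label is the codebook
# index with the highest beamformed receive power. The 64-entry power
# vector here is random stand-in data, not a real measurement.
rng = np.random.default_rng(0)
power = rng.random(64)                  # received power per codebook beam
b_star = int(np.argmax(power))          # oracle beam index
top3 = np.argsort(power)[::-1][:3]      # candidate set used by Top-3 accuracy
```

The same `top3` ranking is what Top-k metrics compare against the oracle label.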
Camera images are rectified using the intrinsic matrix and extrinsics obtained from checkerboard calibration. LiDAR clouds are transformed into the BS frame using T_LiDAR→BS. Radar samples undergo FFT-based transformation:
- Range–Angle (RA) heatmap: FFT over fast-time samples (range) and across antenna elements (angle).
- Range–Velocity (RV) heatmap: range FFT followed by an FFT over chirps (Doppler).

Ground-truth positions enable filtering and coordinate alignment. All non-mmWave modalities are reduced to input tensors through denoising (LiDAR: statistical radius filtering; radar: background subtraction; camera: exposure normalization).
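The LiDAR denoising step (statistical radius filtering) can be sketched as below; the neighbor count `k` and threshold `alpha` are illustrative parameters, and real pipelines would use a KD-tree rather than brute-force distances:

```python
import numpy as np

def statistical_radius_filter(points, k=8, alpha=2.0):
    """Drop points whose mean distance to their k nearest neighbors is
    more than alpha standard deviations above the cloud-wide average
    (brute-force sketch of statistical outlier removal)."""
    # Pairwise Euclidean distances, self-distance masked out
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # Mean distance to the k nearest neighbors of each point
    knn_mean = np.sort(d, axis=1)[:, :k].mean(axis=1)
    keep = knn_mean <= knn_mean.mean() + alpha * knn_mean.std()
    return points[keep]

# Dense synthetic cluster plus one distant outlier point
rng = np.random.default_rng(1)
cloud = np.vstack([rng.standard_normal((200, 3)) * 0.1, [[50.0, 50.0, 50.0]]])
filtered = statistical_radius_filter(cloud)
```

The outlier's neighbor distances dwarf the cluster's, so it alone is removed.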
5. Dataset Composition, Splits, and Labels
Scenario-33 provides a corpus of synchronized measurement instances. Each sample contains:
- mmWave power vector: 64-element float vector, the optimal beam index, and received power in dBm
- RGB camera image (UInt8)
- LiDAR point cloud (float32 array)
- Radar I/Q cube (float32, 256 chirps)
- GPS coordinates (float64 latitude/longitude)
Splits are conventionally set at 70% train, 15% validation, and 15% test, applied consistently in scenario-specific usage (Demirhan et al., 2021, Alkhateeb et al., 2022, Orimogunje et al., 3 Jan 2026). Object detections and bounding boxes (YOLOv3) are optionally provided in JSON format for vision-fusion tasks.
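The conventional 70/15/15 partition can be sketched as a seeded shuffle-and-cut over sample indices (the seed and index scheme are illustrative assumptions):

```python
import numpy as np

def make_splits(n, seed=0):
    """Shuffle n sample indices and cut the conventional
    70% train / 15% validation / 15% test partition."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = int(0.70 * n), int(0.15 * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = make_splits(1000)
```

Fixing the seed keeps the partition reproducible across training runs, which matters when comparing fusion models on the same held-out instances.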
6. Benchmarking Tasks and Usage Guidelines
Scenario-33 has established itself as a core resource for 6G beam prediction and ISAC evaluations. Key research tasks supported include:
- Beam prediction: classification of the optimal mmWave receive-beam from multimodal sensor fusion, with Top-3 accuracy >70% using exteroceptive inputs; mmWave alone and mmWave+LiDAR/GPS/Radar achieve up to 98% Top-5, mmWave+camera up to 94% Top-5 (Orimogunje et al., 3 Jan 2026).
- User positioning: mmWave angle-of-arrival fused with LiDAR maps to refine vehicle location.
- Blockage/obstruction identification: transient radar cross-section analysis.
- Transfer learning: train/validate generalization from daytime scenarios (31, 32) to Scenario-33 (night).

Performance metrics include Top-k accuracy, spectral-efficiency gap, SNR gap, rate loss, and end-to-end latency.
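The Top-k accuracy metric cited above can be computed as follows; the toy scores and labels are fabricated for illustration only:

```python
import numpy as np

def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true beam index appears among the
    k highest-scoring beams predicted by the model."""
    # Rank beams per sample in descending score order, keep the top k
    topk = np.argsort(scores, axis=1)[:, ::-1][:, :k]
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Toy check: 3 samples over a 5-beam codebook
scores = np.array([[0.1, 0.9, 0.3, 0.2, 0.0],
                   [0.5, 0.1, 0.2, 0.9, 0.3],
                   [0.9, 0.1, 0.2, 0.3, 0.4]])
labels = np.array([1, 0, 2])
acc = top_k_accuracy(scores, labels, k=3)
```

Top-1 accuracy is the special case k=1; the >70% Top-3 figures reported for Scenario-33 correspond to k=3 over the 64-beam codebook.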
7. Data Access, Extrinsic Metadata, and Calibration
Scenario-33 is available for download via https://deepsense6g.net/ under “Scenarios,” “Testbed 5,” “Scenario 33.” The ≈25 GB archive provides all raw and processed data, with metadata.json furnishing scenario IDs, sensor specs, and calibration matrices required for coordinate transformations and modality alignment. All files follow strict timestamp and sensor-filename conventions; ongoing updates and post-processing scripts are maintained in the official repository (Alkhateeb et al., 2022).
The DeepSense-6G Scenario-33 dataset constitutes a comprehensive, precisely time-synchronized foundation for multimodal beam prediction and 6G ISAC, supplying research-grade signal, calibration, and annotation conventions. All coordinate systems, sensor interfaces, and task protocols are exhaustively documented as per the dataset papers and code portals (Demirhan et al., 2021, Alkhateeb et al., 2022, Orimogunje et al., 3 Jan 2026).