CARLA-Round Dataset
- CARLA-Round is a simulation dataset for roundabout trajectory prediction that isolates environmental and traffic factors through 25 controlled scenarios.
- It comprises 100,000 trajectories annotated at 10 Hz, enabling benchmarking via metrics like ADE and FDE across varied weather and density conditions.
- Baseline models using LSTM, GCN, and GRU+GCN, along with sim-to-real validations, highlight the dataset's capability to drive robust autonomous vehicle research.
CARLA-Round is a systematically constructed simulation dataset for roundabout vehicle trajectory prediction research. It addresses the scarcity of multimodal, controlled, and realistically structured trajectory data for roundabout scenarios: in real-world recordings, environmental and traffic factors are entangled, which makes it hard to isolate and quantify their individual effects. By factorizing weather and traffic density, CARLA-Round enables ablation studies, benchmarking, and the development of robust prediction algorithms for autonomous vehicle systems operating in complex, unsignalized roundabout contexts (Zhou et al., 17 Jan 2026).
1. Structured Scenario Design and Experimental Factors
CARLA-Round uses the CARLA simulator to generate 25 distinct controlled scenarios by taking the Cartesian product of five canonical weather conditions and five traffic density levels, yielding an orthogonal factor matrix that real-world datasets rarely disentangle. Each scenario is instantiated on a single-lane roundabout map with four entries and exits, with vehicles spawning at each entry according to a Poisson arrival process whose rate parameter (λ) is matched to the desired Level-of-Service (LoS) grade.
Weather Conditions:
- ClearNoon: bright daylight, no clouds.
- CloudyNoon: overcast, high ambient light.
- WetNoon (Rain): light/moderate rainfall, wet pavement.
- WetSunset: light rain, low-angle sun.
- Fog: dense fog, ~30 m visibility.
Environmental states are governed through controlled precipitation rates, fog densities, solar angles, and cloud cover.
Traffic Density Levels (Highway Capacity Manual LoS):
| Level | λ [veh/min] | Average N in roundabout |
|---|---|---|
| A | 5 | ~5 |
| B | 10 | ~10 |
| C | 15 | ~15 |
| D | 20 | ~20 |
| E | 25 | ~25 |
Each of the 25 (weather, density) cells produces 4,000 vehicle trajectories (1,000 per entry/exit), yielding a total corpus of 100,000 trajectories with explicit factor annotations (Zhou et al., 17 Jan 2026).
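The factorial design and the Poisson spawning described above can be sketched in a few lines. This is a minimal illustration, not the authors' generation code: the names `scenario_grid` and `sample_spawn_times` are hypothetical, and the sketch assumes that λ applies to a single entry and that arrivals are drawn as exponential inter-arrival gaps.

```python
import itertools
import random

WEATHERS = ["ClearNoon", "CloudyNoon", "WetNoon", "WetSunset", "Fog"]
# Arrival rate per LoS grade in vehicles per minute, taken from the density table.
ARRIVAL_RATE = {"A": 5, "B": 10, "C": 15, "D": 20, "E": 25}
TRAJECTORIES_PER_CELL = 4000


def scenario_grid():
    """Return every (weather, density) cell of the 5 x 5 factorial design."""
    return list(itertools.product(WEATHERS, ARRIVAL_RATE))


def sample_spawn_times(rate_per_min, duration_s, rng):
    """Sample arrival times (seconds) of a homogeneous Poisson process.

    Assumption: the rate applies to one entry; gaps between arrivals are
    exponentially distributed with mean 60 / rate_per_min seconds.
    """
    rate_per_s = rate_per_min / 60.0
    times = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


cells = scenario_grid()
total = len(cells) * TRAJECTORIES_PER_CELL
spawns = sample_spawn_times(ARRIVAL_RATE["C"], duration_s=600, rng=random.Random(0))
print(len(cells), total, len(spawns))
```

Keeping the two factors in separate containers makes it easy to hold one fixed while sweeping the other, which is the kind of ablation the dataset is built for.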
2. Annotation Schema and Signal Content
CARLA-Round provides temporally dense (10 Hz) annotations, supporting comprehensive modeling and evaluation:
- agent_id: integer, unique per agent within a session.
- frame_id, timestamp: discrete frame index and the corresponding time in seconds.
- position: 2D, in global CARLA world coordinates (meters).
- velocity: m/s.
- acceleration: m/s².
- heading: radians, measured from the X-axis.
- bounding_box: dimensions (meters) and orientation.
- intent: categorical label.
- Coordinate System: right-handed 2D (Z ignored), metric units.
- Temporal Windows: each sample consists of 20 frames (2 s) of history and 30 frames (3 s) of prediction horizon, for a total window of 5 seconds per agent instance.
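To make the windowing concrete, the sketch below cuts one agent's 10 Hz track into history and future segments of 20 and 30 frames. The `AgentState` record and its field names are illustrative assumptions rather than the dataset's actual file format, and the sliding stride is likewise assumed, since the source does not say how windows are extracted.

```python
from dataclasses import dataclass

HZ = 10
HISTORY_FRAMES = 20  # 2 s at 10 Hz
FUTURE_FRAMES = 30   # 3 s at 10 Hz


@dataclass
class AgentState:
    """One annotated frame of one agent (hypothetical field layout)."""
    agent_id: int
    frame_id: int
    timestamp: float  # seconds
    x: float          # meters, global CARLA world frame
    y: float          # meters, global CARLA world frame
    heading: float    # radians from the X-axis


def make_windows(track, stride=1):
    """Split a time-ordered track into (history, future) pairs."""
    window = HISTORY_FRAMES + FUTURE_FRAMES
    samples = []
    for start in range(0, len(track) - window + 1, stride):
        history = track[start:start + HISTORY_FRAMES]
        future = track[start + HISTORY_FRAMES:start + window]
        samples.append((history, future))
    return samples


# Synthetic 6-second track moving along the X-axis at 5 m/s.
track = [AgentState(7, f, f / HZ, 0.5 * f, 0.0, 0.0) for f in range(60)]
samples = make_windows(track)
print(len(samples))
```

Tracks shorter than 50 frames produce no samples under this scheme, so very short traversals would need separate handling.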
3. Task Formulation and Evaluation Metrics
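The central prediction task is:
- Given the observed history of the target agent and of the surrounding agents over $T_h = 20$ frames (2 s), predict the target agent's positions over the next $T_f = 30$ frames (3 s).
Performance is assessed by the following metrics, where $\mathbf{p}_i^{t}$ and $\hat{\mathbf{p}}_i^{t}$ denote the ground-truth and predicted positions of agent $i$ at future step $t$, and $N$ is the number of evaluated agents:
- Average Displacement Error (ADE):
$$\mathrm{ADE} = \frac{1}{N\,T_f} \sum_{i=1}^{N} \sum_{t=1}^{T_f} \left\lVert \hat{\mathbf{p}}_i^{t} - \mathbf{p}_i^{t} \right\rVert_2$$
- Final Displacement Error (FDE):
$$\mathrm{FDE} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \hat{\mathbf{p}}_i^{T_f} - \mathbf{p}_i^{T_f} \right\rVert_2$$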
These metrics are consistent with established practice in vehicle trajectory prediction literature.
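As a reference point, ADE and FDE in their standard form can be computed as in the sketch below: ADE is the mean Euclidean error over all agents and future steps, and FDE is the mean error at the last predicted step. The function name `displacement_errors` and the nested-list layout are illustrative choices, not part of the dataset's codebase.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) in the units of the inputs.

    pred and truth are indexed as [agent][step] -> (x, y) and must have
    identical shapes.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same agents")
    step_errors = []
    final_errors = []
    for p_traj, t_traj in zip(pred, truth):
        if not p_traj or len(p_traj) != len(t_traj):
            raise ValueError("each trajectory pair must have equal length")
        dists = [math.dist(p, t) for p, t in zip(p_traj, t_traj)]
        step_errors.extend(dists)
        final_errors.append(dists[-1])
    ade = sum(step_errors) / len(step_errors)
    fde = sum(final_errors) / len(final_errors)
    return ade, fde


truth = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(5.0, 5.0), (5.0, 6.0), (5.0, 7.0)],
]
pred = [
    [(0.0, 0.0), (1.0, 3.0), (5.0, 4.0)],  # errors of 0, 3 and 5 meters
    [(5.0, 5.0), (5.0, 6.0), (5.0, 7.0)],  # perfect prediction
]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)
```

Because FDE looks only at the endpoint, it is typically larger than ADE for the same model, which matches the pattern in the benchmark table.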
4. Dataset Statistics and Splitting Protocol
CARLA-Round comprises 100,000 trajectories, distributed as follows:
| Split | Number of Trajectories | Per Scenario |
|---|---|---|
| Train | 70,000 | 2,800 |
| Validation | 15,000 | 600 |
| Test | 15,000 | 600 |
Splits are stratified within each (weather, density) cell, ensuring balanced coverage for experimental fidelity. Each trajectory is metadata-tagged, enabling targeted analyses across environmental and operational axes.
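A stratified 70/15/15 split of this kind can be sketched as follows. The record layout and the `stratified_split` helper are hypothetical, and the sketch assumes the split is drawn per trajectory within each cell; the source does not state whether trajectories from the same simulation run are kept together.

```python
import random
from collections import defaultdict


def stratified_split(records, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split records into train/val/test separately inside each (weather, density) cell."""
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for rec in records:
        by_cell[(rec["weather"], rec["density"])].append(rec)

    splits = {"train": [], "val": [], "test": []}
    for cell in sorted(by_cell):
        items = by_cell[cell]
        rng.shuffle(items)
        n_train = round(fractions[0] * len(items))
        n_val = round(fractions[1] * len(items))
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        splits["test"] += items[n_train + n_val:]  # remainder goes to test
    return splits


weathers = ["ClearNoon", "CloudyNoon", "WetNoon", "WetSunset", "Fog"]
densities = ["A", "B", "C", "D", "E"]
records = [
    {"traj_id": i, "weather": w, "density": d}
    for w in weathers
    for d in densities
    for i in range(4000)
]
splits = stratified_split(records)
print({name: len(part) for name, part in splits.items()})
```

Splitting inside each cell, rather than over the pooled corpus, keeps every weather and density combination equally represented in all three partitions.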
5. Baseline Architectures and Experimental Results
Benchmark experiments evaluate standard neural and graph-based models:
- LSTM: per-agent encoder–decoder over each agent's own input features, with no social pooling.
- GCN: social graph whose adjacency links agents closer than 10 m (Euclidean distance), per-agent node features, 2-layer GCN followed by an MLP predictor.
- GRU+GCN: Sequence encoding (GRU) followed by inter-agent modeling (2-layer GCN), standard decoder.
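The distance-thresholded social graph used by the graph-based baselines can be illustrated with a short sketch. Only the 10 m Euclidean threshold comes from the description above; the self-loops and the row normalization are common GCN conventions assumed here for illustration, and the helper names are hypothetical.

```python
import math

RADIUS_M = 10.0  # agents closer than this are connected


def build_adjacency(positions, radius=RADIUS_M, self_loops=True):
    """Binary adjacency matrix linking agents whose distance is below radius."""
    n = len(positions)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                adj[i][j] = 1.0 if self_loops else 0.0
            elif math.dist(positions[i], positions[j]) < radius:
                adj[i][j] = 1.0
    return adj


def row_normalize(adj):
    """Scale each row to sum to 1 so aggregation averages over neighbours."""
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


# Four agents at one timestep; the last one is far from the others.
positions = [(0.0, 0.0), (6.0, 0.0), (6.0, 9.0), (40.0, 40.0)]
adj = build_adjacency(positions)
norm = row_normalize(adj)
for row in adj:
    print(row)
```

At higher densities more agents fall inside the radius, so the graph becomes denser and each node aggregates over more neighbours.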
Training regime:
- Hidden size 64, learning rate 0.001 (Adam), batch size 128, 50 epochs, early stopping on validation ADE.
Performance Table (ADE/FDE in meters by density level, averaged over weather):
| LoS Density | LSTM | GCN | GRU+GCN |
|---|---|---|---|
| A | 0.12/0.23 | 0.10/0.20 | 0.09/0.18 |
| B | 0.15/0.28 | 0.12/0.25 | 0.11/0.22 |
| C | 0.20/0.35 | 0.17/0.30 | 0.15/0.27 |
| D | 0.26/0.42 | 0.22/0.36 | 0.20/0.32 |
| E | 0.33/0.50 | 0.28/0.45 | 0.25/0.40 |
Traffic density has a strong monotonic effect on prediction difficulty: for all three models, both ADE and FDE increase at every step from LoS A to LoS E. Weather effects on error are non-monotonic; for GRU+GCN, Fog yields the greatest ADE and ClearNoon the smallest.
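The density trend can be checked directly against the table. The sketch below copies the reported ADE/FDE values, tests that errors rise strictly from LoS A to E, and computes the relative ADE reduction of GRU+GCN over the LSTM baseline; the helper names are illustrative.

```python
LEVELS = ["A", "B", "C", "D", "E"]
# (ADE, FDE) in meters per density level, copied from the table above.
RESULTS = {
    "LSTM": [(0.12, 0.23), (0.15, 0.28), (0.20, 0.35), (0.26, 0.42), (0.33, 0.50)],
    "GCN": [(0.10, 0.20), (0.12, 0.25), (0.17, 0.30), (0.22, 0.36), (0.28, 0.45)],
    "GRU+GCN": [(0.09, 0.18), (0.11, 0.22), (0.15, 0.27), (0.20, 0.32), (0.25, 0.40)],
}


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def relative_reduction(baseline, model):
    """Fractional error reduction of model with respect to baseline."""
    return (baseline - model) / baseline


trend_ok = {
    name: is_strictly_increasing([ade for ade, _ in rows])
    and is_strictly_increasing([fde for _, fde in rows])
    for name, rows in RESULTS.items()
}
ade_gain = {
    level: relative_reduction(lstm[0], hybrid[0])
    for level, lstm, hybrid in zip(LEVELS, RESULTS["LSTM"], RESULTS["GRU+GCN"])
}
print(trend_ok)
print({level: round(gain, 3) for level, gain in ade_gain.items()})
```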
6. Sim-to-Real Validation
Sim-to-real transfer is validated by fine-tuning the GRU+GCN architecture, pretrained on CARLA-Round, on the real-world drone-based rounD dataset (Krajewski et al. 2020):
- Initial ADE (no fine-tuning): 0.35 m on rounD.
- ADE after fine-tuning on 2,000 real trajectories: 0.312 m.
Fine-tuning thus lowers ADE on rounD by roughly 11% (0.35 m to 0.312 m), indicating that representations learned in the structured simulation, where environmental and operational confounders are isolated, transfer to real-world roundabout data (Zhou et al., 17 Jan 2026).
7. Access, Usage, and Reproducibility
CARLA-Round and its accompanying codebase are hosted at https://github.com/Rebecca689/CARLA-Round.
Download and Scripted Use
```bash
git clone https://github.com/Rebecca689/CARLA-Round.git
cd CARLA-Round && bash download_data.sh
```
```python
from carla_round import RoundDataset
ds = RoundDataset(root_dir='/path/to/data',
                  split='train',
                  history_frames=20,
                  pred_frames=30)
for traj, meta in ds:
    # traj: Tensor shape (N_agents, T+H, 2)
    # meta: dict with agent_ids, intents, weather, density_level
    pass
```
The dataset supports rigorous ablation, benchmarking, and development for roundabout trajectory prediction research, where exhaustive real-world observation is infeasible (Zhou et al., 17 Jan 2026).