SUMO Traffic Microsimulator Overview
- SUMO is an open-source traffic microsimulator that models individual agents using advanced car-following, lane-changing, and signal-control models.
- Advanced derivatives like QarSUMO leverage meta-parallelism and virtual grouping to improve scalability and reduce simulation wall-clock time.
- SUMO underpins urban planning and reinforcement learning research by enabling high-fidelity, agent-based simulations of complex, real-world traffic networks.
The Simulation of Urban MObility (SUMO) traffic microsimulator is a high-performance, open-source platform designed for the detailed, time-accurate simulation of individual agents (vehicles, bicycles, pedestrians) moving over arbitrary road networks. SUMO implements state-of-the-art car-following, lane-change, and signal-control models. It is a core tool in transportation research, urban planning, reinforcement learning for traffic management, evaluation of connected/autonomous vehicles, and digital-twin development. SUMO and its advanced parallel derivatives such as QarSUMO have become de facto computational substrates for both city-scale macroscopic analyses and agent-level traffic experiments.
1. Core Architecture and Simulation Loop
SUMO operates on a microscopic paradigm in which every agent in the network is explicitly simulated on a discrete time axis (default step length: 1 s). The fundamental network data structures comprise:
- Junctions (nodes): Optionally endowed with traffic-light logic.
- Directed Edges: Each partitioned into lanes, where each lane holds a dynamic array of vehicles.
- Vehicle objects: Each vehicle stores its instantaneous position $x_{i,t}$, speed $v_{i,t}$, acceleration $a_{i,t}$, desired speed $v_i^{\max}$, and a routing plan.
At each simulation step $t$, SUMO processes all vehicles in four phases:
- PlanMove: Per-vehicle longitudinal dynamics via the Car-Following Model (CFM), which computes the next acceleration from the vehicle's own speed, its gap to the leader, and the leader's speed: $a_{i,t+1} = f\bigl(v_{i,t}, \Delta x_{i,t}, v_{i-1,t}\bigr)$.
- SetJunctionApproaches: Marking vehicles approaching junctions for signal and priority handling.
- ExecuteMovement: Position and speed updates, junction conflict resolution.
- ChangeLanes: Lateral movement via Lane-Changing Models (LCM), potentially:
$\mathds{1}_{\mathrm{LC}}(i,t+1) = g\bigl(a_{i,t}, a_{i-1,t}, a_{i+1,t}\bigr)$, where the lane-change decision for vehicle $i$ depends on the accelerations of the vehicle itself and its neighbors $i-1$ and $i+1$.
Driver randomness (improper reactions, random slowdowns) can be injected but is typically disabled for precise experiments.
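The per-step pipeline above can be sketched in a minimal, self-contained form. The following is an illustrative single-lane version with a Krauss-style safe-speed car-following rule; all names, parameters, and the 5 m vehicle length are assumptions for the example, not SUMO's internal API:

```python
from dataclasses import dataclass

DT = 1.0  # default SUMO step length in seconds

@dataclass
class Vehicle:
    pos: float            # position along lane (m)
    speed: float          # current speed (m/s)
    v_max: float          # desired speed (m/s)
    accel: float = 2.6    # max acceleration (m/s^2)
    decel: float = 4.5    # max comfortable deceleration (m/s^2)

def plan_move(veh, leader):
    """CFM: choose next speed from own state plus gap and speed of the leader."""
    v_free = min(veh.v_max, veh.speed + veh.accel * DT)
    if leader is None:
        return v_free
    gap = leader.pos - veh.pos - 5.0  # assume 5 m vehicle length
    # Krauss-style safe speed: never exceed what the gap allows
    v_safe = leader.speed + (gap - leader.speed * DT) / (
        (leader.speed + veh.speed) / (2 * veh.decel) + DT)
    return max(0.0, min(v_free, v_safe))

def step(lane):
    """One step: plan all moves from the current state, then execute them."""
    lane.sort(key=lambda v: v.pos)                      # follower -> leader order
    planned = [plan_move(v, lane[i + 1] if i + 1 < len(lane) else None)
               for i, v in enumerate(lane)]
    for v, v_next in zip(lane, planned):                # ExecuteMovement
        v.speed = v_next
        v.pos += v.speed * DT

lane = [Vehicle(pos=0.0, speed=10.0, v_max=15.0),
        Vehicle(pos=40.0, speed=5.0, v_max=15.0)]
for _ in range(10):
    step(lane)
```

The two-phase split (plan from the frozen state, then execute) mirrors the PlanMove/ExecuteMovement separation: every vehicle's decision is based on the same consistent snapshot of the timestep.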
2. High-Level Parallelization in QarSUMO
QarSUMO introduces meta-parallelism into SUMO by wrapping unmodified SUMO instances in an MPI/C++ supervisor harness, using Libsumo bindings for inter-process orchestration. The key steps are:
- Network Vertex Partitioning: The junction graph is partitioned via METIS into load-balanced subsets. Border junctions are duplicated (“primary” vs “shadow” node copies) for state consistency. A traffic-aware partitioning is supported:
with edge weights $w_e = q_e \, l_e$, where $q_e$ is the expected edge flow and $l_e$ is the edge length.
- Process Layout and Communication: Each partition runs a SUMO process. At every timestep end, partitions:
- Extract and pack state data for vehicles on cut (border) edges.
- Exchange updates via MPI_Alltoall.
- Insert/update vehicles crossing partitions or as shadow vehicles.
- Shadow Vehicle Semantics: Vehicle state is redundantly simulated near partition borders, with shadow copies ensuring upstream traffic influences propagate correctly. Shadow vehicles are expunged only when their paths diverge from the relevant partition.
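The end-of-timestep exchange above can be illustrated with a toy two-partition example. The data structures and function names here are assumptions for illustration, not QarSUMO's actual implementation, and the MPI_Alltoall is replaced by a direct swap so the sketch is self-contained:

```python
def pack_border_state(partition, border_edges):
    """Extract and pack state for vehicles currently on cut (border) edges."""
    return [
        {"id": vid, "edge": edge, "pos": pos, "speed": speed}
        for vid, (edge, pos, speed) in partition["vehicles"].items()
        if edge in border_edges
    ]

def apply_updates(partition, updates):
    """Insert/update shadow copies of vehicles received from a neighbor."""
    for u in updates:
        partition["shadow"][u["id"]] = (u["edge"], u["pos"], u["speed"])

# Two partitions sharing border edge "E3"
p0 = {"vehicles": {"v1": ("E1", 12.0, 9.0), "v2": ("E3", 3.0, 7.5)}, "shadow": {}}
p1 = {"vehicles": {"v3": ("E3", 40.0, 6.0), "v4": ("E5", 8.0, 11.0)}, "shadow": {}}
border = {"E3"}

# End-of-timestep exchange (in QarSUMO, an MPI_Alltoall across all partitions)
msg_0_to_1 = pack_border_state(p0, border)
msg_1_to_0 = pack_border_state(p1, border)
apply_updates(p1, msg_0_to_1)
apply_updates(p0, msg_1_to_0)
```

Only vehicles on cut edges are packed, which is why the border-edge ratio (discussed below) directly controls communication volume.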
Parallel Scalability
- Empirical speedups: On a 32-core EC2 instance, across real and synthetic networks, QarSUMO achieved 1.98×–23.05× speedup (Grid), 5.20×–14.60× (Cologne), and 1.98×–5.70× (Corniche) compared to baseline SUMO.
- Border-edge ratio and communication overhead are tracked: e.g., Corniche's border-edge ratio increases from 0.43% (2 partitions) to 8.00% (32 partitions).
- Accuracy impacts are minor, with relative trip-time errors staying within 5.46% (Corniche) and 1.94% (Grid) even at 32 partitions.
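A quick sanity check on these figures: parallel efficiency is simply speedup divided by partition count, computed here from the maximum speedups quoted above.

```python
# Parallel efficiency implied by the reported maximum speedups at 32 partitions
max_speedup = {"Grid": 23.05, "Cologne": 14.60, "Corniche": 5.70}
parts = 32
efficiency = {net: s / parts for net, s in max_speedup.items()}
# Regular networks (Grid, ~72%) stay reasonably efficient; irregular ones
# (Corniche, ~18%) saturate much earlier.
```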
3. Congestion-Optimized Simulation in QarSUMO
QarSUMO introduces a congestion-modeled computational speedup by grouping stop-and-go vehicles:
- Virtual Grouping Algorithm:
- Each lane is divided into an exit zone (10% of the lane length, capped at 50 m) and a number of equally sized group zones.
- For each zone $z$, if every vehicle's speed falls below a threshold fraction of the lane speed limit $v_{\max}$ (a threshold of 0 would correspond to fully stopped traffic), form a "congested" group.
- The group leader executes the CFM/LCM logic; followers inherit the leader's kinematics, suppressing per-vehicle computation.
- Group Dissolution: Triggered when the leader's speed rises back above the threshold or the group enters the exit zone; the lane is then immediately re-partitioned.
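The zone-splitting and group-formation steps can be sketched as follows. The constants (speed-limit fraction, zone count) and function names are illustrative assumptions, not QarSUMO's exact values:

```python
SPEED_LIMIT = 13.9           # lane speed limit (m/s)
CONGESTION_FRACTION = 0.1    # assumed threshold: "congested" below 10% of limit

def split_zones(lane_length, n_group_zones):
    """Exit zone = min(10% of lane, 50 m) at the downstream end; the rest is
    divided into equal group zones. Returns zone bounds and exit-zone start."""
    exit_len = min(0.1 * lane_length, 50.0)
    group_len = (lane_length - exit_len) / n_group_zones
    bounds = [i * group_len for i in range(n_group_zones + 1)]
    return bounds, lane_length - exit_len

def form_groups(vehicles, bounds):
    """vehicles: list of (pos, speed) tuples. Returns per-zone groups of
    congested vehicles; the front-most member would act as leader."""
    groups = []
    for lo, hi in zip(bounds, bounds[1:]):
        zone = [(p, v) for p, v in vehicles if lo <= p < hi]
        if zone and all(v < CONGESTION_FRACTION * SPEED_LIMIT for _, v in zone):
            groups.append(zone)  # leader = max-pos member; followers copy it
    return groups

vehicles = sorted([(5.0, 0.3), (12.0, 0.8), (60.0, 12.0), (150.0, 0.5)])
bounds, exit_start = split_zones(200.0, n_group_zones=3)
groups = form_groups(vehicles, bounds)
```

On this 200 m lane the two crawling vehicles near the entrance form one group, the free-flowing vehicle at 60 m prevents grouping in its zone, and the stopped vehicle at 150 m forms a singleton group.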
Measured results:
- 1.93× (Corniche) and 2.26× (Grid) simulation-time speedup under heavy congestion.
- Trip-time error remains within 2.98% (Corniche) and 6.22% (Grid); traveled-distance errors are similarly small.
4. Performance, Scaling, and Accuracy
QarSUMO’s overall performance depends on both the degree of network regularity and traffic congestion properties:
| Network | Max speedup (32 partitions) | Max border-edge % | Max MPI message (MB) | Max trip-time error (%) |
|---|---|---|---|---|
| Grid | 23.05× | 11.02 | 828 | 1.94 |
| Cologne | 14.60× | — | — | — |
| Corniche | 5.70× | 8.00 | 828 | 5.46 |
Combined (meta-parallel + grouping): for a 1-hour simulated Corniche scenario, baseline SUMO takes 1 h of wall-clock time, QarSUMO with parallelism alone 0.31 h, and full QarSUMO 0.22 h (≈4.5× overall acceleration).
Total parallel efficiency is ultimately limited by:
- Load imbalance on irregular/small networks, saturating beyond 8–16 partitions.
- Border shadow simulation, which introduces roughly 0.5 s of latency per border crossing.
- Communication volume grows with the number of partitions and the vehicle count (e.g., message sizes from 27 MB to 828 MB), but the MPI time fraction typically falls as vehicle count rises.
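These limiting factors can be summarized in a toy Amdahl-style model: interior work parallelizes perfectly, while the border/communication share does not. The border fractions per partition count below are modeling assumptions in the spirit of the Corniche numbers quoted earlier, not measured QarSUMO data.

```python
def modeled_speedup(n_parts, border_fraction):
    """Amdahl-style bound: perfectly parallel interior work plus a
    non-parallel border/communication share that grows with the cut size."""
    return 1.0 / (border_fraction + (1.0 - border_fraction) / n_parts)

# Assumed border fractions growing with partition count
results = {n: modeled_speedup(n, frac)
           for n, frac in [(2, 0.0043), (8, 0.03), (32, 0.08)]}
```

The model reproduces the qualitative trend: speedup keeps rising with partition count, but per-core efficiency falls as the border share grows.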
5. Compatibility, Extendability, and Deployment
- Kernel Compatibility: QarSUMO operates via unmodified SUMO binaries, with all parallel logic externalized in the MPI/C++ driver using Libsumo.
- Forward compatibility: All improvements to SUMO’s core (e.g., multi-threaded simulation kernels) are inherited by QarSUMO automatically.
- Multi-node scaling: The use of MPI meta-parallelism makes cluster/cloud deployment and city-scale simulation routine, provided the network is partitioned well and load remains balanced.
- Future directions: Prospective advances include dynamic load rebalancing, spatially/temporally adaptive step-sizes, and advanced shadow-vehicle synchronization mechanisms for enhanced fidelity.
6. Limitations and Open Research Questions
QarSUMO’s design introduces quantifiable trade-offs:
- Parallel speedup limiters: Non-uniform partition load (especially on real-world, small or irregular subnetworks like Corniche) creates diminishing returns at high core counts.
- Shadow vehicle latency: Border-handling can accumulate small, systematic errors over long simulation periods, though the observed empirical impact stays below 6% even at 32 partitions.
- Congestion-grouping drift: Under certain conditions, e.g., fluctuating stop-and-go patterns, temporally misgrouped vehicles may exhibit brief unphysical behavior. Minimizing this drift and error is an active research topic.
- No in-situ parallelization of SUMO kernel: Speedup relies on meta-parallel partition logic; “true” per-agent multi-threading in the core would surpass this architecture if/when available.
7. Implications for Reinforcement Learning and Urban Computing
By drastically reducing simulation wall-clock time—especially for large, congested, or RL-intensive workloads—QarSUMO enables:
- Accelerated policy optimization: RL can routinely run tens/hundreds of thousands of episodes on large networks within feasible timeframes; trip/route losses remain within strict acceptability bounds.
- Integration into digital twins: Near-real-time online simulation for large regions is now practical.
- Scalability for city-scale synthetic and real data experiments: Selective tuning of communication/computation split and partition granularity allows practitioners to optimize throughput for batch vs. interactive regimes.
QarSUMO establishes a structural template for parallel microsimulation compatible with ongoing/anticipated improvements to base SUMO and provides a robust foundation for urban-scale, real-time, and RL-driven traffic research (Chen et al., 2020).