Mixed Autonomy Traffic Simulation

Updated 4 July 2026

Mixed autonomy traffic simulation is a modeling framework that integrates autonomous, connected, and human-driven vehicles with distinct sensing, communication, and control capabilities.
It employs multi-scale methods—from microscopic POMDPs and decentralized control to macroscopic PDE formulations—to capture complex vehicle interactions in diverse traffic scenarios.
Simulation studies demonstrate practical improvements in safety, throughput, and fuel efficiency, with reported metrics showing up to 90% delay reductions and significant energy savings.

Mixed autonomy traffic simulation denotes the modeling, control, and evaluation of traffic systems in which autonomous vehicles, connected and automated vehicles, connected human-driven vehicles, and conventional human-driven vehicles coexist under heterogeneous sensing, communication, and actuation capabilities. The topic spans microscopic vehicle-interaction models, mesoscopic and macroscopic flow models, dynamic traffic assignment, and closed-loop policy learning, and it is motivated by the fact that partial control and observation, multi-vehicle interactions, and the variety of real-world networks make near-term autonomy impacts difficult to analyze with simple rule-based simulators alone (Rahmani et al., 14 Apr 2026, Wu et al., 2017).

1. Conceptual scope and traffic entities

A central feature of mixed-autonomy simulation is that different vehicle classes are not merely assigned different parameters; they are often embedded in different information and control channels. In one signalized urban-intersection formulation, autonomous vehicles (AVs) are connected and controllable, and AVs that become lead vehicles in their lane are told exactly when they may enter the merge zone, whereas human-driven vehicles (HDVs) cannot receive timing commands and are controlled using conventional traffic lights (Ghosh et al., 2021). In an unsignalized intersection formulation, the vehicle classes are connected and automated vehicles (CAVs), connected human-driven vehicles (CHVs), and un-connected human-driven vehicles (HVs), with different right-of-way and execution assumptions (Zhou et al., 2022).

The term “mixed traffic” is sometimes used in a broader, lane-free sense. The two-dimensional Mixed Traffic Model treats vehicles including bikes, rickshaws, cars, and AVs as self-driven particles moving on a two-dimensional plane of width $W_{\rm road}$, with longitudinal and lateral dynamics coupled through interaction forces (Kanagaraj et al., 2018). This differs from lane-based microsimulation but addresses a setting in which drivers do not follow lane discipline and interactions resemble disordered self-driven particle systems. A plausible implication is that mixed-autonomy traffic simulation is not confined to standard freeway and intersection geometry; it also includes shared-space and weak-lane-discipline traffic states when those states are relevant to deployment contexts.

Mixed autonomy can also be institutional rather than purely kinematic. In the hybrid traffic-laws study, a roadway contains HDVs, CAVs, and buses, and the control problem is to decide whether a vehicle may enter a restricted lane via static or dynamic access rules (Kraicer et al., 18 Feb 2025). In that setting, mixed-autonomy simulation includes regulatory asymmetry: CAVs obey centrally broadcast rules instantly, whereas HDVs follow normal SUMO rules. This suggests that the subject includes not only behavior modeling and control synthesis but also the simulation of differentiated legal or operational regimes.

2. Modeling scales and mathematical formalisms

At the microscopic level, mixed-autonomy traffic is frequently cast as a partially observed sequential decision problem. Flow formulates mixed-autonomy traffic control as an episodic POMDP,

$\bigl(\mathcal S,\mathcal A,\mathcal P,\mathcal O,\mathcal O\!b,\rho_0,\gamma,T\bigr),$

with state variables given by all vehicles’ longitudinal positions and velocities, actions given by the accelerations commanded to the autonomous vehicles, human-driven vehicles evolving under the IDM, and rewards based on average velocity minus an acceleration penalty (Wu et al., 2017). For intersections with multiple controlled vehicles, the formulation is often decentralized: one reinforcement-learning study models mixed autonomy as a decentralized partially observable MDP with a factorized joint policy

$\pi_\theta(s)=\prod_i \pi^i_\theta(o^i),$

where each AV uses local, intersection-centric observations and outputs a discrete acceleration (Yan et al., 2021).
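The human-driver model and reward described above can be sketched as follows. This is a minimal illustration, not Flow's implementation: the function names, the IDM parameter defaults, and the penalty weight are assumptions chosen for readability.

```python
import math

def idm_accel(v, gap, dv, v0=30.0, T=1.0, a=1.0, b=1.5, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration for one follower.

    v: follower speed (m/s); gap: bumper-to-bumper gap to the leader (m);
    dv: approach rate v - v_leader (m/s). Defaults are illustrative values.
    """
    # Desired dynamic gap: standstill gap plus headway and braking terms.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a * b)))
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

def reward(speeds, av_accels, penalty=0.1):
    """Average velocity minus a penalty on commanded AV accelerations."""
    mean_speed = sum(speeds) / len(speeds)
    return mean_speed - penalty * sum(abs(u) for u in av_accels)
```

In a rollout under these assumptions, every human-driven vehicle would be advanced with `idm_accel`, while the learned policy supplies `av_accels` for the controlled vehicles only.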

Microscopic formulations also include explicit vehicle-dynamics and interaction models beyond longitudinal car-following. The two-dimensional Mixed Traffic Model decomposes acceleration into self-driving, vehicle-vehicle interactions, and boundary forces, and it collapses to the original one-dimensional car-following model when the lateral gaps vanish while recovering an anisotropic social-force-type limit at very low speeds (Kanagaraj et al., 2018). At urban junctions, a decentralized bi-level framework uses a nonlinear kinematic bicycle model with state $\mathbf{x}=[x,y,\theta,v]^T$ and control $\mathbf{u}=[a,\delta]^T$, then combines graph-search reference generation with local MPC tracking (Rahmani et al., 29 Jul 2025).
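A kinematic bicycle model with the state and control named above can be stepped forward as in the sketch below. The rear-axle reference point, forward-Euler integration, wheelbase value, and 0.1 s step are assumptions for illustration; the cited framework's exact discretization is not reproduced here.

```python
import math

def bicycle_step(state, control, dt=0.1, wheelbase=2.7):
    """One forward-Euler step of a rear-axle kinematic bicycle model.

    state = (x, y, theta, v); control = (a, delta), where a is longitudinal
    acceleration and delta is the front steering angle in radians.
    """
    x, y, theta, v = state
    a, delta = control
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    theta_next = theta + (v / wheelbase) * math.tan(delta) * dt
    v_next = max(0.0, v + a * dt)  # assumption: no reversing
    return (x_next, y_next, theta_next, v_next)
```

A tracking controller would call such a step repeatedly inside its prediction horizon, linearizing around the reference when a convex formulation is required.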

At the macroscopic and mesoscopic levels, mixed-autonomy simulation introduces class-specific flow representations. One platooning framework couples an LWR-type PDE for traffic density with an ODE for the downstream end of a CAV platoon, thereby treating the platoon as a moving bottleneck whose trajectory influences cell flux in a Cell-Transmission discretization (Zou et al., 2024). Another study models HV and AV densities and speeds through an extended Aw-Rascle-Zhang formulation consisting of coupled $4\times4$ hyperbolic PDEs, with ramp metering entering as the boundary actuation mechanism (Zhang et al., 17 Nov 2025). At the urban-network scale, an extended multi-commodity store-and-forward model tracks HDVs and CAVs on each link and cycle, with a dynamic saturation rate

$s_{(z,k)}=\frac{X_{(z,k)}}{\phi_{(z,k)}}$

that depends on the local queue composition through CAV and HDV discharge headways (Haris et al., 8 May 2026).
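The Cell-Transmission discretization of an LWR-type model mentioned above can be sketched as a demand-supply update. The triangular fundamental diagram, the parameter values, and the optional `cap` argument (a crude stand-in for a moving bottleneck that limits flux at chosen cell interfaces) are assumptions of this sketch, not the cited platooning model.

```python
def ctm_step(rho, dt, dx, inflow=0.0, cap=None,
             v_free=30.0, w=6.0, rho_jam=0.15):
    """One Cell-Transmission update of cell densities rho (veh/m).

    inflow: upstream demand (veh/s); cap: optional {interface_index: max_flux}
    limiting the flux entering cell interface_index.
    Stability requires v_free * dt <= dx.
    """
    q_max = v_free * w * rho_jam / (v_free + w)  # capacity of the triangle
    demand = [min(v_free * r, q_max) for r in rho]
    supply = [min(w * (rho_jam - r), q_max) for r in rho]
    n = len(rho)
    flux = [0.0] * (n + 1)
    flux[0] = min(inflow, supply[0])
    for i in range(1, n):
        flux[i] = min(demand[i - 1], supply[i])
    flux[n] = demand[n - 1]  # assumption: free outflow downstream
    for i, limit in (cap or {}).items():
        flux[i] = min(flux[i], limit)
    return [rho[i] + dt / dx * (flux[i] - flux[i + 1]) for i in range(n)]
```

With uniform density and matching inflow the state is stationary; capping one interface makes density build upstream of it and drop downstream, which is the qualitative effect a slow platoon has on surrounding flow.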

Network-wide equilibrium formulations add route-choice heterogeneity to mixed-autonomy simulation. A multiclass simulation-based dynamic traffic assignment model assumes that CAVs follow system optimal principles with rerouting capability while HDVs follow user equilibrium principles, with link costs read from SUMO and surrogate marginal travel times used for CAV routing (Mehrabani et al., 2023). A later equilibrium framework for ride-hailing mixed fleets formulates interactions among travelers, traffic, and representative ride-hailing companies as a nonlinear complementarity problem equivalent to a variational inequality, with capped waiting time and distinct AV/HV behavior during pickup and service stages (Hou et al., 11 Dec 2025).
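As a worked illustration of the two routing principles, assume a BPR-type link travel-time function. This closed form is an assumption made here for concreteness; the cited study reads link costs from SUMO and uses surrogate marginal travel times. With free-flow time $t_a^0$, capacity $c_a$, flow $x_a$, and shape parameters $\alpha,\beta$ (all hypothetical symbols), the experienced and marginal travel times are

$t_a(x_a)=t_a^0\Bigl[1+\alpha\bigl(x_a/c_a\bigr)^{\beta}\Bigr],\qquad \tilde t_a(x_a)=t_a(x_a)+x_a\,\frac{dt_a}{dx_a}=t_a^0\Bigl[1+\alpha(1+\beta)\bigl(x_a/c_a\bigr)^{\beta}\Bigr].$

Under these assumptions, HDVs following user equilibrium compare routes by $t_a$, whereas CAVs following system optimum compare routes by $\tilde t_a$, which adds the delay that one more vehicle imposes on the others. With the commonly used values $\alpha=0.15$ and $\beta=4$, the congestion term in $\tilde t_a$ is five times the congestion term in $t_a$.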

3. Control, coordination, and policy synthesis

A major branch of the literature treats mixed-autonomy traffic simulation as an online scheduling and control problem. For a four-way urban intersection, one optimization-based framework indexes each vehicle by lane and position, introduces binary precedence variables over vehicles on conflicting lanes, and minimizes a composite objective that combines energy-cost increase, sum of individual delays, merge-entry speed loss, and total intersection clear-out time:

$\Phi(b)=w_1\cdot(\text{energy\_cost\_increase})+w_2\cdot(\text{sum\_of\_individual\_delays})+w_3\cdot\sum_{i,j}(U_{\max}-v_{i,j}(t_{i,j}^{in}))+w_4\cdot\max_{i,j}t_{i,j}^{out}.$

Because full resequencing is intractable online, the method relaxes the original problem by resequencing a newly arrived vehicle only among vehicles in the conflicting lanes; the resulting algorithm has worst-case $O(n)$ complexity per arrival and finds the exact minimizer of the relaxed problem (Ghosh et al., 2021).
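The composite objective and the idea of re-inserting only the newly arrived vehicle can be sketched as below. The function names, the unit weights, and the caller-supplied `evaluate` callback are hypothetical; the sketch performs one cost evaluation per candidate slot and does not reproduce the cited algorithm's bookkeeping or its complexity guarantee.

```python
def composite_cost(energy_increase, delays, entry_speeds, exit_times,
                   u_max, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms of the objective Phi(b)."""
    w1, w2, w3, w4 = weights
    speed_loss = sum(u_max - v for v in entry_speeds)
    return (w1 * energy_increase + w2 * sum(delays)
            + w3 * speed_loss + w4 * max(exit_times))

def best_insertion(sequence, new_vehicle, evaluate):
    """Try the new vehicle in every slot of an existing crossing order.

    sequence: current order of vehicles on the conflicting lanes;
    evaluate: hypothetical callback mapping a candidate order to its cost.
    Returns the cheapest candidate order and its cost.
    """
    best_seq, best_cost = None, None
    for k in range(len(sequence) + 1):
        candidate = sequence[:k] + [new_vehicle] + sequence[k:]
        cost = evaluate(candidate)
        if best_cost is None or cost < best_cost:
            best_seq, best_cost = candidate, cost
    return best_seq, best_cost
```

The existing relative order is never disturbed, which is what keeps the search linear in the number of candidate slots.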

Unsignalized intersection work often combines heuristic right-of-way allocation with low-level tracking control. The heuristic priority-queue-based right-of-way allocation algorithm maintains per-lane heaps and grants right-of-way based on arrival order, conflict structure, and vehicle type, permitting a CAV to have at most one conflict with a granted vehicle while requiring CHVs to have zero conflicts. The lower level then assigns one of four control modes (car-following, cruise, waiting, or conflict-solving) and executes longitudinal control with MPC (Zhou et al., 2022). This architecture explicitly separates admission logic from trajectory execution.
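The admission logic can be sketched with per-lane heaps as follows. The data layout, the function name, and the choice to treat every non-CAV like a CHV (zero conflicts allowed) are assumptions of this sketch rather than details taken from the cited algorithm.

```python
import heapq

def allocate_row(lane_queues, conflicts, granted):
    """Grant right-of-way to eligible lane leaders in arrival order.

    lane_queues: {lane: heap of (arrival_time, vehicle_id, vehicle_type)}
    conflicts:   {lane: set of lanes whose paths cross this lane}
    granted:     {vehicle_id: lane} for vehicles already holding
                 right-of-way; updated in place.
    Returns the ids granted in this call.
    """
    heads = [(queue[0], lane) for lane, queue in lane_queues.items() if queue]
    newly_granted = []
    for (arrival, vid, vtype), lane in sorted(heads):
        n_conflicts = sum(1 for g_lane in granted.values()
                          if g_lane in conflicts[lane])
        # A CAV tolerates one conflicting granted vehicle; others tolerate none.
        limit = 1 if vtype == "CAV" else 0
        if n_conflicts <= limit:
            heapq.heappop(lane_queues[lane])
            granted[vid] = lane
            newly_granted.append(vid)
    return newly_granted
```

Vehicles that are not granted stay at the head of their lane heap and are reconsidered on the next call, once a conflicting vehicle has cleared and been removed from `granted`.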

Control-barrier-function formulations emphasize safety and recursive feasibility. In one signalized-intersection controller with right turn on red permitted, both CAVs and HDVs are modeled as longitudinal double integrators, HDVs follow IDM, and each CAV solves a one-dimensional quadratic program that minimizes deviation from an LQR reference while enforcing control limits, speed bounds, rear-end safety, and crossing-time constraints derived from smart traffic-light intervals (Tzortzoglou et al., 17 Jun 2025). The paper shows that the intersection of all control-bounds sets is always nonempty, so the QP remains solvable at every time step.
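Because the decision variable is a single acceleration, a QP of this kind reduces to projecting the reference input onto an interval. The sketch below assumes a time-headway barrier for rear-end safety and linear class-K functions with gain `alpha`; these barrier choices and all numeric defaults are illustrative, the crossing-time constraints are omitted, and unlike the cited controller this simplified version carries no feasibility guarantee.

```python
def cbf_qp_1d(u_ref, v, gap, v_lead, u_min=-5.0, u_max=3.0,
              v_min=0.0, v_max=15.0, tau=1.5, d_min=2.0, alpha=1.0):
    """Minimize (u - u_ref)^2 subject to interval constraints on u.

    Double-integrator follower with speed v, gap to leader `gap`,
    and leader speed v_lead.
    """
    # Rear-end barrier h = gap - tau*v - d_min; dh/dt + alpha*h >= 0
    # gives u <= (v_lead - v + alpha*h) / tau.
    h_gap = gap - tau * v - d_min
    upper = min(u_max,
                alpha * (v_max - v),                 # keeps v <= v_max
                (v_lead - v + alpha * h_gap) / tau)  # rear-end safety
    lower = max(u_min, -alpha * (v - v_min))         # keeps v >= v_min
    if lower > upper:
        raise ValueError("constraint set is empty in this simplified sketch")
    return min(max(u_ref, lower), upper)  # projection of u_ref onto [lower, upper]
```

When the gap is large the reference passes through unchanged; as the barrier value approaches zero the safety bound becomes the active constraint and overrides the reference.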

Decentralized urban-junction models remove central controllability altogether. A bi-level framework generates kinematically feasible reference trajectories using multicriteria A* over motion primitives and tracks them with a convex QP-based MPC at $10$ Hz, while a parallel collision-avoidance module predicts other vehicles using constant velocity and steering and checks dual-circle overlaps inside a detection range (Rahmani et al., 29 Jul 2025). The framework does not require central controllability or knowledge sharing among vehicles.
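The collision-avoidance check can be sketched as a constant-speed, constant-yaw-rate rollout with a dual-circle footprint per vehicle. Circle radius, circle offset, horizon, and detection range are assumed values, and for simplicity both vehicles are predicted with the same constant-input model, whereas the cited framework tracks a planned reference for the ego vehicle.

```python
import math

def predict(pose, v, yaw_rate, t):
    """Pose (x, y, theta) after t seconds at constant speed and yaw rate."""
    x, y, theta = pose
    if abs(yaw_rate) < 1e-9:
        return (x + v * t * math.cos(theta), y + v * t * math.sin(theta), theta)
    th = theta + yaw_rate * t
    r = v / yaw_rate
    return (x + r * (math.sin(th) - math.sin(theta)),
            y - r * (math.cos(th) - math.cos(theta)), th)

def circle_centers(x, y, theta, offset):
    """Front and rear circle centres along the vehicle's heading."""
    dx, dy = offset * math.cos(theta), offset * math.sin(theta)
    return [(x + dx, y + dy), (x - dx, y - dy)]

def dual_circles_overlap(pose_a, pose_b, radius=1.2, offset=1.2):
    return any(math.hypot(ax - bx, ay - by) < 2 * radius
               for ax, ay in circle_centers(*pose_a, offset)
               for bx, by in circle_centers(*pose_b, offset))

def first_conflict_time(ego, other, horizon=3.0, dt=0.1, detection_range=50.0):
    """ego/other: (pose, v, yaw_rate). Returns first overlap time or None."""
    (ex, ey, _), (ox, oy, _) = ego[0], other[0]
    if math.hypot(ex - ox, ey - oy) > detection_range:
        return None  # outside the detection range: not considered
    for k in range(int(round(horizon / dt)) + 1):
        t = k * dt
        if dual_circles_overlap(predict(*ego, t), predict(*other, t)):
            return t
    return None
```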

Learning-based approaches replace explicit sequencing or hand-designed rules with learned coordination. In mixed-autonomy intersections, a shared decentralized policy trained with on-policy policy gradients and local observations learns to coordinate vehicles into signal-like alternating platoons without reward shaping (Yan et al., 2021). In Flow, deep RL policies with only local observation improve system-level velocity in ring, merge, and figure-eight scenarios (Wu et al., 2017). More recent work extends the learning paradigm to language-mediated cooperation: CoMAL equips each CAV with a Perception Module, Memory Module, Collaboration Module, Reasoning Engine, and Execution Module, uses round-robin textual discussion to agree on roles such as leader and follower, and executes decisions by tuning IDM parameters rather than directly commanding arbitrary control inputs (Yao et al., 2024). The reported comparison is mixed rather than uniform: CoMAL outperforms pure IDM but underperforms optimally-trained multi-agent RL on Merge, while yielding more robust role-based cooperation than MARL on Figure-Eight (Yao et al., 2024).

4. Simulation environments, software stacks, and benchmark design

SUMO remains the dominant microscopic backbone in the literature, but it is used in markedly different ways. Flow exposes SUMO through a modular gym-style interface and composes scenarios from reusable module types such as network topology, actors, observer, control law, dynamics, metrics, and initialization (Wu et al., 2017). Reinforcement learning for mixed-autonomy intersections uses SUMO with a fixed simulation step, a warmup period, and a finite episode horizon, in two-way and four-way intersection grids composed of single-lane roads (Yan et al., 2021). CoMAL also uses Flow wrapped around SUMO, but its benchmark scenarios are ring, figure-eight, and merge networks with a fixed simulator step and disabled lateral lane changes (Yao et al., 2024).

Scenario construction varies by geometry, inflow process, and control scope. Urban-intersection optimization uses four straight-through lanes with control zones and a merging zone of fixed lengths, with independent Poisson arrivals in each lane and no turns or lane changes inside control or merge zones (Ghosh et al., 2021). The mixed-autonomy intersection RL benchmark instead uses open boundaries with fixed, deterministic inflow rates across sixteen demand configurations (Yan et al., 2021). The hybrid traffic-laws study considers a corridor with a restricted lane upstream of a bottleneck and compares a non-stationary Daily_k profile with a one-hour Constant_3000 Poisson-arrival case (Kraicer et al., 18 Feb 2025).
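The two inflow processes contrasted above, independent Poisson arrivals and fixed deterministic inflow, can be generated as in this sketch. The function names and the veh/hr parameterization are assumptions; neither function reproduces a particular study's demand files.

```python
import random

def poisson_arrivals(rate_veh_per_hr, duration_s, rng):
    """Arrival times (s) of a Poisson process: exponential inter-arrivals."""
    lam = rate_veh_per_hr / 3600.0  # mean arrivals per second, must be > 0
    t, times = 0.0, []
    while True:
        t += rng.expovariate(lam)
        if t > duration_s:
            return times
        times.append(t)

def deterministic_arrivals(rate_veh_per_hr, duration_s):
    """Arrival times (s) with a constant headway of 3600 / rate seconds."""
    headway = 3600.0 / rate_veh_per_hr
    count = int(duration_s // headway)
    return [headway * (k + 1) for k in range(count)]

# Independent streams per lane: one generator per lane, seeded separately.
lanes = ["north", "east", "south", "west"]
demand = {lane: poisson_arrivals(600.0, 3600.0, random.Random(seed))
          for seed, lane in enumerate(lanes)}
```

Seeding each lane separately keeps the streams independent and makes a scenario reproducible across controller comparisons.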

Highway and network benchmarks introduce additional scales. The Dyna-style platooning study models a roadway with a three-lane to two-lane bottleneck in SUMO, discretizes it into cells, and injects a CAV platoon at a fixed time during the run (Zou et al., 2024). The MILP-based urban-network controller uses Aimsun Next through an API on a grid of single-lane links and signalized intersections, with a fixed cycle time and mixed CAV/HDV split scenarios (Haris et al., 8 May 2026). The long-duration-autonomy CBF study validates its controller through extensive simulations in MATLAB for a four-way signalized intersection with dedicated left-turn lanes and a traffic-light region of fixed length (Tzortzoglou et al., 17 Jun 2025).

Benchmarking increasingly draws on trajectory datasets as well as classical simulators. The survey identifies NGSIM, highD, inD, rounD, pNEUMA, and exiD as public drone-trajectory datasets, and Waymo Open Motion, Argoverse 1 and 2, nuScenes, Lyft Level 5, INTERACTION, and nuPlan as broader motion-forecasting or closed-loop planning datasets (Rahmani et al., 14 Apr 2026). DRIFT calibrates HV priors from highD, rounD, inD, and exiD, then evaluates closed-loop mixed-autonomy generation in Flow plus SUMO on ring road, figure-eight, and open-merge scenarios across several AV penetration levels (Yu et al., 15 Jun 2026).

5. Evaluation protocols and empirical findings

The evaluation literature distinguishes open-loop prediction metrics from closed-loop traffic metrics. Open-loop metrics include ADE, FDE, multi-modal minADE$_k$, minFDE$_k$, miss rate, negative log-likelihood, and kinematic feasibility. Closed-loop metrics include collision rate, post-encroachment time, route completion, goal achievement, jerk, lateral-acceleration exceedance, rule-compliance rates, and composite closed-loop scores (Rahmani et al., 14 Apr 2026). Mixed-autonomy traffic-control papers add domain-specific operational metrics such as total intersection clear-out time, mean and max individual delay, mean entry speed, fuel proxies, average passenger delay, total mean queue, aggregate travel time, cumulative outflow, hard-braking counts, THW/TTC violations, and computation time per control update or vehicle arrival (Ghosh et al., 2021, Kraicer et al., 18 Feb 2025, Haris et al., 8 May 2026, Yu et al., 15 Jun 2026).
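The open-loop displacement metrics can be computed as in the following sketch. It assumes 2-D waypoints sampled at matching time steps, takes the minimum ADE and minimum FDE independently over the candidate set, and uses an illustrative 2 m miss threshold; individual benchmarks differ on these conventions.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory of (x, y) waypoints."""
    dists = [math.dist(p, q) for p, q in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def min_ade_fde(candidates, truth):
    """Best ADE and best FDE over k candidate trajectories."""
    errors = [displacement_errors(c, truth) for c in candidates]
    return min(e[0] for e in errors), min(e[1] for e in errors)

def miss_rate(cases, threshold=2.0):
    """Share of cases whose best final-point error exceeds the threshold.

    cases: list of (candidates, truth) pairs.
    """
    misses = sum(1 for cands, truth in cases
                 if min_ade_fde(cands, truth)[1] > threshold)
    return misses / len(cases)
```

These quantities score a predictor against logged trajectories only; they do not capture how other agents would react, which is why the closed-loop metrics listed above are reported separately.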

At intersections, partial autonomy can already produce large gains. In the optimization-based urban-intersection study, numerical evaluation over repeated simulation runs shows reductions in total clear-out time at more than one AV penetration level, mean delay dropping from hundreds of seconds under FIFO to tens of seconds under the proposed method even in mixed traffic, lower max delay in all tested mixes, and limited total-time loss under position-estimation noise (Ghosh et al., 2021). The mixed-autonomy intersection RL study reports that partial AV penetration achieves outflow close to the Oracle's across all inflow configurations in both the two-way and four-way scenarios, supporting the claim that best performance occurs at intermediate rather than full AV penetration (Yan et al., 2021).

In ring, merge, and figure-eight settings, learned controllers also improve aggregate speed with low AV share. Flow reports average-velocity improvements in a single-lane ring, a multi-lane ring, and a figure-eight intersection, in each case with AVs forming a small minority of the vehicles (Wu et al., 2017). CoMAL’s reported gains are scenario-dependent: on Flow benchmarks, average speed improves most in Merge4 relative to pure human driving, while speed standard deviation falls in several Figure-Eight and Ring settings (Yao et al., 2024).

Fuel and congestion effects are prominent in highway and network studies. In the Dyna-style platooning framework, the Dyna-Q policy achieves lower total fuel consumption than the Krauss benchmark and converges to a stable high-reward policy, while the baseline DQN fails to converge within its reported training budget (Zou et al., 2024). In the hybrid traffic-laws study, under Daily_3 demand, CAVDynamic_24 reduces average passenger delay relative to DBL (Kraicer et al., 18 Feb 2025). In the MILP-based urban-network controller, DynamicSF is compared with ConstantSF and FixedTime on total mean queue (TMQ, in veh), aggregate travel time (ATT, in h), and delay (in s/km) (Haris et al., 8 May 2026).

Safety, stability, and computational burden are increasingly evaluated jointly. The event-triggered PDE controller stabilizes mixed-autonomy freeway flow while reducing controller updates relative to continuous backstepping, and it also reduces total delay in the non-recurrent case (Zhang et al., 17 Nov 2025). DRIFT reports near-best efficiency while maintaining zero collisions, minimal THW/TTC violations, and substantially fewer hard brakes than learning-based and car-following baselines; in Merge, it records fewer hard-braking events than FollowerStopper (Yu et al., 15 Jun 2026). These results do not establish a single dominant methodology, but they do show that mixed-autonomy simulation has moved beyond throughput-only benchmarking toward multi-objective evaluation of safety, efficiency, stability, and online executability.
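The surrogate safety measures named above can be computed per follower-leader pair as sketched here. The hard-braking threshold of -3 m/s^2 and the choice to count contiguous braking episodes rather than individual time steps are assumptions; the cited studies may define these events differently.

```python
def time_headway(gap, v_follow):
    """THW (s): time for the follower to cover the current gap."""
    return gap / v_follow if v_follow > 0 else float("inf")

def time_to_collision(gap, v_follow, v_lead):
    """TTC (s): finite only while the follower is closing on the leader."""
    closing = v_follow - v_lead
    return gap / closing if closing > 0 else float("inf")

def count_hard_brakes(accels, threshold=-3.0):
    """Count contiguous episodes with acceleration at or below threshold."""
    count, braking = 0, False
    for a in accels:
        hard = a <= threshold
        if hard and not braking:
            count += 1  # a new episode starts
        braking = hard
    return count
```

A violation count would then compare each THW or TTC sample against a chosen minimum and tally the samples that fall below it.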

6. Limitations, misconceptions, and research directions

A recurrent misconception is that mixed-autonomy traffic simulation is equivalent to evaluating a fully autonomous fleet. Multiple results directly contradict that view. Flow reports system-level velocity gains with only low AV adoption in congested corridors (Wu et al., 2017), the optimization-based intersection controller shows large delay reductions at partial AV penetration (Ghosh et al., 2021), and the mixed-autonomy intersection RL study finds near-optimal throughput at intermediate shares of controlled vehicles rather than at full penetration (Yan et al., 2021). The literature therefore treats partial adoption not as a nuisance parameter but as a primary operating regime.

Another misconception is that mixed-autonomy simulation is now fully realistic. Many studies intentionally simplify communication, sensing, or geometry. The unsignalized intersection management strategy assumes ideal V2I/V2V with no delay or loss (Zhou et al., 2022). The long-duration-autonomy controller assumes perfect V2I and does not model explicit process or measurement noise (Tzortzoglou et al., 17 Jun 2025). The decentralized urban-junction model includes detection range and reaction delay but uses no explicit probabilistic uncertainty model and no driver heterogeneity model (Rahmani et al., 29 Jul 2025). CoMAL relies on textual prompts with no explicit learned embedding or sensor-noise model and disables lateral lane changes in its benchmark environments (Yao et al., 2024). The MILP-based urban-network controller notes limitations including the fixed-cycle assumption, simplified headway averaging, and abstraction of stochastic driver behaviors (Haris et al., 8 May 2026). These are not defects in themselves; they identify which mechanisms are being isolated and which are postponed.

Methodological controversy persists over the trade-off between tractability and fidelity. The intersection resequencing framework explicitly sacrifices full global optimality but retains the exact optimum within the one-lane-pair resequencing neighborhood, with a small reported empirical gap to full resequencing (Ghosh et al., 2021). The survey frames a broader version of the same issue as a scalability–fidelity–controllability trilemma, alongside a causality gap, an evaluation crisis, mixed-autonomy representation challenges, deployment gaps, data limitations and geographic bias, and the need for unified architectures that combine world models, reactive agents, and scenario generation (Rahmani et al., 14 Apr 2026). This suggests that the field is less about selecting a universally best simulator than about matching modeling assumptions, control authority, and evaluation criteria to a specified deployment question.

Current research directions increasingly emphasize closed-loop realism and heterogeneous interaction structure. DRIFT points toward executability-constrained trajectory generation with online candidate selection and risk-aware long-tail feedback (Yu et al., 15 Jun 2026). The decentralized junction framework suggests integration with SUMO, AIMSUN, or VISSIM via external APIs and calibration against real-world trajectory datasets (Rahmani et al., 29 Jul 2025). The long-duration-autonomy study identifies multi-lane and turn-sequence optimization and adaptive signal control as natural extensions (Tzortzoglou et al., 17 Jun 2025). The survey adds simulation-to-real transfer, cognitive integration, and unified evaluation protocols as core open problems (Rahmani et al., 14 Apr 2026). In aggregate, these directions indicate that mixed-autonomy traffic simulation is evolving from isolated controller testing toward an integrated discipline of behavioral modeling, closed-loop scenario generation, safety diagnostics, and network-level policy analysis.