- The paper introduces a unified five-layer architecture that integrates hardware, data, model, evaluation, and prototype components for comprehensive transportation research.
- It standardizes sensor data conversion and schema harmonization, reducing experimental setup time by 85% and cutting cross-dataset variance in model performance from 8–22% to below 3%.
- The system enables digital-twin integration and simulation-driven studies, offering robust benchmarking and seamless transfer from research to real-world deployment.
Introduction and Motivation
Ozone addresses foundational barriers in transportation research: heterogeneous multimodal sensing, divergent data schemas, inconsistent evaluation protocols, and the lack of integration across datasets, models, and evaluations. It proposes a unified five-layer architecture (Hardware, Data, Model, Evaluation, Prototype) with standardized schemas and automated interfaces, targeting reproducibility, cumulative benchmarking, and deployment-readiness. Unlike prior work that addresses only isolated aspects (e.g., data format normalization or benchmarking in simulation), Ozone organizes the entire research lifecycle under a common, extensible framework.
Figure 1: Overall architecture of the Ozone platform, highlighting the hierarchical integration of Hardware, Data, Model, Evaluation, and Prototype layers via standardized schemas and interfaces.
Layered System Design and Standardization
Hardware Layer
Ozone abstracts the hardware layer to capture both physical sensors (e.g., roadside/UAV cameras, LiDAR, radar, EEG, and eye tracking) and high-fidelity simulation platforms. The schema records detailed metadata (type, mounting geometry, output specification, reference frames) to enable principled fusion and ensure downstream calibration and synchronization. This resolves critical issues in spatial-temporal data alignment prevalent in legacy transportation datasets where sensor provenance, coordinate anchoring, and sampling cadence are undefined or inconsistent.
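The metadata categories listed above (type, mounting geometry, output specification, reference frame) can be pictured as a single typed record. The following is a minimal sketch only: the class name `SensorRecord` and every field name are hypothetical and are not taken from the Ozone schema itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SensorRecord:
    """Hypothetical hardware-layer record; all field names are illustrative."""
    sensor_id: str
    sensor_type: str                              # e.g. "camera", "lidar", "radar"
    source: str                                   # "physical" or "simulation"
    position_m: Tuple[float, float, float]        # mounting position in the reference frame
    orientation_deg: Tuple[float, float, float]   # roll, pitch, yaw
    reference_frame: str                          # e.g. "map" or "vehicle"
    sample_rate_hz: float                         # sampling cadence
    output_spec: str                              # e.g. "1920x1080 RGB"

    def __post_init__(self):
        # Reject records that would make synchronization undefined.
        if self.sample_rate_hz <= 0:
            raise ValueError("sample_rate_hz must be positive")
        if self.source not in ("physical", "simulation"):
            raise ValueError("source must be 'physical' or 'simulation'")

    @property
    def sample_period_s(self) -> float:
        """Time between consecutive samples, used when aligning streams."""
        return 1.0 / self.sample_rate_hz


roadside_cam = SensorRecord(
    sensor_id="cam-01", sensor_type="camera", source="physical",
    position_m=(0.0, 0.0, 6.0), orientation_deg=(0.0, -30.0, 90.0),
    reference_frame="map", sample_rate_hz=25.0, output_spec="1920x1080 RGB",
)
```

Making provenance, frame, and cadence mandatory fields is one way to rule out the undefined-metadata problem described above at ingestion time rather than at analysis time.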
Data Layer
The data layer defines a canonical, extensible schema supporting three data categories: (i) temporally indexed, spatially consistent multi-agent trajectory data, (ii) event data with conflict/crash context, and (iii) environmental/contextual data for scene understanding. Automated conversion pipelines ingest widely used datasets (NGSIM, highD, CitySim, UTE, etc.), perform geometric/kinematic inference to complete missing metadata (e.g., OBB corners, heading, speed, acceleration), and guarantee analytic equivalence and reproducibility across sources.
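To illustrate what geometric/kinematic inference of missing fields can look like, the sketch below derives speed, heading, and acceleration from positions by forward finite differences and computes oriented-bounding-box (OBB) corners from a center, size, and heading. The function names, the differencing scheme, and the corner ordering are assumptions made for illustration; the source does not describe the estimators Ozone's pipelines actually use.

```python
import math


def infer_kinematics(xs, ys, dt):
    """Estimate speed (m/s), heading (rad), and acceleration (m/s^2)
    from positions sampled every dt seconds, using forward differences.
    The last sample repeats the previous estimate so outputs match input length."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two samples and equal-length inputs")
    if dt <= 0:
        raise ValueError("dt must be positive")
    speeds, headings = [], []
    for i in range(n - 1):
        dx, dy = xs[i + 1] - xs[i], ys[i + 1] - ys[i]
        speeds.append(math.hypot(dx, dy) / dt)
        headings.append(math.atan2(dy, dx))
    speeds.append(speeds[-1])
    headings.append(headings[-1])
    accels = [(speeds[i + 1] - speeds[i]) / dt for i in range(n - 1)]
    accels.append(accels[-1])
    return speeds, headings, accels


def obb_corners(cx, cy, length, width, heading):
    """Corners of an oriented bounding box: front-left, front-right,
    rear-right, rear-left, with heading measured from the +x axis."""
    c, s = math.cos(heading), math.sin(heading)
    hl, hw = length / 2.0, width / 2.0
    local = [(hl, hw), (hl, -hw), (-hl, -hw), (-hl, hw)]
    # Rotate each local corner by the heading, then translate to the center.
    return [(cx + lx * c - ly * s, cy + lx * s + ly * c) for lx, ly in local]
```

Real conversion pipelines typically smooth positions before differencing, since raw finite differences amplify detection noise, especially in the acceleration estimate.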
Ozone’s trajectory schema is both minimal (unambiguous assignment of globally unique identifiers, spatial coordinates, and kinematic fields) and complete (compatible with both classical and neural models). Including risky-event data with pre-computed surrogate safety measures (time-to-collision, TTC; post-encroachment time, PET; deceleration rate to avoid a crash, DRAC; etc.) eliminates inter-study discrepancies in event definitions and computation pipelines, an ongoing source of evaluation noise in safety analysis.
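As a concrete example of a surrogate safety measure, the sketch below computes time-to-collision for a simple car-following pair. It uses the common textbook definition (bumper-to-bumper gap divided by closing speed, infinite when the follower is not closing in); whether Ozone pre-computes TTC in exactly this form is not stated in the source, and the function names are hypothetical.

```python
import math


def time_to_collision(gap_m, v_follower, v_leader):
    """TTC in seconds for a follower approaching a leader in the same lane.
    Returns math.inf when the follower is not closing the gap."""
    if gap_m < 0:
        raise ValueError("gap_m must be non-negative")
    closing_speed = v_follower - v_leader
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


def min_ttc(gaps, v_followers, v_leaders):
    """Minimum TTC over a time series, a common per-event severity summary."""
    return min(
        time_to_collision(g, vf, vl)
        for g, vf, vl in zip(gaps, v_followers, v_leaders)
    )
```

Even this small function involves choices (how the gap is measured, how non-closing cases are handled) that differ between studies, which is the kind of discrepancy a pre-computed, schema-level definition removes.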
Model and Evaluation Layers
The model layer features standardized input/output schemas, enforcing compatibility with Ozone’s data standards. The system provides a registry with baseline implementations: calibrated microscopic and macroscopic traffic-flow models, car-following and lane-changing models (IDM, MOBIL, Newell), and neural architectures (RNNs, GNNs, transformer-based prediction models). Foundation world models and risk assessment models are natively supported via interfaces for autoregressive and diffusion-based generation, with cross-domain data/modal integration.
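Of the listed baselines, the Intelligent Driver Model (IDM) is compact enough to show in full. The sketch below follows the standard published IDM formulation; the default parameter values are commonly quoted illustrative settings, not Ozone's calibrated values, and the function signature is hypothetical rather than the registry's interface.

```python
import math


def idm_acceleration(v, v_lead, gap, v0=33.3, T=1.6, a_max=0.73, b=1.67,
                     s0=2.0, delta=4.0):
    """IDM acceleration (m/s^2) of a follower.

    v, v_lead : follower and leader speeds (m/s)
    gap       : bumper-to-bumper distance to the leader (m)
    v0        : desired speed, T: desired time headway, a_max: max acceleration,
    b         : comfortable deceleration, s0: standstill gap, delta: exponent
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    dv = v - v_lead  # positive when the follower is closing in
    # Desired dynamic gap; clamped so the interaction term never goes negative.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

With a very large gap the interaction term vanishes, so a stopped vehicle accelerates at roughly `a_max` and a vehicle already at `v0` holds its speed. A standardized input/output schema would fix the units and meaning of `v`, `v_lead`, and `gap` so that calibrations are comparable across datasets.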
The evaluation layer replicates best practices from A/B testing and ML benchmarking: fixed dataset splits, version-controlled metrics, standardized evaluation hooks, and a public repository of baseline results. All major domains (trajectory prediction, traffic-flow modeling, safety analysis) are covered with suites of widely accepted metrics (ADE/FDE, RMSE, and surrogate safety measures) and fixed protocol definitions, ensuring direct and fair model comparison across studies and research groups.
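For reference, the trajectory-prediction metrics named above have simple definitions: average displacement error (ADE) is the mean Euclidean distance between predicted and ground-truth positions over the horizon, and final displacement error (FDE) is the distance at the last step. The sketch below is a minimal single-trajectory version; the function name and input layout are assumptions, not Ozone's evaluation hooks.

```python
import math


def ade_fde(pred, truth):
    """Return (ADE, FDE) in the units of the inputs.

    pred, truth: equal-length sequences of (x, y) positions,
    one entry per prediction time step.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(errors) / len(errors), errors[-1]


# Per-step errors are 0, 1, and 2 m, so ADE is 1.0 and FDE is 2.0.
ade, fde = ade_fde([(0, 0), (1, 0), (2, 0)], [(0, 0), (1, 1), (2, 2)])
```

Fixing details such as the horizon length, sampling rate, and whether the best of several predicted modes is scored is what makes ADE/FDE values comparable across studies.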
Digital-Twin Integration and Prototype Layer
A distinctive feature of Ozone is the alignment of all datasets to digital-twin CARLA maps, allowing scene replay, closed-loop simulation, and counterfactual scenario generation within a unified spatial framework. This enables seamless transfer from data-centric research (e.g., extraction of fine-grained driving interaction from video) to simulation-driven studies (e.g., scenario-based AV evaluation, human-in-the-loop studies).
The prototype layer specifies deployment interfaces for hardware (e.g., AVs, RSUs) and software systems (e.g., safety monitoring, city-scale dashboards), with traceability from research metrics to field validation criteria (performance, safety thresholds, operationalization protocols). This layer operationalizes Ozone’s core goal—cumulative research that is deployment-ready by construction.
Quantitative Results
Ozone achieves a mean positional error <0.15 m, heading error <2.5°, and speed error <0.3 m/s for converted trajectory data, indicating negligible loss from automated schema conversion. Schema completeness rises from 45–82% pre-conversion to 100% post-conversion. Experiment setup time is reduced by 85% (from 12.4 to 1.8 days) as measured in a controlled user study, a reduction attributed to pipeline automation and cross-dataset schema harmonization.
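Two of these figures are simple ratios, sketched below. The setup-time reduction follows directly from the reported 12.4 and 1.8 days. The completeness function is only one plausible reading (share of required fields that are present across all records); the source does not define how schema completeness is computed, and both function names are hypothetical.

```python
def relative_reduction(before, after):
    """Fractional reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def schema_completeness(records, required_fields):
    """Share of required fields that are present (not None) across records.
    Assumed definition, for illustration only."""
    total = len(records) * len(required_fields)
    if total == 0:
        raise ValueError("need at least one record and one required field")
    present = sum(
        1 for r in records for f in required_fields if r.get(f) is not None
    )
    return present / total


# (12.4 - 1.8) / 12.4 is about 0.855, consistent with the reported 85%.
setup_reduction = relative_reduction(12.4, 1.8)
```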
For safety modeling, Ozone reports 91% cross-city transfer efficiency, measured as the retention of F1 (conflict classification) and AUC (crash-risk prediction) when a model is trained in one city and tested in another. Cross-dataset reproducibility also improves: model performance variance (e.g., in trajectory-prediction studies) falls below 3%, compared with the 8–22% discrepancies observed in pre-Ozone workflows due to preprocessing and schema ambiguities. The system scales efficiently (processing 2.3 TB of source data per release in 4–8 compute hours), is modular for new sensor/model/domain integration, and maintains version-locked extensibility.
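One way to read "retention" is as the ratio of a metric in the unseen city to the same metric in the training city. The sketch below computes F1 from confusion counts and then that ratio. This interpretation is an assumption, the function names are hypothetical, and the counts in the example are invented purely to show the arithmetic; they are not results from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        raise ValueError("F1 is undefined when tp, fp, and fn are all zero")
    return 2 * tp / denom


def retention(source_score, target_score):
    """Share of source-city performance retained in the target city."""
    if source_score <= 0:
        raise ValueError("source_score must be positive")
    return target_score / source_score


# Invented counts: F1 is 160/180 in the training city and 140/175 in the
# unseen city, so the retained share is 0.8 / 0.889, i.e. about 0.9.
f1_source = f1_score(tp=80, fp=10, fn=10)
f1_target = f1_score(tp=70, fp=15, fn=20)
f1_retention = retention(f1_source, f1_target)
```

The same ratio can be applied to AUC; reporting it alongside the absolute scores avoids hiding a weak source-city baseline behind a high retention value.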
Use Cases and Empirical Case Studies
Ozone is demonstrated in human-factors research, scenario-based AV evaluation, and safety modeling:
- AR-HUD Risk Highlighting for Takeover in L3 Automation: GAT-LSTM-driven attention modeling and AR interface, evaluated in CitySim-derived simulated scenarios, reduces lateral stability σ by 22.02% (p=0.014), longitudinal σ by 6.47% (p=0.048), and increases takeover success by 8% (p=0.023), setting quantitative design baselines for driver-assist interfaces.
- Structural Long-Tail Risk Benchmark (RoadTailBench): The system enables automated generation and parameterization of 125 complex, compound-defect road scenarios (urban, highway, etc.). These are compatible with CARLA/OpenDRIVE/OpenSCENARIO and support both static/dynamic interference, with coverage for all high-impact engineering defects. AV planning robustness is evaluated under a high-density ODD risk matrix, with LLM-based agents supporting scenario cross-derivation.
Implications and Prospects
Ozone’s unification of data, model, metric, and deployment standards in a modular, open-source platform establishes a new research foundation for ITS and AV systems. The elimination of data/model interface friction, combined with reproducible, version-controlled benchmarking protocols, supports robust multi-city and cross-domain validation. The digital-twin-based simulation layer bridges the data-simulation-deployment gap—enabling both scalable automation (closed-loop pre-deployment testing) and human-factors experimentation.
Future research directions include:
- Direct integration of real-time V2X and RSU data streams.
- Embedding large-scale pre-trained world and safety foundation models for few-shot/zero-shot domain transfer.
- Systematic geographic expansion of the digital-twin environment library.
- Incorporation of formal verification pipelines for AV assurance at the evaluation/prototype layers.
- Community-driven, version-controlled extension and curation of data, models, and benchmark protocols.
Conclusion
Ozone constitutes a comprehensive, scalable framework for transportation research by resolving fragmentation in data, modeling, and evaluation across the full experimental and deployment lifecycle. By formalizing the research process into interoperable, automatable layers and providing reproducibility guarantees, Ozone enables efficient, transferable, and methodologically robust studies in traffic modeling, autonomous vehicle evaluation, and human-machine interaction domains. The platform is poised to become the de facto standard for cumulative traffic research, facilitating robust progress from data acquisition to real-world deployment scenarios.
(2604.10959)