Telematics-Captured Near-Miss Events

Updated 7 September 2025

Telematics-captured near-miss events are high-risk, non-collision incidents detected via multi-modal sensor data such as kinematic thresholds, video analysis, and ADAS alerts.
Methodologies combine sensor fusion, edge computing, and advanced statistical models (e.g., Bayesian and zero-inflated Poisson) to provide real-time risk assessment.
Applications span ADAS/AV testing, insurance ratemaking, and urban infrastructure planning by using NMEs as leading indicators of potential crash risk.

Telematics-captured near-miss events (NMEs) are data-driven representations of high-risk, non-collision traffic incidents inferred or directly measured via in-vehicle sensors, mobile devices, or connected vehicle platforms. By quantifying and characterizing narrowly avoided accidents, these events serve as leading indicators of crash risk, supporting proactive road safety management, advanced driver assistance system (ADAS) development, autonomous vehicle (AV) validation, insurance ratemaking, and urban infrastructure planning. The field brings together techniques in telematics, video analytics, signal processing, edge/cloud computation, Bayesian risk modeling, and spatial statistics, producing both foundational datasets and operational systems for near-miss identification, classification, and application.

1. Principles and Definitions of Telematics-Captured Near-Miss Events

Telematics-captured NMEs are defined by the direct or surrogate measurement of hazardous, but non-collision, traffic interactions detected via multi-modal sensor data streams. The detection and quantification of a near-miss is typically operationalized through one or more of the following:

Kinematic thresholds, such as sudden decelerations (hard braking > 0.5 G (Kataoka et al., 2018); high-G IMU readings (Grigorev et al., 3 Jun 2025)) or proximity metrics (time-to-collision, separation distance, relative speed, or composite risk scores (Antonsson et al., 2022, Anis et al., 23 Jul 2024, Anis et al., 2 Sep 2025)).
Semantic and contextual video analysis (e.g., bounding box trajectories, object tracking and classification, scene segmentation distinguishing foreground vehicles/pedestrians/bicycles from background (Kataoka et al., 2018, Pradana et al., 2023, Zhang et al., 5 Dec 2024)).
Integration of driver assistance system warnings (e.g., forward collision alerts, lane departure, too-close signals) or ADAS-defined surrogate events (Zhang et al., 31 Aug 2025).
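The kinematic-threshold criterion above can be illustrated with a minimal sketch. It assumes a stream of longitudinal acceleration samples in m/s^2 with braking reported as negative values; the function name, the `min_samples` parameter, and the sample data are hypothetical and not taken from any cited system. Only the 0.5 G trigger level comes from the text.

```python
G = 9.80665  # standard gravity in m/s^2


def hard_braking_events(accel_mps2, threshold_g=0.5, min_samples=1):
    """Return (start, end) index pairs of runs where deceleration exceeds threshold_g.

    accel_mps2: longitudinal acceleration samples in m/s^2 (braking is negative).
    min_samples: hypothetical debounce length; shorter runs are ignored.
    """
    limit = -threshold_g * G
    events, start = [], None
    for i, a in enumerate(accel_mps2):
        if a < limit:
            if start is None:
                start = i  # a new candidate event begins here
        elif start is not None:
            if i - start >= min_samples:
                events.append((start, i - 1))
            start = None
    # Close an event that is still open when the stream ends.
    if start is not None and len(accel_mps2) - start >= min_samples:
        events.append((start, len(accel_mps2) - 1))
    return events


# Illustrative samples only (m/s^2).
samples = [-1.0, -2.0, -5.5, -6.0, -3.0, -0.5, -5.0]
print(hard_braking_events(samples))  # [(2, 3), (6, 6)]
```

In practice a trigger of this kind only nominates candidate events; the sources above pair it with expert labeling or video analysis to separate genuine near-misses from benign hard stops.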

Standardized annotation protocols (e.g., those of NIDB (Kataoka et al., 2018) and SynSHRP2 (Shi et al., 6 May 2025)) and the definition of scenario classes by risk level, conflict type, and severity ensure comparability and facilitate downstream algorithmic development and statistical analysis.

2. Methodologies for NME Detection and Data Acquisition

Methods for capturing NMEs span from traditional vehicle-borne telematics to advanced edge/AI solutions:

In-vehicle telematics/manual selection: G-triggered dashcams (capture on >0.5 G deceleration, followed by expert labeling and TTC-based risk stratification (Kataoka et al., 2018, Chan et al., 2023)), smartphone-based sensor fusion (accelerometer/gyroscope/GPS, post-processed with heuristics or machine learning, e.g., SimRa and CycleSense (Karakaya et al., 2020, Karakaya et al., 2022)), embedded ADAS event logging (Zhang et al., 31 Aug 2025).
Edge computing with video analytics: Real-time deep-learning object detection (SSD-Inception), tracking (SORT), and linear regression estimation of TTC from bounding box sequences performed on embedded hardware (e.g., Jetson TX2 (Ke et al., 2020)), with selective transmission of validated near-crash event clips to the cloud, incorporating CAN-bus vehicle dynamics and geolocation.
High-frequency AV/Connected Vehicle Data: Continuous GPS and kinematic logging (e.g., Wejo, Waymo Open, Argoverse-2), processed via spatiotemporal buffer-querying, heading/intersecting radial geometry, and map-matching to infrastructural basemaps (Li et al., 17 Sep 2024, Anis et al., 23 Jul 2024, Anis et al., 2 Sep 2025).
Synthetic Datasets and Data Augmentation: Privacy-preserving synthetic benchmarks created by diffusion models on real event sequences (e.g. SynSHRP2 (Shi et al., 6 May 2025)), trajectory augmentation via VAE-based generative interpolation between safe and collision states (CMTS (Ding et al., 2019)), and conditional style manipulations for video data (Pradana et al., 2023).

Annotation and detection are further strengthened by semantic flow (foreground/background motion separation (Kataoka et al., 2018)) and continual improvement through deep neural networks with temporal and frequency-domain feature extraction (Karakaya et al., 2022, Zhang et al., 5 Dec 2024).
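One way to see how a linear regression over bounding-box sequences can yield a TTC estimate is the sketch below. It is not the exact procedure of Ke et al. (2020); it assumes a pinhole camera, so that the apparent box width is inversely proportional to distance, and a constant closing speed, so that the inverse width falls linearly in time. The function names and the example track are hypothetical.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(ts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def ttc_from_box_widths(ts, widths_px):
    """Estimate time-to-collision (s) at the last timestamp from box widths.

    Assumes width is proportional to 1/distance, so 1/width is proportional
    to distance and reaches zero at the projected collision time.
    Returns None when the object is not getting closer.
    """
    inv_width = [1.0 / w for w in widths_px]
    slope, intercept = fit_line(ts, inv_width)
    if slope >= 0:
        return None  # box is not growing: no closing motion
    current = intercept + slope * ts[-1]  # fitted 1/width at the last frame
    return -current / slope


# Hypothetical track: lead object 20 m ahead, closing at 10 m/s,
# with width_px = 1800 / distance_m.
ts = [0.0, 0.1, 0.2, 0.3, 0.4]
widths = [1800.0 / (20.0 - 10.0 * t) for t in ts]
print(ttc_from_box_widths(ts, widths))  # close to 1.6 s (16 m left at 10 m/s)
```

Fitting over a window of frames rather than differencing two consecutive boxes is what makes this kind of estimate tolerant of detector jitter.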

3. Quantitative Surrogate Safety Metrics and Statistical Modeling

A variety of kinematic and probabilistic metrics underlie the operationalization and risk quantification of NMEs:

Time-to-Collision (TTC) and 2D–TTC: High-fidelity 2D TTC computed from vehicle positions, headings, steering angles, and velocities with a kinematic bicycle or rigid rectangle geometry, capturing both longitudinal and lateral components (Anis et al., 23 Jul 2024, Anis et al., 2 Sep 2025).
Streetscope Hazard Measure (SHM): A continuous scalar hazard score, defined as $m_2 = S_{\text{rel}}^2/d_{\text{sep}}$ (relative speed squared divided by separation distance) together with its extensions, is computed from telematics data for each conflict pair at each time step and serves as a leading risk indicator (Antonsson et al., 2022).
Transition Matrix and Power Laws: Transition matrices over discretized speed states, principal component analysis to reveal abrupt, risky transitions, and power-law modeling of “learning effects”, i.e., how a driver's NME frequency changes with accumulated exposure (Chan et al., 2023).
Zero-Inflated and Latent Group Models: NMEs are typically sparse and zero-inflated in telematic time series; thus, advanced count frameworks (zero-inflated Poisson, generalized Poisson, EM-estimated latent driver clusters) are used to obtain calibrated, interpretable risk predictions (Zhang et al., 31 Aug 2025).
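The hazard measure $m_2 = S_{\text{rel}}^2/d_{\text{sep}}$ listed above is simple enough to compute directly. The sketch below assumes planar positions in metres and velocities in m/s, and takes $S_{\text{rel}}$ to be the magnitude of the relative velocity vector; that reading, the function name, and the example values are assumptions rather than details confirmed by the source.

```python
import math


def hazard_m2(pos_a, vel_a, pos_b, vel_b):
    """Hazard score m_2 = S_rel^2 / d_sep for one conflict pair at one instant.

    Positions are (x, y) in metres, velocities are (vx, vy) in m/s.
    The result has units of m/s^2, i.e. an acceleration.
    """
    d_sep = math.dist(pos_a, pos_b)  # separation distance
    s_rel = math.dist(vel_a, vel_b)  # magnitude of the relative velocity
    if d_sep == 0:
        raise ValueError("zero separation: the pair is in contact")
    return s_rel ** 2 / d_sep


# Hypothetical pair: a vehicle at 10 m/s approaching a stationary one 20 m ahead.
print(hazard_m2((0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (0.0, 0.0)))  # 5.0
```

Because the score has the dimensions of an acceleration, it can be read against braking capability: the example pair scores 5 m/s^2, roughly the 0.5 G hard-braking level used elsewhere in this article.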

These quantitative methods enable the systematic aggregation, exposure normalization, and risk stratification necessary for insurance applications, real-time feedback, and safety benchmarking.
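To make the zero-inflation idea concrete, the following sketch evaluates the standard zero-inflated Poisson distribution, in which a count is a structural zero with probability $\pi$ and otherwise Poisson with mean $\lambda$: $P(Y=0) = \pi + (1-\pi)e^{-\lambda}$ and $P(Y=k) = (1-\pi)\lambda^k e^{-\lambda}/k!$ for $k \ge 1$. The parameter values are illustrative only and are not estimates from any cited study.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson with rate lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected count: only the Poisson component contributes."""
    return (1.0 - pi) * lam


# Illustrative parameters: 30% of driver-periods can never record an NME.
lam, pi = 2.0, 0.3
p_zero_zip = zip_pmf(0, lam, pi)
p_zero_poisson = math.exp(-zip_mean(lam, pi))  # plain Poisson with the same mean
print(round(p_zero_zip, 4), round(p_zero_poisson, 4))
```

With these values the zero-inflated model assigns noticeably more mass to zero than a plain Poisson with the same mean, which is the mismatch that motivates the zero-inflated frameworks described above.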

4. Spatial-Temporal Analysis, Clustering, and Policy Implications

NMEs’ high event rate and precise localization (enabled by continuous telematics) support detailed spatial and temporal analyses:

Hotspot mapping: Exposure-normalized danger scores, as in SimRa (score = $(\alpha \cdot s + n)/r$) for cycling networks (Karakaya et al., 2020), and Getis-Ord $G_i^*$ statistics on uniform grids or road segments for identifying crash/NME spatial clusters (Grigorev et al., 3 Jun 2025, Li et al., 17 Sep 2024).
Bivariate LISA and Surrogate-Target Concordance: Joint spatial classification of High–High (HH), Low–High (LH), High–Low (HL), and Low–Low (LL) cells, distinguishing established blackspots, emerging risk areas, and their correspondence to POI densities or traffic controls (Grigorev et al., 3 Jun 2025).
Temporal patterning: Peaks in near-miss frequency during weekday rush hours and in high-traffic urban arterials, and distinct patterns on holidays or by road class (Li et al., 17 Sep 2024).
Statistical modeling: Binary logistic regression and hierarchical Bayesian GEV models quantify the influence of road geometry, vehicle types, and traffic volume on near-crash likelihood, accounting for site-specific heterogeneity (Li et al., 17 Sep 2024, Anis et al., 23 Jul 2024, Anis et al., 2 Sep 2025).
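The Getis-Ord $G_i^*$ statistic used for hotspot mapping can be sketched on a small uniform grid as follows. The sketch uses the standard z-score form of the statistic with binary weights over each cell's 3x3 neighbourhood, the cell itself included; the neighbourhood choice, the function name, and the example counts are assumptions, not settings taken from the cited studies.

```python
import math


def getis_ord_gi_star(grid):
    """Return a grid of Gi* z-scores for a rectangular grid of event counts."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    if s == 0:
        raise ValueError("all cells are equal: Gi* is undefined")
    z = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            w_sum = len(neigh)  # binary weights: sum of w equals sum of w^2
            numerator = sum(neigh) - mean * w_sum
            denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            # A neighbourhood covering the whole grid gives a zero denominator.
            z[r][c] = numerator / denominator if denominator else 0.0
    return z


# Hypothetical NME counts per grid cell, with a cluster in the top-left corner.
counts = [
    [9, 8, 1, 0, 0],
    [8, 7, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
scores = getis_ord_gi_star(counts)
print(round(scores[0][0], 2), round(scores[3][2], 2))
```

Cells with large positive scores (for example above 1.96 at the conventional 5% level) are candidate hot spots, while negative scores mark neighbourhoods with fewer events than the grid-wide average would suggest.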

The spatial-temporal density of NMEs justifies their proactive use in real-time traffic management, infrastructure design, and dynamic intervention planning.
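Exposure normalization of the SimRa kind, score = $(\alpha \cdot s + n)/r$, can be sketched as below. The source does not define the symbols, so the reading used here is an assumption: $s$ as the number of incidents marked scary, $n$ as the number of other incidents, $r$ as the number of rides over the segment, and $\alpha$ as a weight on scary incidents. The segment data and the value of `ALPHA` are purely illustrative.

```python
def danger_score(scary, non_scary, rides, alpha):
    """Exposure-normalized score (alpha * s + n) / r for one road segment."""
    if rides <= 0:
        raise ValueError("a segment needs at least one recorded ride")
    return (alpha * scary + non_scary) / rides


# Hypothetical segments: (scary incidents, other incidents, rides).
segments = {
    "seg_a": (2, 3, 100),
    "seg_b": (1, 0, 10),
    "seg_c": (0, 5, 500),
}
ALPHA = 4.0  # illustrative weight, not a value taken from the source

ranked = sorted(
    segments,
    key=lambda name: danger_score(*segments[name], alpha=ALPHA),
    reverse=True,
)
print(ranked)  # ['seg_b', 'seg_a', 'seg_c']
```

The example shows why normalization matters: seg_b has the fewest incidents in absolute terms but ranks first because it also has the least exposure.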

5. Dataset Benchmarks and Training Frameworks for Automated Systems

Several large-scale, openly described datasets and synthetic benchmarks underpin contemporary NME research:

Near-Miss Incident Database (NIDB): ~4,594 expert-annotated near-miss video clips collected over a decade from more than 100 taxi-mounted dashcams, stratified by risk and participant type (bicycle, pedestrian, vehicle) (Kataoka et al., 2018).
SynSHRP2: Over 1,874 crashes and 6,924 near-crashes generated via privacy-preserving synthetic imagery and de-identified, multimodal time-series sensor data (IMU, CAN, environmental context) (Shi et al., 6 May 2025).
Cyc-CP: Combined synthetic (CARLA) and real-world (VOC) cycling datasets for close-pass NME detection, offering scene/instance-level benchmarks with spatio-temporal and lateral vehicle distance ground truths (Li et al., 2023).
Simulation environments and adversarial training frameworks: CARLA-based reinforcement learning schemes for generating and adapting NMEs to advance AV robustness, using reward schemes sensitive to near-miss gradients and dynamic adversarial strategies (RARL, SAC) (Yang et al., 5 Jun 2024).

These resources enable methodological innovation, robust model evaluation, and consistent cross-paper comparisons.

6. Applications in Insurance, Traffic Safety, Urban Policy, and AV/ADAS

Telematics-captured NMEs are leveraged for a broad range of applied goals:

Insurance ratemaking and UBI: NMEs, as leading indicators, enable fine-grained, exposure-adjusted risk assessments. Group-based zero-inflated Poisson models and principal component-derived dynamic features outperform claim-based regressions and inform fairer, context-aware premiums (Chan et al., 2023, Zhang et al., 31 Aug 2025).
Proactive traffic safety management: High temporal and spatial frequency of NMEs allows the identification of emergent blackspots before crash accumulation, guiding early interventions ("emerging risk" LH areas (Grigorev et al., 3 Jun 2025)) and precision public awareness campaigns (Li et al., 17 Sep 2024).
ADAS/AV training, testing, and evaluation: Real-time, telematics-derived NMEs serve as corner-case data pools for training advanced detection, classification, and prediction systems. Their inclusion improves scenario diversity, boosts rare event generalizability, and supports the robustness of AV/ADAS pipelines (Kataoka et al., 2018, Yang et al., 5 Jun 2024).
Urban planning and road design: NME-based analytics inform the prioritization of redesign, addition of traffic controls, and infrastructure upgrades, validated by exposure-normalized and contextually ranked spatial analyses (Grigorev et al., 3 Jun 2025, Karakaya et al., 2020).

7. Future Directions and Research Challenges

Ongoing research is characterized by:

Advancements in geometry-aware indicators: Adoption of fully two-dimensional TTC (2D–TTC) and multivariate extreme value theory frameworks, with integration of vehicle-infrastructure interactions and high-order kinematic variables (Anis et al., 2 Sep 2025, Anis et al., 23 Jul 2024).
Improvements in real-time, edge-based detection and privacy preservation: Efficient, scalable processing on embedded devices and synthetic data generation via deep generative models address practical deployment constraints and privacy mandates (Ke et al., 2020, Shi et al., 6 May 2025).
Standardization of benchmarks and evaluation metrics: Continued emphasis on open, multimodal datasets (SynSHRP2, Cyc-CP, NIDB), rigorous annotation protocols, and challenge frameworks (Shi et al., 6 May 2025, Li et al., 2023).
Integration across modes and stakeholders: Extension of telematics-captured NME analysis beyond automotive to cycling, micro-mobility, and pedestrian safety; coupling with city-scale digital twins and policy-making platforms (Li et al., 2023, Anis et al., 23 Jul 2024).

A plausible implication is that, as NMEs become more widely detected and analyzed, the field is converging toward continuous, context-aware, and statistically rigorous risk quantification in support of Vision Zero and next-generation mobility systems.