Roadside LiDAR for Cooperative Safety Auditing at Urban Intersections: Toward Auditable V2X Infrastructure Intelligence

Published 12 Apr 2026 in cs.ET | (2604.10419v1)

Abstract: Urban intersections expose the limitations of single-vehicle perception under occlusion and partial observability. In this study, we present an auditable roadside LiDAR framework for infrastructure-assisted safety analysis at a signalized urban intersection in New York City, developed and evaluated using real-world data. The proposed framework integrates trajectory construction, iterative human-in-the-loop quality assurance (QA), and interpretable near-miss analytics to produce defensible safety evidence from infrastructure sensing. Using a human-labeled heavy vehicle–bicycle interaction as an anchor case, we show that direction-agnostic time-to-collision (TTC) drops below 1 s, while longitudinal TTC remains above conservative braking thresholds, revealing a lateral-intrusion-dominated conflict mechanism. Beyond individual cases, continuous-window evaluation and multi-round QA analysis demonstrate that the framework systematically reduces failure modes such as track fragmentation, spurious TTC triggers, unstable geometry, and cross-lane false conflicts. These results position roadside LiDAR as a practical post-hoc auditing mechanism for cooperative perception systems, with broader statistical validation discussed. This work provides a pathway toward scalable, data-driven safety auditing of urban intersections, enabling transportation agencies to identify and mitigate high-risk interactions beyond crash-based analyses.



Summary

The paper introduces an auditable roadside LiDAR pipeline that shifts perception from ego vehicles to fixed infrastructure, enhancing detection of near-miss events.
It employs a cascaded architecture combining CenterPoint 3D detection, SORT-style tracking with selective refinements, and a human-in-the-loop QA layer for robust safety analysis.
System performance demonstrates strong car and truck detection, effective surrogate safety analytics via decomposed TTC, and spatial conflict hotspot identification.

Roadside LiDAR for Cooperative Safety Auditing at Urban Intersections

Introduction and Motivation

This paper addresses a fundamental limitation of ego-centric perception pipelines at urban intersections, where occlusion, restricted viewpoints, and partial observability frequently undermine safe autonomous operation. By shifting the perceptual reference frame from the vehicle to fixed infrastructure, specifically roadside LiDAR, the paper proposes an auditable framework for infrastructure-assisted analysis of safety-critical interactions. The approach aligns with broader V2X (Vehicle-to-Everything) paradigms but emphasizes auditability and interpretability: it delivers reviewable artifacts and explicable outputs, and leaves communication robustness and real-time actuation out of scope.

System Design and Methodology

The system architecture consists of cascaded stages: 3D object detection via a CenterPoint model trained on a site-specific 8,000-frame annotated dataset, multi-object tracking with a SORT-style Kalman filter-based tracker, post-tracking conservative refinement in three controlled branches (raw, selective correction, and full smoothing), and an explicit human-in-the-loop QA layer for iterative validation and failure-mode annotation. Downstream, surrogate safety analytics interpret refined trajectories via classic surrogate-safety metrics, using both standard direction-agnostic Time-To-Collision (TTC) and a longitudinal decomposition to isolate interaction mechanisms.
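The tracking stage can be pictured with a small association step. The sketch below is illustrative only and is not the paper's tracker: a SORT-style tracker normally pairs a Kalman filter with optimal assignment, whereas this example substitutes a plain constant-velocity prediction and greedy nearest-centre matching. The function names, the dictionary layout, and the 2 m gate are assumptions made for the example.

```python
import math

def predict(track, dt):
    """Constant-velocity prediction of a track's BEV centre after dt seconds."""
    x, y = track["pos"]
    vx, vy = track["vel"]
    return (x + vx * dt, y + vy * dt)

def associate(tracks, detections, dt, gate_m=2.0):
    """Greedily match predicted track centres to detection centres.

    Returns (matches, unmatched), where matches maps track id -> detection
    index and unmatched lists detection indices left without a track.
    Pairs farther apart than gate_m (hypothetical gate) are never matched.
    """
    candidates = []
    for tid, track in tracks.items():
        px, py = predict(track, dt)
        for j, (dx, dy) in enumerate(detections):
            dist = math.hypot(dx - px, dy - py)
            if dist <= gate_m:
                candidates.append((dist, tid, j))
    candidates.sort()  # closest pairs are committed first
    matches, used = {}, set()
    for dist, tid, j in candidates:
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched

# Synthetic example: two existing tracks, three detections in the next frame.
tracks = {
    "truck_1": {"pos": (0.0, 0.0), "vel": (8.0, 0.0)},
    "bike_7": {"pos": (5.0, 3.0), "vel": (0.0, -2.0)},
}
detections = [(5.1, 2.7), (0.9, 0.1), (30.0, 30.0)]
matches, unmatched = associate(tracks, detections, dt=0.1)
```

Unmatched detections would typically seed new tracks, and an unmatched track that persists is one way the track fragmentation flagged in QA can arise.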

Figure 1: Conceptual overview of the roadside LiDAR safety-auditing architecture.

The three post-tracking branches are distinct. The baseline (B0) retains raw tracker output; B1 introduces selective, registration-guided corrections applied only to high-suspicion segments, and B2 enforces globally consistent temporal smoothing at the cost of attenuating sharp, physical motions. A separate dynamics-aware stabilization extension is evaluated independently to enforce kinematically plausible motion and mitigate high-frequency noise without distorting event timing.
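The trade-off between the branches can be illustrated on a one-dimensional heading series. The sketch below is an assumption-laden stand-in, not the paper's method: B1 is approximated by a local-median test in place of registration-guided correction, B2 by a centred moving average, and the 20° suspicion threshold and window size are hypothetical. Headings are assumed to stay away from the ±180° wrap.

```python
from statistics import median

def b0_raw(yaw):
    """B0: return tracker headings untouched."""
    return list(yaw)

def b1_selective(yaw, suspicion_deg=20.0):
    """B1 stand-in: correct only samples that deviate strongly from the local median."""
    out = list(yaw)
    for i in range(1, len(yaw) - 1):
        local = median(yaw[i - 1:i + 2])
        if abs(yaw[i] - local) > suspicion_deg:
            out[i] = local  # high-suspicion sample: replace it
    return out

def b2_smooth(yaw, half_window=2):
    """B2 stand-in: centred moving average applied to every sample."""
    out = []
    for i in range(len(yaw)):
        lo = max(0, i - half_window)
        hi = min(len(yaw), i + half_window + 1)
        out.append(sum(yaw[lo:hi]) / (hi - lo))
    return out

# Synthetic headings in degrees: a spurious flip at index 3, a genuine turn from index 6.
yaw = [0.0, 1.0, 0.0, 45.0, 1.0, 0.0, 30.0, 31.0, 30.0]
b0 = b0_raw(yaw)
b1 = b1_selective(yaw)
b2 = b2_smooth(yaw)
```

On this toy series the selective branch removes the isolated flip and leaves the genuine turn intact, while the global smoother reduces the flip but also blunts the onset of the turn, which mirrors the attenuation of sharp physical motion described for B2.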

Crucially, all tracklets flagged as event candidates are reviewed in context by human operators, who score trajectory plausibility, correct misalignments, and annotate recurring failure modes such as fragmented tracks, geometry instability, and spurious TTC triggers.

Pilot Deployment and Annotation Resource

The pilot is situated at a complex New York City intersection, leveraging an elevated Ouster OS-1-128 for coverage and reporting all ground-truth and prediction results in a unified bird’s-eye-view site frame.

Figure 2: Study intersection and roadside sensing setup used for the pilot deployment.

The annotation campaign delivers dense, frame-level cuboids for vehicles, pedestrians, and bicycles, with both frame-level ground truth (for detector training and holdout performance) and trajectory-level QA overlays uniquely designed for the auditing task rather than standard fusion or perception benchmarks.

Figure 3: Representative roadside LiDAR point cloud from the audited interaction window, with human-labeled truck and bicycle cuboids.

Detection and Tracking Performance

The CenterPoint detector adapted to the local domain achieves strong results for car and truck classes (native AP: car 89.75%, truck 54.55%) and modest performance for pedal cycles and pedestrians, consistent with the limited section scale and label availability. Real-time operation is feasible at 38.31 fps.

Figure 4: CenterPoint detection examples on validation data, highlighting strong car/truck performance and weaker small-object localization.

Branch-level tracking and refinement show that selective B1 refinement reduces p95 yaw jitter from 15.19° to 14.04°, while B2 smoothing is more aggressive (down to 3.92°) at the cost of a modest F1 degradation for object ID recall. Dynamics-aware heading stabilization further tightens heading-motion error to 2.36° without introducing timing distortions.
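A p95 yaw-jitter figure of this kind could be computed roughly as follows. The summary does not define the metric, so this sketch assumes jitter means the absolute frame-to-frame heading change with angle wrapping, and it uses a nearest-rank percentile; both choices are assumptions.

```python
import math

def wrap_deg(angle):
    """Wrap an angle difference into [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def yaw_jitter_deg(yaw):
    """Absolute frame-to-frame heading change in degrees (assumed definition)."""
    return [abs(wrap_deg(b - a)) for a, b in zip(yaw, yaw[1:])]

def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]

# Synthetic headings that cross the +/-180 degree boundary.
yaw = [179.0, -179.0, -178.0, 170.0, 171.0]
jitter = yaw_jitter_deg(yaw)
p95 = percentile_nearest_rank(jitter, 95)
```

The wrapping step matters: without it, the step from 179° to -179° would count as a 358° jump instead of a 2° change and would dominate any upper percentile.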

Sequential tracking recall remains high (0.8677 over 4,000 frames), but performance decays on extended slices as detection and association errors accumulate, emphasizing the need for repeated human-verified QA, especially in edge cases.

Near-Miss Interpretation and Surrogate Safety Analytics

The system’s core analytic contribution is the interpretable diagnosis of near-miss events through joint application of direction-agnostic and longitudinal TTC. A particular truck–bicycle interaction demonstrates this: although TTC drops rapidly below 1 s (minimum 0.62 s by human label, 0.55 s by model), the longitudinal TTC remains safely above braking thresholds, indicating conflict dynamics governed by lateral intrusion instead of insufficient stopping capacity.
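The contrast between the two metrics can be made concrete with a point-mass sketch. The exact formulas used in the paper are not given here, so the definitions below are assumptions: direction-agnostic TTC is taken as range divided by range-closing rate, and longitudinal TTC as the gap divided by the closing speed, both projected onto the vehicle's unit heading vector. Box extents are ignored and the numbers are synthetic.

```python
import math

def ttc_direction_agnostic(rel_pos, rel_vel):
    """Range divided by range-closing rate; inf when the pair is not closing."""
    rv = rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]
    if rv >= 0.0:
        return math.inf
    dist2 = rel_pos[0] ** 2 + rel_pos[1] ** 2
    return -dist2 / rv  # |r| / (-(r.v)/|r|)

def ttc_longitudinal(rel_pos, rel_vel, heading):
    """Gap over closing speed along the vehicle's unit heading vector."""
    gap = rel_pos[0] * heading[0] + rel_pos[1] * heading[1]
    closing = -(rel_vel[0] * heading[0] + rel_vel[1] * heading[1])
    if gap <= 0.0 or closing <= 0.0:
        return math.inf
    return gap / closing

# Synthetic case: bicycle 1 m ahead and 2 m to the side of the truck,
# barely closing along the truck's heading but cutting in laterally at 2.5 m/s.
rel_pos = (1.0, 2.0)     # bicycle position minus truck position, metres
rel_vel = (-0.2, -2.5)   # bicycle velocity minus truck velocity, m/s
heading = (1.0, 0.0)     # truck unit heading vector

ttc_all = ttc_direction_agnostic(rel_pos, rel_vel)
ttc_long = ttc_longitudinal(rel_pos, rel_vel, heading)
```

In this toy configuration the direction-agnostic value falls just under 1 s while the longitudinal value is about 5 s, the same qualitative signature as a lateral-intrusion conflict.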

Figure 5: Spatiotemporal interaction window for the truck–bicycle anchor case with color-coded trajectories and event markers.

Figure 6: Temporal curves for anchor interaction—vehicle speed, direction-agnostic TTC, and longitudinal TTC—highlighting lateral conflict features.

This analysis underscores the utility of decomposed TTC for mechanism-level insight during auditing: it reveals cases where a low direction-agnostic TTC would otherwise be misread as a longitudinal (insufficient stopping distance) risk.

The model’s timing for minimum TTC and clearance aligns with human annotation within 0.2 s (2 frames) of the critical event. However, minimum clearance values are underestimated by the model, reflecting residual box-geometry estimation uncertainty.
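A timing-agreement check of this kind reduces to comparing the frame index of the minimum in each series. The sketch below assumes a 0.1 s frame period, inferred from the statement that 0.2 s corresponds to 2 frames. The TTC series are synthetic and only their minima echo the reported 0.62 s and 0.55 s values, so the resulting offset is illustrative and not a result from the paper.

```python
FRAME_PERIOD_S = 0.1  # assumed: 0.2 s corresponds to 2 frames

def argmin_frame(series):
    """Index of the frame where the series reaches its minimum."""
    return min(range(len(series)), key=lambda i: series[i])

def timing_offset_s(human, model, frame_period_s=FRAME_PERIOD_S):
    """Absolute time offset between the minima of two per-frame series."""
    return abs(argmin_frame(human) - argmin_frame(model)) * frame_period_s

# Synthetic per-frame TTC values in seconds (not data from the paper).
human_ttc = [2.4, 1.8, 1.1, 0.62, 0.9, 1.6]
model_ttc = [2.5, 1.9, 1.3, 0.8, 0.55, 1.2]

offset = timing_offset_s(human_ttc, model_ttc)
within_tolerance = offset <= 0.2 + 1e-9  # tolerance quoted in the text
```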

Broader Conflict Mining and Hotspot Recurrence

Automated post-hoc mining extends the workflow over longer deployment runs, yielding over 6,000 candidate pairs and more than 600 provisional near-miss triggers, which on QA review are concentrated in the same intersection conflict zone. This spatial recurrence offers qualitative support for the claim that the pipeline uncovers persistent, site-specific risk areas.
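Spatial concentration of triggers can be summarized with simple grid binning in the bird's-eye-view site frame. This is a minimal sketch under stated assumptions: the 5 m cell size, the function names, and the trigger coordinates are all hypothetical, and the paper's own hotspot procedure is not described here.

```python
import math
from collections import Counter

def grid_cell(x, y, cell_m=5.0):
    """Map a BEV point in metres to an integer grid cell index."""
    return (math.floor(x / cell_m), math.floor(y / cell_m))

def hotspot_counts(points, cell_m=5.0):
    """Count provisional triggers per grid cell."""
    return Counter(grid_cell(x, y, cell_m) for x, y in points)

# Synthetic trigger locations in the site frame (metres).
triggers = [
    (12.1, 8.3), (13.4, 9.0), (11.0, 7.2), (14.9, 5.1),
    (40.2, 3.3), (-6.0, 22.5),
]
counts = hotspot_counts(triggers)
top_cell, top_count = counts.most_common(1)[0]
```

Because the triggers are provisional, a count map like this is best read as a pointer to where human QA review should focus, not as a validated conflict rate.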

Figure 7: Post-8,000-frame vehicle–VRU near-miss overlay, showing central hotspot concentration.

While these events are not individually validated, the spatial pattern bolsters the case for infrastructure sensing as a leading indicator of latent intersection risk. The methodology thus supports statistical conflict hotspot analysis for agency-led risk assessment and targeted intervention.

Implications and Future Work

This work demonstrates that auditable trajectory analytics via roadside LiDAR, supported by structured human-in-the-loop QA and conservative motion refinement, can enable retrospective, mechanism-aware safety auditing at occlusion-prone urban intersections. The emphasis on auditability over raw detector novelty is deliberate, as the value is in defensibility and interpretable diagnostics, which are central to deployment in regulated safety contexts.

Implications extend to transportation agencies seeking defensible, evidence-based risk assessment that is not limited by crash rarity or reporting delay. The trajectory QA and event review paradigm also provides a foundation for training, benchmarking, and post hoc accountability within V2X cooperative systems. The explicit decoupling from real-time fusion, vehicle-side actuation, or communication latency means the results are not immediately generalizable to active intervention stacks but fill a key gap in transparent infrastructure-side event mining.

Future directions should address multi-site deployment, scale-up of human–machine QA loops, handling synchronization and calibration drift at network scale, and integrating learned residual stabilization for fine-grained geometry estimation. Additionally, robust statistical validation across diverse urban typologies and traffic conditions is essential for claims beyond pilot feasibility.

Conclusion

The paper establishes a site-focused, auditable LiDAR-based pipeline for cooperative safety auditing at urban intersections, combining reliable 3D detection, conservative tracking refinement, and mechanism-discriminative surrogate metrics with rigorous human-in-the-loop QA. The approach surfaces emergent near-miss patterns and mechanisms unobservable by ego-centric methods, with implications for post hoc safety validation, conflict hotspot mining, and future infrastructure intelligence. The presented methodology is a critical step toward scalable, interpretable, and agency-usable V2X safety analytics.

