CarCrashNet: A Large-Scale Dataset and Hierarchical Neural Solver for Data-Driven Structural Crash Simulation

Published 8 May 2026 in cs.LG and physics.comp-ph | (2605.07098v1)

Abstract: Crash simulation is a cornerstone of modern vehicle development because it reduces the need for costly physical prototypes, accelerates safety-driven design iteration, and increasingly supports virtual testing workflows. At the same time, modeling structural crash mechanics remains exceptionally challenging: the response is governed by nonlinear contact, large deformation, material plasticity, failure, and complex multi-body interactions evolving over space and time on high-resolution finite-element meshes. In this work, we introduce CarCrashNet, a public high-fidelity open-source benchmark for data-driven structural crash simulation. CarCrashNet combines component-scale and full-vehicle simulations in a multi-modal format, including more than 14,000 bumper-beam pole-impact simulations with varying geometry, materials, and boundary conditions, together with 825 full-vehicle crash simulations built from three industry-standard vehicle models of increasing structural complexity: Dodge Neon, Toyota Yaris, and Chevrolet Silverado. To establish the reliability of the benchmark, we validate our open-source finite-element workflow based on OpenRadioss against both experimental crash data and the commercial solver Ansys LS-DYNA. We also introduce CrashSolver, a machine-learning model designed for full-vehicle crash prediction from high-resolution finite-element crash data. We further perform extensive benchmarking across the released datasets and evaluate CrashSolver against state-of-the-art geometric deep learning and transformer-based neural solvers. Our results position CarCrashNet as a foundation for reproducible research in structural simulation, crashworthiness modeling, and AI-driven virtual crash testing. The dataset is available at https://github.com/Mohamedelrefaie/CarCrashNet.


Authors (4)

Summary

The paper presents CarCrashNet, a large-scale dataset with over 14,000 component-level and 825 full-vehicle crash simulations, rigorously validated against industry benchmarks.
The paper introduces CrashSolver, a hierarchical, mesh-aware transformer that leverages part-level decomposition for high-fidelity spatiotemporal field prediction in crash simulation.
The paper benchmarks CrashSolver against state-of-the-art models, demonstrating significant RMSE improvements and establishing a promising path for trustworthy ML surrogates in vehicle safety design.

CarCrashNet: Large-Scale Open Benchmarking and Hierarchical Neural Surrogates for Structural Crash Simulation

Motivation and Context

Structural crash simulation underpins vehicle safety engineering, guiding design iteration and virtual testing with significant societal and economic impact. Modern automotive safety development relies on high-fidelity finite-element (FE) simulations to minimize the costs associated with physical crash experiments and streamline the validation process. However, the domain presents major computational and ML challenges due to extreme nonlinearity, geometric complexity, and high-resolution spatiotemporal field prediction requirements. Unlike adjacent fields such as fluid mechanics and climate modeling, structural crashworthiness has lacked validated, open, large-scale datasets suitable for ML architectures capable of learning from mesh-based full-field FE data.

CarCrashNet (2605.07098) fills this critical gap by providing a multi-modal, rigorously validated, and openly available dataset spanning both component-level and full-vehicle crash simulations. In addition, it introduces CrashSolver, a hierarchical, mesh-aware transformer model that leverages part-level decomposition for improved surrogate accuracy. The paper provides a comprehensive workflow from dataset curation and validation against experimental and commercial reference solvers, to extensive benchmarking of state-of-the-art ML surrogates on full-vehicle field prediction tasks.

Dataset Composition and Validation

CarCrashNet comprises two principal components: a large-scale bumper-beam pole-impact dataset with over 14,000 FE simulations, and a full-vehicle frontal crash dataset containing 825 high-resolution simulations across three distinct industry-standard vehicle models (Dodge Neon, Toyota Yaris, Chevrolet Silverado).

Bumper-Beam Pole-Impact Dataset: Simulations systematically vary seven engineering/control variables, including impact velocity, geometry (thicknesses), materials (yield strengths), and pole impact conditions, covering a broad and representative design space for crashworthiness studies. Each simulation outputs both scalar crash metrics (e.g., maximum contact force, internal energy, peak deceleration) and full spatiotemporal fields, suitable for both tabular and field-level surrogate modeling.
Full-Vehicle Crash Dataset: FE simulations are conducted on three validated public vehicle models, with parametric variation of physically meaningful variables (impact velocity, front-support thickness, rail thickness). The design-of-experiments (DoE) sampling is constructed via Latin Hypercube sampling or maximin design strategies to ensure broad and balanced coverage, preserved across splits for reproducibility.
Validation: CarCrashNet rigorously validates the open-source OpenRadioss FE workflow against both LS-DYNA (industry-standard commercial FE solver) and published physical automotive crash test data. Global wall force, duration, and internal energy metrics show agreement within 7–15% of LS-DYNA/experimental references, establishing suitability for both dataset generation and ML benchmarking.
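To make the Latin Hypercube sampling mentioned for the full-vehicle DoE concrete, here is a minimal standard-library sketch. It is not the authors' sampling code: the function name `latin_hypercube`, the variable ranges, and the sample count are illustrative assumptions, and the maximin refinement is not shown.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that, for every variable, each of the
    n_samples equal-width strata of its range is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order between variables
        columns.append([low + u * (high - low) for u in unit])
    # Transpose: one tuple of design variables per sample.
    return list(zip(*columns))

# Hypothetical ranges (not from the paper): impact velocity in km/h and
# two shell thicknesses in mm.
bounds = [(40.0, 64.0), (1.0, 3.0), (1.2, 2.8)]
design = latin_hypercube(8, bounds, seed=42)
```

Each row of `design` is one candidate simulation setup. Because every stratum of every variable is sampled once, even a small design covers the full range of each parameter.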
Figure 1: Overview of the CarCrashNet framework with datasets (left), ML tasks and models (middle), and dataset generation and validation workflow (right).

Figure 2: Scalar wall-force validation summary comparing CarCrashNet simulation outputs with digitized NHTSA physical crash test data and LS-DYNA baseline.

Figure 3: Isometric post-crash deformation comparison between OpenRadioss and LS-DYNA, showing consistent global response.

Figure 4: Overlay of filtered time-history wall-force traces for OpenRadioss and LS-DYNA, demonstrating global temporal agreement.
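The solver-to-solver agreement reported above can be read as a relative deviation of each scalar metric from its reference value. The sketch below shows one plausible way to compute it. The helper names and the numbers are illustrative assumptions, not values or code from the paper.

```python
def relative_deviation(value, reference):
    """Signed relative deviation of a scalar metric from its reference."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / abs(reference)

def within_tolerance(candidate, reference, tol):
    """Per-metric check that |relative deviation| does not exceed tol."""
    return {
        name: abs(relative_deviation(candidate[name], reference[name])) <= tol
        for name in reference
    }

# Illustrative placeholder numbers only (not results from the paper).
reference = {"peak_wall_force_kN": 500.0, "internal_energy_kJ": 100.0}
candidate = {"peak_wall_force_kN": 540.0, "internal_energy_kJ": 92.0}
report = within_tolerance(candidate, reference, tol=0.15)
```

With these placeholder inputs both metrics deviate by 8% in magnitude, so both pass a 15% tolerance.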

Dataset Structure and Vehicle Diversity

The vehicle-scale dataset uses three discrete, topologically distinct public models, providing structural diversity for benchmarking surrogate generalization and cross-architecture learning.

Figure 5: Baseline vehicle models (Dodge Neon, Toyota Yaris, Chevrolet Silverado) shown as exterior cutaways, highlighting variation in class and structure.

Figure 6: Front-support structural groups whose shell thicknesses are DoE parameters, varying primary frontal load paths.

Figure 7: Lower-rail and subframe DoE groups, further expanding design variability critical to energy absorption and deformation responses.

Each case comprises input vectors (design variables), time-resolved mesh fields, scalar histories, and auxiliary metadata, supporting a wide range of ML tasks from field prediction to reduced-order surrogate modeling.
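One way to picture the four parts of a case is as a small container type. The sketch below is a hypothetical in-memory layout. The class name `CrashCase`, its field names, and the example values are assumptions for illustration, and the released files may be organized differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class CrashCase:
    """Hypothetical in-memory layout for one simulation case."""
    design: Dict[str, float]                  # input vector of design variables
    node_positions: List[List[Vec3]]          # mesh field indexed as [step][node]
    scalar_histories: Dict[str, List[float]]  # one value per time step
    metadata: Dict[str, str] = field(default_factory=dict)

    @property
    def n_steps(self) -> int:
        return len(self.node_positions)

    @property
    def n_nodes(self) -> int:
        return len(self.node_positions[0]) if self.node_positions else 0

    def validate(self) -> None:
        """Check that frames and scalar histories have consistent lengths."""
        for frame in self.node_positions:
            if len(frame) != self.n_nodes:
                raise ValueError("node count changes between time steps")
        for name, series in self.scalar_histories.items():
            if len(series) != self.n_steps:
                raise ValueError(f"history '{name}' does not match n_steps")

# Toy case with two nodes and two time steps (made-up values).
case = CrashCase(
    design={"impact_velocity": 56.0, "rail_thickness": 1.8},
    node_positions=[
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0)],
    ],
    scalar_histories={"wall_force": [0.0, 120.0]},
    metadata={"vehicle": "Dodge Neon"},
)
case.validate()
```

Keeping design variables, fields, and scalar histories in one record lets the same case feed both tabular and field-level surrogates.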

Hierarchical Crash Field Prediction with CrashSolver

CrashSolver is a hierarchical transformer-based neural field architecture tailored for full-vehicle crash prediction from mesh-based FE data. It exploits the explicit FE part/semantic hierarchy: mesh nodes are grouped into physically interpreted components (rails, bumpers, subframe, cabin, etc.), and encoded via local geometry-aware attention mechanisms. Global interaction is mediated through transformer mixing and boundary message passing, followed by an autoregressive temporal decoder to predict the spatiotemporal mesh deformation during the crash event.

Figure 8: CrashSolver architecture schematic: semantic decomposition, multi-level component encoders, global transformer mixing, interface message passing, and trajectory forecasting.
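The part-level decomposition can be illustrated with plain bookkeeping: group nodes by part label, summarize each part, and find which parts touch. This sketch covers only that bookkeeping and not the attention layers. The function names, the centroid summary, and the toy mesh are assumptions rather than the CrashSolver implementation.

```python
from collections import defaultdict

def group_nodes_by_part(part_ids):
    """Map each part label to the list of node indices it owns."""
    groups = defaultdict(list)
    for node, part in enumerate(part_ids):
        groups[part].append(node)
    return dict(groups)

def part_centroids(coords, groups):
    """Mean position of each part: a crude stand-in for a learned part token."""
    centroids = {}
    for part, nodes in groups.items():
        n = len(nodes)
        centroids[part] = tuple(
            sum(coords[i][d] for i in nodes) / n for d in range(3)
        )
    return centroids

def part_interfaces(edges, part_ids):
    """Pairs of distinct parts joined by at least one mesh edge."""
    pairs = set()
    for a, b in edges:
        pa, pb = part_ids[a], part_ids[b]
        if pa != pb:
            pairs.add((min(pa, pb), max(pa, pb)))
    return sorted(pairs)

# Toy mesh: four nodes on a line, two parts, three edges.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
part_ids = ["rail", "rail", "bumper", "bumper"]
edges = [(0, 1), (1, 2), (2, 3)]

groups = group_nodes_by_part(part_ids)
tokens = part_centroids(coords, groups)
interfaces = part_interfaces(edges, part_ids)
```

In a hierarchical model, a local encoder would process each group, a global mixer would operate on the per-part summaries, and the interface pairs would indicate where boundary messages are exchanged.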

This hierarchical approach is designed to capture both long-range dependencies and localized deformation, a necessity for high-fidelity crash simulation where stress-wave propagation, path-dependent damage, and load redistribution across complex, heterogeneous components are critical.
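The autoregressive temporal decoding described for CrashSolver follows a general pattern in which each predicted state is fed back as the input for the next step. The sketch below shows that loop with a hand-written stand-in for the learned model. `toy_step` and its damping constant are purely illustrative assumptions.

```python
def rollout(step_fn, initial_state, n_steps):
    """Apply step_fn repeatedly, feeding each output back in as input."""
    states = [initial_state]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return states

def toy_step(state):
    """Stand-in for a learned one-step predictor (unit time step)."""
    positions, velocities = state
    damping = 0.5  # arbitrary constant, not a physical parameter
    new_velocities = [v * damping for v in velocities]
    new_positions = [p + v for p, v in zip(positions, new_velocities)]
    return (new_positions, new_velocities)

initial = ([0.0, 1.0], [-4.0, -2.0])
trajectory = rollout(toy_step, initial, n_steps=2)
```

Because each step consumes the previous prediction, errors can accumulate along the trajectory, which is one reason rollout accuracy is evaluated over the whole crash event rather than per step.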

Benchmark Results and Quantitative Performance

CarCrashNet enables rigorous benchmarking of transformer-based models (Transolver, GeoTransolver), a convolutional point U-Net (FIGConvUNet), and the hierarchical CrashSolver on field-level prediction tasks for each vehicle dataset. Performance is reported on strictly held-out splits, using MAE, RMSE, and multiple $L_2$-normalized error metrics.
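For reference, the three metric families named above can be written in a few lines. This sketch uses the textbook definitions over a flat list of field values. How the paper aggregates over nodes, components, and time steps is not specified here, so that part is an assumption.

```python
import math

def mae(pred, true):
    """Mean absolute error over all field entries."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root-mean-square error over all field entries."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def relative_l2(pred, true):
    """L2 norm of the error divided by the L2 norm of the reference field."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    return num / den

# Toy displacement values in mm (made up for illustration).
true = [3.0, 4.0]
pred = [3.0, 1.0]
scores = {
    "mae": mae(pred, true),
    "rmse": rmse(pred, true),
    "relative_l2": relative_l2(pred, true),
}
```

MAE and RMSE carry the units of the field (here mm), while the normalized $L_2$ error is dimensionless, which makes it easier to compare across vehicles with different deformation magnitudes.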

Key claims supported by the results:

CrashSolver achieves the lowest or tied-lowest mean RMSE on all three vehicle datasets, with the largest margin on the most structurally complex vehicle (Chevrolet Silverado):
- Dodge Neon: CrashSolver RMSE = 32.76 mm (best), followed by Transolver, with a 1.2 mm absolute improvement.
- Toyota Yaris: CrashSolver (21.77 mm) and GeoTransolver (21.77 mm) report identical RMSE and are statistically tied; both outperform FIGConvUNet and Transolver.
- Chevrolet Silverado: CrashSolver (61.54 mm) outperforms the next-best baseline (GeoTransolver, 79.23 mm) by 17.69 mm, roughly a 22% reduction, and exceeds the remaining transformer and FIGConvUNet baselines by more than 20 mm.
- Significance analysis confirms the leading position of CrashSolver, particularly on complex scenarios.

Comprehensive ablation studies show that local component geometric encoding and part-aware conditioning are the most impactful architectural elements for surrogate accuracy, especially in heterogeneous, high-resolution crash simulations.

Component-Scale Surrogates and Tabular Modeling

On the bumper-beam dataset, classical tabular surrogates (CatBoost, LightGBM, XGBoost) outperform both linear and standard MLP baselines for scalar crashworthiness metrics, with $R^2$ values up to 0.82 on held-out test sets, demonstrating the value of nonlinear feature interaction modeling for even moderately sized engineering parameter spaces.
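As a reminder of what the reported $R^2$ measures, the sketch below implements the standard coefficient of determination on a toy example. The numbers are made up for illustration and are not from the bumper-beam benchmark.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy scalar targets and predictions (illustrative only).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
score = r2_score(y_true, y_pred)          # 1 - 0.5 / 5.0
baseline = r2_score(y_true, [2.5] * 4)    # predicting the mean
```

A model that always predicts the mean of the targets scores 0, so an $R^2$ of 0.82 means the surrogate explains 82% of the held-out variance in the scalar metric.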

Implications and Future Directions

CarCrashNet meets a critical need for open, reproducible, and physically validated datasets in structural crashworthiness ML, enabling benchmarking, method development, and trustworthy surrogate deployment. The multi-level structure, from controlled component-scale benchmarks to full-vehicle field targets, supports surrogate modeling at all relevant engineering scales.

The CrashSolver results reinforce the importance of explicit structural decomposition and hierarchy in mesh field prediction for mechanical systems, suggesting a promising direction for foundation models in structural mechanics and for physics-informed neural surrogates in safety-critical virtual testing.

Theoretical Implications: The approach highlights the need for architecture-physics alignment in scientific ML for solid mechanics. The explicit modeling of parts and interfaces, together with transformer-based long-range attention and temporal rollouts, provides both accuracy and flexibility, with potential for transfer learning, cross-vehicle adaptation, and inverse design.

Practical Implications: Rigorous validation against industrial and experimental gold standards positions CarCrashNet and methodologically aligned surrogates as plausible components for real-world virtual homologation workflows, AI-driven design of crashworthy structures, and acceleration of design-test cycles in regulated engineering contexts.

Future Research Directions:

Extension to additional impact modes (side, offset, oblique), boundary conditions, and full vehicle classes beyond the current three.
Incorporation of uncertainty quantification, interpretability, and physically motivated error metrics.
Development of scalable foundation models integrating crash, durability, and multi-physics simulation tasks.
Deployment in closed-loop design optimization and virtual homologation with rapid ML-in-the-loop feedback.

Conclusion

CarCrashNet (2605.07098) establishes an open dataset and neural surrogate architecture for data-driven structural crash simulation, emphasizing physical validation, diversity, and mesh-based learning. CrashSolver's hierarchical transformer demonstrates state-of-the-art accuracy for full-field prediction on high-fidelity vehicle-scale tasks, especially as structural complexity grows. This work lowers barriers for reproducible research and methodological advancement in ML-accelerated structural mechanics, and is positioned to inform future efforts in trustworthy AI and simulation-driven engineering design.
