SuperWing: Transonic Wing Simulation Dataset
- SuperWing is a comprehensive collection of high-fidelity RANS simulations for parameterized 3D transonic wing geometries.
- It employs advanced geometric parameterization and structured mesh generation to cover an extensive design space with realistic flow conditions.
- Benchmarking shows that deep learning surrogate models such as ViT predict drag to within about 2.5 counts and transfer to unseen wings such as the CRM and DLR-F6.
The SuperWing dataset is a comprehensive, large-scale collection of high-fidelity Reynolds-Averaged Navier–Stokes (RANS) solutions for parameterized three-dimensional transonic wing geometries. Developed to fill the longstanding gap in diverse, open-access aerodynamic datasets, SuperWing addresses the challenge of providing a sufficiently broad pre-training corpus for modern deep learning–based aerodynamic surrogate models. It comprises 4,239 uniquely parameterized, “from-scratch” kinked wing shapes with 28,856 converged flow solutions covering a physically realistic transonic envelope, and is openly released under a permissive license to facilitate research in aerodynamic prediction, active design, and transfer learning (Yang et al., 20 Apr 2026, Yang et al., 16 Dec 2025).
1. Geometric Parameterization and Diversity
SuperWing employs an expressive geometric parameterization to cover a wide range of planform and sectional variations characteristic of modern swept wings. Each wing is generated by sweeping a single baseline airfoil—parameterized using a 9th-order Class–Shape Transformation (CST) series for both upper and lower surfaces—along a kinked planform and modulating spanwise section properties via low-order splines.
- Global planform parameters: Five continuously sampled scalars define the leading-edge sweep angle, aspect ratio, taper ratio, break-span fraction, and root chord adjustment.
- Sectional properties: The baseline airfoil is described by 20 parameters: two sequences of 9 CST coefficients (upper and lower surfaces) plus 2 additional sectional parameters, the thickness ratio and camber ratio, for the root and control points.
- Spanwise variations: Maximum airfoil thickness, camber, dihedral, and twist distributions are encoded as cubic B-splines with 5 control points each, sampled independently to encourage nontrivial variations.
- Total degrees of freedom: 37–38 per wing; all parameter values are provided as part of per-case metadata (Yang et al., 16 Dec 2025, Yang et al., 20 Apr 2026).
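The CST airfoil description above can be illustrated with a minimal sketch. It assumes the standard Kulfan form (class function with exponents 0.5 and 1.0, Bernstein-polynomial shape function); the function name `cst_surface` and the coefficient values are hypothetical and are not taken from the SuperWing code or data.

```python
import numpy as np
from math import comb

def cst_surface(x, coeffs, n1=0.5, n2=1.0):
    """Evaluate one CST surface y(x) at chordwise stations x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    n = len(coeffs) - 1                       # Bernstein polynomial degree
    class_fn = x**n1 * (1.0 - x)**n2          # round nose, sharp trailing edge
    shape_fn = sum(a * comb(n, i) * x**i * (1.0 - x)**(n - i)
                   for i, a in enumerate(coeffs))
    return class_fn * shape_fn

x = np.linspace(0.0, 1.0, 101)
upper = cst_surface(x, [0.17] * 9)            # illustrative coefficients only
lower = -cst_surface(x, [0.12] * 9)           # lower surface sits below the chord
thickness = upper - lower
print(f"max thickness/chord = {thickness.max():.4f} at x = {x[thickness.argmax()]:.2f}")
```

In a wing generator of this kind, the resulting section would then be scaled, twisted, and translated along the span according to the planform and spline parameters.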
Principal-component analysis of geometric representations shows that 99% of the wing-shape variance is captured by the first 5 modes and 99.9% by the first 11 modes, indicating the intrinsic dimensionality of the sampled design space is moderate despite high explicit DOF (Yang et al., 20 Apr 2026).
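The mode counts quoted here come from a cumulative explained-variance curve. The sketch below shows one common way to compute such a count with an SVD; the synthetic `shapes` matrix (three latent factors plus small noise) is an assumption standing in for flattened wing geometries, and `modes_for_variance` is a hypothetical helper name.

```python
import numpy as np

def modes_for_variance(shapes, threshold):
    """Smallest number of principal modes whose cumulative variance reaches threshold."""
    centered = shapes - shapes.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    # searchsorted returns the first index where the cumulative sum >= threshold
    return int(np.searchsorted(np.cumsum(explained), threshold) + 1)

# Synthetic stand-in: 200 samples, 50 features, driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
shapes = latent @ mixing + 1e-3 * rng.normal(size=(200, 50))

k99 = modes_for_variance(shapes, 0.99)
print("modes needed for 99% variance:", k99)
```

Applied to the real geometry arrays, the same function would be called with thresholds 0.99 and 0.999 to reproduce the kind of statistic reported above.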
2. Simulation Workflow and Flow Conditions
Each geometry is independently evaluated at eight operating points, with Mach number and angle of attack each drawn uniformly from a fixed range, at a fixed Reynolds number and freestream temperature (Yang et al., 16 Dec 2025, Yang et al., 20 Apr 2026).
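A minimal sketch of this sampling scheme follows. The Mach and angle-of-attack bounds passed in are placeholders chosen for illustration, not the dataset's actual ranges, and `sample_operating_points` is a hypothetical name. The closing arithmetic uses only the counts stated in the introduction.

```python
import numpy as np

def sample_operating_points(rng, mach_range, alpha_range, n_points=8):
    """Draw n_points (Mach, alpha) pairs independently and uniformly."""
    mach = rng.uniform(*mach_range, size=n_points)
    alpha = rng.uniform(*alpha_range, size=n_points)
    return np.column_stack([mach, alpha])

rng = np.random.default_rng(seed=0)
# Placeholder bounds: assumptions for this example only.
points = sample_operating_points(rng, mach_range=(0.70, 0.90), alpha_range=(1.0, 5.0))
print(points.shape)

# Counts from the text: 4,239 wings, 8 points each, 28,856 converged solutions.
attempted = 4239 * 8
converged_fraction = 28856 / attempted
print(f"{attempted} attempted cases, {converged_fraction:.1%} converged")
```

If every wing was indeed run at all eight points, these figures would imply that roughly 85% of attempted cases passed the convergence filter.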
The governing equations are steady-state compressible RANS, discretized and solved using the ADflow finite-volume solver:
- Mesh generation: Structured multi-block surface grid (chordwise × spanwise cells), extruded to a 3.6 million-cell volume O-grid using pyHyp and MDOLab meshing tools.
- Turbulence model: One-equation Spalart–Allmaras closure.
- Numerical procedure: Three-level geometric multigrid, up to 4,000 cycles, with Newton–Krylov linear solves; convergence required the residual to fall below a prescribed tolerance and the integrated forces to stabilize, with bounded fluctuation over the final 10 iterations.
- Boundary conditions: No-slip wall (wing), far-field characteristic (outflow), centerplane symmetry.
- Mesh convergence: Benchmarked on canonical CRM configurations, with drag differences (in counts) between meshes of 1.3, 3.6, and 8.9 million cells small enough to justify the “medium” mesh resolution for production runs (Yang et al., 16 Dec 2025).
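The force-stability criterion and the drag-count comparison in the list above can be sketched as follows. The fluctuation measure (peak-to-peak over the window, relative to the mean), the tolerance `rel_tol`, and the per-mesh drag values are all assumptions for illustration; only the 10-iteration window, the 4,000-cycle limit, and the three mesh sizes come from the text. One drag count is the usual 1e-4 in drag coefficient.

```python
import numpy as np

def forces_converged(history, window=10, rel_tol=1e-5):
    """True if the last `window` values vary by less than rel_tol of their mean."""
    tail = np.asarray(history[-window:], dtype=float)
    if len(tail) < window:
        return False
    fluctuation = (tail.max() - tail.min()) / abs(tail.mean())
    return bool(fluctuation < rel_tol)

def drag_counts(cd):
    """Convert a drag coefficient (or difference) to drag counts."""
    return cd * 1e4

# Synthetic lift history: a decaying oscillation around 0.5.
k = np.arange(4000)
cl_history = 0.5 + 0.1 * np.exp(-0.01 * k) * np.cos(k)
print(forces_converged(cl_history[:100]), forces_converged(cl_history))

# Hypothetical drag coefficients on the three mesh levels (millions of cells).
cd_by_mesh = {1.3: 0.02535, 3.6: 0.02512, 8.9: 0.02505}
for cells in (1.3, 3.6):
    delta = drag_counts(cd_by_mesh[cells] - cd_by_mesh[8.9])
    print(f"{cells}M vs 8.9M cells: {delta:.1f} counts")
```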
3. Output Fields, Metadata, and File Organization
Each flow solution is accompanied by a standardized suite of outputs, provided in directly usable formats for ML research:
- Geometry representation: cell-center array on the structured surface grid (geometry.npy/geom0.npy).
- Surface flow fields:
  - Pressure coefficient,
  - Skin-friction coefficient (surface tangential and spanwise components).
- Integrated aerodynamic coefficients (metadata), normalized by the reference area and dynamic pressure.
- Full parametric provenance: All 37–38 shape variables, operating condition (Mach number, angle of attack), mesh provenance, and derived metrics in structured per-sample metadata (JSON/Configs.dat/index.npy).
All flow and geometry fields are stored in compressed NumPy or PyTorch archives, with raw ADflow CGNS files (surface/volume mesh, full field) made available on request. Dataset size is approximately 38 GB for processed fields, with 35.7 TB of raw solver output retained for advanced usage (Yang et al., 16 Dec 2025, Yang et al., 20 Apr 2026).
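Because the processed fields total tens of gigabytes, memory-mapped loading is one reasonable way to inspect an array without reading it fully. The sketch below writes a small synthetic file so that it runs anywhere; the file name `geometry.npy` comes from the list above, while the array shape and axis order are assumptions, not the dataset's documented layout.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "geometry.npy"
    # Synthetic stand-in with an assumed (samples, xyz, i, j) layout.
    np.save(path, np.zeros((4, 3, 16, 8), dtype=np.float32))

    geometry = np.load(path, mmap_mode="r")   # maps the file instead of reading it
    shape, dtype = geometry.shape, geometry.dtype
    first_sample = np.array(geometry[0])      # copy one sample into memory
    del geometry                              # release the map before cleanup

print(shape, dtype, first_sample.shape)
```

With the real download, `path` would point at the corresponding file in the local copy of the repository.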
4. Access, Licensing, and Utilization
SuperWing is fully open-access and hosted at Hugging Face “yunplus/SuperWing” (https://huggingface.co/datasets/yunplus/SuperWing), with a “download.sh” script for convenient LFS streaming. All data is released under CC BY 4.0, enabling unrestricted academic or commercial use with attribution. Companion code and data loaders are provided at https://github.com/tum-pbs/AeroTransformer.
A standardized file structure ensures straightforward indexing: each sample’s geometry, flow fields, and configuration data are linked and indexed, facilitating data slicing (by sweep, AR, airfoil family, operating point, etc.) and transfer-learning–oriented data grouping (Yang et al., 20 Apr 2026, Yang et al., 16 Dec 2025).
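The slicing and grouping described here can be sketched with boolean masks over a per-sample metadata table. The structured array below is hypothetical: the field names, values, and layout are assumptions, and the real index files may be organized differently. Holding out whole geometries is one possible grouping choice; the source does not state how its splits were drawn.

```python
import numpy as np

# Hypothetical per-sample metadata; field names and values are illustrative only.
meta = np.array(
    [(0, 25.0, 8.5, 0.74), (0, 25.0, 8.5, 0.82),
     (1, 32.0, 9.5, 0.78), (1, 32.0, 9.5, 0.85),
     (2, 35.0, 7.0, 0.80)],
    dtype=[("geometry_id", "i4"), ("sweep_deg", "f4"),
           ("aspect_ratio", "f4"), ("mach", "f4")],
)

# Slice: highly swept wings at the lower end of the Mach range.
mask = (meta["sweep_deg"] >= 30.0) & (meta["mach"] < 0.82)
selected = np.flatnonzero(mask)

# Group: hold out whole geometries so no wing appears in both splits.
rng = np.random.default_rng(0)
geometry_ids = rng.permutation(np.unique(meta["geometry_id"]))
n_test = max(1, round(0.1 * len(geometry_ids)))
is_test = np.isin(meta["geometry_id"], geometry_ids[:n_test])
train_idx, test_idx = np.flatnonzero(~is_test), np.flatnonzero(is_test)

print(selected, train_idx, test_idx)
```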
5. Benchmarking and Impact on Aerodynamic Surrogate Models
SuperWing enables the development, benchmarking, and deployment of generalizable surrogate models, particularly foundation-model–style deep architectures such as Transformers. Key benchmarking results using U-Net, Vision Transformer (ViT), and Transolver architectures for surface-flow prediction were reported using a 90/10 split, cross-validated three times:
| Model | MAE (%) | Drag MAE (counts) | Training Time (h) | Parameters (M) |
|---|---|---|---|---|
| U-Net | 1.10 | 14.78 | 17.1 | 9.2 |
| ViT | 0.33 | 2.48 | 12.6 | 4.5 |
| Transolver | 0.36 | 2.53 | 37.9 | 3.8 |
ViT predicts drag to within about 2.5 counts, roughly a sixfold lower error than U-Net (14.78 counts), while transonic shock and surface skin-friction features are robustly predicted, including for out-of-distribution (zero-shot) CRM and DLR-F6 wings (Yang et al., 16 Dec 2025). Pre-training on SuperWing followed by targeted fine-tuning on CRM-perturbed geometries yields a state-of-the-art 0.36% MAE on surface predictions, reducing error by 84% relative to training from scratch (Yang et al., 20 Apr 2026). This suggests that the diversity of the SuperWing parameterization enables effective foundation surrogate training and domain transfer.
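The arithmetic behind these comparisons is simple enough to check directly. The sketch below defines a drag MAE in counts (one count taken as 1e-4 in drag coefficient), recomputes the U-Net to ViT error ratio from the table, and backs out the from-scratch error implied by an 84% reduction to 0.36%. The function name and the sample predictions are hypothetical; the implied from-scratch value is a derived estimate, not a figure reported in the source.

```python
import numpy as np

def mae_drag_counts(cd_pred, cd_true):
    """Mean absolute drag-coefficient error, expressed in drag counts."""
    diff = np.asarray(cd_pred, dtype=float) - np.asarray(cd_true, dtype=float)
    return float(np.mean(np.abs(diff)) * 1e4)

# Illustrative predictions only.
example_mae = mae_drag_counts([0.0251, 0.0248], [0.0250, 0.0250])

# Drag MAE values (counts) from the benchmark table.
drag_mae = {"U-Net": 14.78, "ViT": 2.48, "Transolver": 2.53}
ratio = drag_mae["U-Net"] / drag_mae["ViT"]

# An 84% reduction ending at 0.36% implies this from-scratch error.
implied_scratch_mae = 0.36 / (1.0 - 0.84)

print(f"example MAE: {example_mae:.2f} counts")
print(f"U-Net / ViT drag error ratio: {ratio:.2f}")
print(f"implied from-scratch MAE: {implied_scratch_mae:.2f}%")
```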
6. Dataset Significance and Research Applications
SuperWing’s design supports multiple research and application domains:
- Surrogate modeling: Its broad geometric and operational diversity allows pre-training of large foundation surrogate models (e.g., AeroTransformer) for three-dimensional aerodynamic prediction with minimal downstream fine-tuning (Yang et al., 20 Apr 2026).
- Active and automated design: The dataset structure, standardized mesh, and explicit provision of parameter metadata enable integration with gradient-based or generative design loops, avoiding the need for remeshing or data wrangling.
- Transfer learning and generalization: Empirical studies show that models trained on SuperWing generalize with high fidelity to complex real-world test cases not seen during training (e.g., DLR-F6 and NASA CRM), validating the dataset’s coverage of practical design spaces (Yang et al., 16 Dec 2025).
- Open science and reproducibility: The full release—with metadata, processed and raw fields, and utility code—supports open, reproducible research and benchmarking.
A plausible implication is that SuperWing represents a foundational resource for three-dimensional aerodynamic surrogate learning, supporting novel methods in foundation-model pre-training, transfer learning, and rapid data-driven optimization.
7. Future Directions and Limitations
Ongoing work aims to extend the range of simulated Reynolds numbers, include additional physics (e.g., buffet or off-design stall phenomena), and increase mesh resolution for higher-fidelity applications. The current scope is limited to a transonic Mach-number range, mid-range angles of attack, and ADflow RANS/Spalart–Allmaras simulations. A plausible implication is that incorporating broader flow-regime coverage and multi-fidelity or LES data may further enhance the utility of SuperWing for next-generation aerodynamic foundation models.
References:
(Yang et al., 20 Apr 2026) "Towards a Foundation-Model Paradigm for Aerodynamic Prediction in Three-dimensional Design"
(Yang et al., 16 Dec 2025) "SuperWing: a comprehensive transonic wing dataset for data-driven aerodynamic design"