GeoSURGE: Surrogates for Hydrodynamics & Geo-Localization
- GeoSURGE is a set of advanced surrogate models that blend physics-informed simulations with machine learning to deliver real-time predictions in hydrodynamics, coastal hazards, and geo-localization.
- It employs a geometry-aware neural surrogate using SDFs and multi-layer perceptrons to accurately predict per-surface forces for vehicle hydrodynamics.
- The framework also integrates large-scale coupled coastal models and contrastive learning for visual geo-localization, achieving state-of-the-art performance on benchmark geospatial tasks.
GeoSURGE denotes several advanced computational and machine learning frameworks addressing diverse challenges in hydrodynamics modeling, global coastal hazard assessment, and visual geo-localization. Below, the key GeoSURGE systems are described in technical depth, with attention to their mathematical foundations, architectures, datasets, evaluation metrics, and domain-specific impacts.
1. Geometry-Aware Surrogate for Real-Time Vehicle Hydrodynamics
GeoSURGE, as introduced in the context of hydrodynamics estimation for amphibious autonomous ground vehicles, is a neural surrogate model that provides geometry-resolved force predictions in real time (Waheed et al., 18 May 2026). The model bridges the gap between high-fidelity but computationally expensive CFD and the need for fast simulation and planning in robotics.
Key Formulation
The model operates on per-surface features extracted from a watertight vehicle mesh within an axis-aligned bounding box. It leverages a vehicle-specific signed distance field (SDF), which gives the signed distance from any query point to the hull surface and is negative inside the hull and positive outside.
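To make the sign convention concrete, the sketch below evaluates an analytic signed distance to an axis-aligned box. The box is only a stand-in for a hull: the function name `box_sdf` and the dimensions are illustrative assumptions, and the source derives its SDF from the vehicle mesh rather than from a closed-form primitive.

```python
import math

def box_sdf(point, center, half_extents):
    """Signed distance from `point` to an axis-aligned box.

    Negative inside the box, zero on its surface, positive outside.
    The box is a hypothetical stand-in for a watertight hull.
    """
    # Per-axis offset from the box faces (negative when inside along that axis).
    q = [abs(p - c) - h for p, c, h in zip(point, center, half_extents)]
    # Distance from outside: only the positive offsets contribute.
    outside = math.sqrt(sum(max(qi, 0.0) ** 2 for qi in q))
    # Depth below the nearest face when inside: the largest (least negative) offset.
    inside = min(max(q), 0.0)
    return outside + inside

hull_center = (0.0, 0.0, 0.0)
hull_half_extents = (0.5, 0.3, 0.2)  # metres, illustrative only

d_inside = box_sdf((0.0, 0.0, 0.0), hull_center, hull_half_extents)
d_outside = box_sdf((1.0, 0.0, 0.0), hull_center, hull_half_extents)
```

Sampling such a field at patch centroids is one plausible way to obtain geometry-aware inputs, but how the source queries its SDF is not specified here.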
The mesh is partitioned into surface patches. For each patch, per-surface submergence metrics and a projected patch area are computed.
All features are normalized to reference vehicle dimensions.
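Per-patch feature vectors, concatenating global and surface attributes, are mapped through a shared two-layer MLP with ReLU activation to predict per-surface force components in normalized units. The physical per-surface force $\mathbf{F}_s$ is recovered by de-normalizing the network output, and the net force aggregates the per-surface contributions, $\mathbf{F}_{\text{net}} = \sum_s \mathbf{F}_s$.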
The loss combines an MSE/relative-error term on normalized force, a total-force consistency regularizer, and a penalty for non-physical activation on dry patches.
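A minimal sketch of such a three-term loss is given below. The exact functional form and weighting used in the source are not stated, so the plain MSE data term, the squared mismatch of summed forces, the squared-output dry-patch penalty, and the weights `w_total` and `w_dry` are all assumptions made for illustration.

```python
import numpy as np

def composite_loss(pred, target, wet_mask, w_total=0.1, w_dry=0.1):
    """Sketch of a three-term loss over per-surface normalized forces.

    pred, target: arrays of shape (num_patches, 3).
    wet_mask: boolean array of shape (num_patches,), True for submerged patches.
    w_total, w_dry: hypothetical weights, not taken from the source.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    wet_mask = np.asarray(wet_mask, dtype=bool)

    # Data term: mean squared error on the normalized force components.
    data_term = np.mean((pred - target) ** 2)

    # Total-force consistency: summed prediction should match summed target.
    total_term = np.mean((pred.sum(axis=0) - target.sum(axis=0)) ** 2)

    # Dry-patch penalty: any force predicted on a non-submerged patch is penalized.
    dry = ~wet_mask
    dry_term = np.mean(pred[dry] ** 2) if dry.any() else 0.0

    return float(data_term + w_total * total_term + w_dry * dry_term)

target = np.array([[1.0, 0.0, 2.0], [0.0, 0.0, 0.0]])
wet = np.array([True, False])
spurious = target.copy()
spurious[1] = [1.0, 0.0, 0.0]  # force predicted on a dry patch

loss_exact = composite_loss(target, target, wet)
loss_spurious = composite_loss(spurious, target, wet)
```

In this toy case the spurious dry-patch force is penalized three times over (data, total-force, and dry terms), which is the kind of behaviour the penalty is meant to encourage.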
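Evaluation employs the symmetric Mean Absolute Percentage Error (sMAPE), which in its standard form is $\mathrm{sMAPE} = \frac{100\%}{N}\sum_{i=1}^{N} \frac{|\hat{F}_i - F_i|}{(|\hat{F}_i| + |F_i|)/2}$, with force denominators bounded below by 1 N to prevent inflation.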
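The effect of the 1 N bound is easiest to see numerically. The sketch below implements a standard sMAPE and clamps the denominator (the mean of the two magnitudes) to at least 1 N; where exactly the source applies its bound is an assumption, as are the function and parameter names.

```python
def smape(predicted, reference, floor_newtons=1.0):
    """Symmetric MAPE in percent, with the denominator bounded below.

    The denominator is the mean of |predicted| and |reference|, clamped to
    at least `floor_newtons`. The placement of the clamp is an assumption.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, r in zip(predicted, reference):
        denom = max((abs(p) + abs(r)) / 2.0, floor_newtons)
        total += abs(p - r) / denom
    return 100.0 * total / len(predicted)

# A large load with a 10 N error, and a tiny load whose sign is wrong.
large_load = smape([110.0], [100.0])
tiny_load = smape([0.2], [-0.2])
tiny_load_unbounded = smape([0.2], [-0.2], floor_newtons=0.0)
```

Without the bound, a 0.4 N disagreement on a near-zero load saturates at 200%; with it, the same disagreement contributes 40%. Small, sign-changing loads therefore remain the hardest case for this metric even with the bound in place.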
2. Training Data, Preprocessing, and Architecture
Training utilizes high-fidelity CFD data generated for two vehicles (Husky A200, Warthog) in shallow-water domains, systematically varying velocity, yaw angle, fluid density, and water depth. Each simulation runs 4 s (discarding transients), with 20 samples per case, and symmetry augmentation increases the dataset size fourfold.
Global features include velocity components and dimensionless numbers, alongside normalized geometry. Per-surface static features (type, centroid, normal, area) and dynamic features (submergence metrics, projected area) are combined. Targets are force components normalized by training-set moments.
The neural architecture is an MLP with two hidden layers (width 256, ReLU), applied per surface, outputting force components in normalized units. Optimization uses Adam (learning rate halved on plateau), batch size 16, over 1000 epochs. Weight sharing facilitates generalization across different hulls.
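The per-surface architecture can be sketched in a few lines of NumPy. This is an untrained forward pass under stated assumptions, not the authors' implementation: the three-component output, the feature dimension, the He-style initialization, and de-normalization by a mean and standard deviation are all illustrative choices.

```python
import numpy as np

def init_mlp(rng, in_dim, hidden=256, out_dim=3):
    """Random weights for an MLP with two hidden layers (He-style scaling)."""
    sizes = [in_dim, hidden, hidden, out_dim]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def mlp_forward(params, x):
    """Apply the same weights to every row of x (one row per surface patch)."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return x

def net_force(params, patch_features, force_mean, force_std):
    """De-normalize per-surface outputs and sum them into a net force."""
    normalized = mlp_forward(params, patch_features)    # (num_patches, 3)
    per_surface = normalized * force_std + force_mean   # physical units
    return per_surface, per_surface.sum(axis=0)

rng = np.random.default_rng(0)
num_patches, feature_dim = 12, 20  # illustrative sizes, not from the source
params = init_mlp(rng, feature_dim)
features = rng.normal(size=(num_patches, feature_dim))
per_surface, total = net_force(params, features,
                               force_mean=np.zeros(3), force_std=np.ones(3))
```

Because the weights are shared and the net force is a plain sum, the result does not depend on patch ordering or patch count, which is what lets one network serve hulls with different surface decompositions.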
3. Quantitative Performance and Computational Characteristics
On held-out CFD, GeoSURGE reaches longitudinal-force sMAPE ≈13% for both platforms. Vertical-force sMAPE is 3% (Husky) and 12% (Warthog); the higher lateral-force error (≈55%) arises from sign-changing, low-magnitude loads. End-to-end inference achieves median latency ≈0.83 ms (CPU), with <0.90 ms at the 95th percentile, supporting planning/control rates above 1 kHz. Thus, GeoSURGE enables physically faithful, per-surface hydrodynamics to be embedded into real-time vehicle simulation and control loops.
4. Real-World Physical Validation
GeoSURGE is validated through wading experiments with a full-scale Warthog platform in channelized water at depths of 4, 8, and 10 in. OptiTrack motion capture provides kinematics (111 Hz), and the water level is inferred from chassis altitude. The data cover 57 planar segments (speeds 0.52–3.78 m/s, constant depth).
Model predictions recover fundamental hydrodynamic scaling laws observed in experiment:
- Drag–speed relation: the predicted drag follows a consistent speed-scaling law at all depths. The fitted drag constant increases with submersion, matching expected geometry-induced effects.
- Buoyancy–depth relation: the predicted buoyant force scales linearly with depth.
Neither relationship is hard-coded; both emerge from the model architecture's summation of local per-patch predictions. This provides empirical confidence in the inductive bias and real-world reliability of the surrogate.
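Checks of this kind can be run as simple regressions on model output. The sketch below fits a power law to drag versus speed and a straight line to buoyancy versus depth. The data are synthetic: the quadratic drag law, the constants 40 and 9000, and the helper names are assumptions chosen for illustration, not values reported in the source.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_power_law(speeds, drags):
    """Fit drag = k * speed**n by linear regression in log-log space."""
    n, log_k = fit_line([math.log(v) for v in speeds],
                        [math.log(f) for f in drags])
    return math.exp(log_k), n

# Synthetic stand-in data (assumed, not measured).
speeds = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5]      # m/s
drags = [40.0 * v ** 2 for v in speeds]       # N, assumed quadratic law
depths = [0.1016, 0.2032, 0.254]              # m (4, 8, and 10 in)
buoyancy = [9000.0 * d for d in depths]       # N, assumed linear law

k, exponent = fit_power_law(speeds, drags)
slope, intercept = fit_line(depths, buoyancy)
```

Applied to surrogate predictions instead of synthetic data, the recovered exponent and the intercept of the buoyancy fit indicate whether the expected scaling has emerged without being imposed.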
5. GeoSURGE for Global Coastal Hazard Hindcasting
A second GeoSURGE system was developed for global-scale wave and storm surge hindcasting (Mentaschi et al., 2023). It constitutes a coupled hydrodynamic (SCHISM) and spectral wave (WWM-V) modeling system on an unstructured mesh with >650,000 nodes and 2–4 km coastal resolution.
Model Equations and Coupling
SCHISM solves the barotropic (depth-averaged) shallow-water equations for mass continuity and momentum.
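For orientation, a generic depth-averaged shallow-water system is written out below. This is the textbook form, offered as an assumption about the structure of the equations; the forcing and closure terms in the source's SCHISM configuration (for example wave-induced stresses) may differ. Here $\eta$ is the free-surface elevation, $H$ the total water depth, $\mathbf{u}$ the depth-averaged velocity, $f$ the Coriolis parameter, $g$ the gravitational acceleration, $\rho_0$ a reference density, $p_A$ the atmospheric pressure, and $\boldsymbol{\tau}_s$, $\boldsymbol{\tau}_b$ the surface and bottom stresses.

```latex
\begin{aligned}
\frac{\partial \eta}{\partial t} + \nabla \cdot (H \mathbf{u}) &= 0, \\
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u}
  + f\,\hat{\mathbf{k}} \times \mathbf{u}
  &= -g \nabla \eta - \frac{1}{\rho_0} \nabla p_A
  + \frac{\boldsymbol{\tau}_s - \boldsymbol{\tau}_b}{\rho_0 H}.
\end{aligned}
```

The first equation conserves mass; the second balances acceleration and rotation against the surface-slope, pressure, and stress terms through which wind and air-pressure forcing enter.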
WWM-V resolves the wave action balance over 36 frequency bins and 24 directions. The two models are coupled in both directions at synchronized time steps: SCHISM passes barotropic velocities and the free surface to WWM-V (SCHISM → WWM-V), and WWM-V returns wave-induced roughness and radiation stresses (WWM-V → SCHISM).
Mesh and Data
The unstructured mesh adapts resolution from 50 km offshore to 2 km at the coast, generated via OceanMesh2D. ERA5 reanalysis provides meteorological forcing (winds, pressure, ice), and the model hindcast spans 1970–2022.
Outputs stored every 3 hours at each node include sea surface height (SSH; tidal residual), significant wave height ($H_s$), spectral periods, currents, and wave spectral properties.
6. Hindcast Performance and Validation
Skill is quantified against altimeter (SSH, $H_s$), tide-gauge, and buoy observations:
- SSH vs. altimeter: RMSE ≈ 0.079 m, RMSE(%) ≈ 17.5% (offshore), with skill improving at higher latitudes.
- Tide gauges: RMSE(%) = 14.5% (mean); extremes RMSE(%) = 20.1%.
- $H_s$: NRMSE ≈ 16.6% vs. altimeter, ≈ 25.8% vs. buoys (coast).
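The skill scores above are built on the root-mean-square error. The sketch below shows one common way to compute RMSE and a percentage-normalized variant. Dividing by the RMS of the observations is an assumption; the source's exact normalization for RMSE(%) and NRMSE is not given here, and the function names are illustrative.

```python
import math

def rmse(model, obs):
    """Root-mean-square error between paired model and observed values."""
    if len(model) != len(obs) or len(obs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def nrmse_percent(model, obs):
    """RMSE divided by the RMS of the observations, in percent.

    Normalizing by the observation RMS is one common convention and is an
    assumption here; other choices (mean, range) are also in use.
    """
    obs_rms = math.sqrt(sum(o ** 2 for o in obs) / len(obs))
    return 100.0 * rmse(model, obs) / obs_rms

# Toy paired samples (metres), e.g. modelled vs. observed wave height.
observed = [1.0, 2.0, 2.0]
modelled = [1.1, 1.9, 2.2]

error_m = rmse(modelled, observed)
error_pct = nrmse_percent(modelled, observed)
```

Reporting both numbers is useful because an absolute error of a few centimetres can be a small or a large fraction of the signal depending on the local variability.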
Limitations include barotropic (2D) dynamics (baroclinicity, stratification omitted), absent explicit tides, uncoupled coastal wave setup, and under-resolution of tropical cyclones and fine-scale eddies. Users may post-process with global tidal solutions for total water elevation.
Applications comprise compound flooding risk, shoreline change, storm surge defenses, long-term trend studies, and remote-sensing model validation.
7. GeoSURGE for Planet‐Scale Visual Geo-Localization
A third GeoSURGE system addresses visual geo-localization as contrastive alignment between semantically fused visual features and a multi-scale geographic embedding hierarchy (Daruna et al., 1 Oct 2025).
Key Components
- Hierarchical Geographic Embedding: The Earth is recursively partitioned into S2-based geocells, subdivided as needed for sample density, yielding a multi-level hierarchy. Each geocell at every level has its own learned, L2-normalized embedding.
- Semantic Fusion Module: Appearance features from a CLIP ViT backbone are fused with segmentation features from OneFormer using latent cross-attention. RGB tokens and semantic tokens interact through three sequential attention–MLP blocks; the resulting representation is projected for downstream alignment.
- Contrastive Visual–Geography Alignment: For each image, the ground-truth location identifies the corresponding geocell at every level. Each image representation is aligned with its matching geocell embedding as a positive pair, using an InfoNCE loss per level that is summed over images and levels.
- Training: The model is trained on roughly 4M images (MediaEval Placing Tasks 2016) with AdamW (batch size 1024). CLIP’s last transformer block and the fusion module are fine-tuned; OneFormer parameters are frozen.
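The alignment objective described in the list above can be sketched as a per-level InfoNCE loss summed across the hierarchy. The code is a minimal NumPy illustration under assumptions: negatives are taken to be all other geocells at the same level, the temperature of 0.07 is a placeholder, and the function names and toy dimensions are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce(image_emb, cell_emb, positive_idx, temperature=0.07):
    """InfoNCE loss for one hierarchy level, summed over images.

    image_emb: (batch, dim) image features.
    cell_emb: (num_cells, dim) geocell embeddings at this level.
    positive_idx: (batch,) index of each image's ground-truth geocell.
    temperature: hypothetical value, not taken from the source.
    """
    logits = l2_normalize(image_emb) @ l2_normalize(cell_emb).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(positive_idx)), positive_idx].sum())

def hierarchical_loss(image_emb, levels):
    """Sum per-level losses; `levels` is a list of (cell_emb, positive_idx)."""
    return sum(info_nce(image_emb, cells, pos) for cells, pos in levels)

rng = np.random.default_rng(0)
images = rng.normal(size=(4, 16))
# Three toy levels with progressively more geocells (coarse to fine).
levels = [(rng.normal(size=(n, 16)), rng.integers(0, n, size=4))
          for n in (8, 32, 128)]
loss = hierarchical_loss(images, levels)
```

Supervising every level at once means a prediction that misses the finest cell is still rewarded for landing in the correct coarser region, which is one intuition for why hierarchy depth matters.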
Empirical Results
GeoSURGE achieves state-of-the-art results on 22 of 25 benchmarks, including significant gains over GeoCLIP (flat GPS coordinate representations). On IM2GPS, accuracy (%) at the five distance thresholds from Street@1 km to Continent@2500 km is GeoSURGE [27.0, 54.4, 70.0, 84.4, 93.2] vs. GeoCLIP [16.5, 40.9, 54.9, 76.8, 88.6]. Ablations confirm that decreasing hierarchy depth or omitting semantic fusion produces steep declines in accuracy, indicating that both components are necessary.
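Threshold accuracies of this kind are computed from great-circle errors between predicted and true coordinates. The sketch below uses the haversine formula and the conventional IM2GPS thresholds of 1, 25, 200, 750, and 2500 km; the three intermediate values are the usual convention rather than something stated in the text above, and the example coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Conventional thresholds: street, city, region, country, continent.
THRESHOLDS_KM = (1, 25, 200, 750, 2500)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp guards against rounding slightly above 1 near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def accuracy_at_thresholds(predictions, ground_truth, thresholds_km=THRESHOLDS_KM):
    """Percentage of predictions within each distance threshold of the truth."""
    errors = [haversine_km(*p, *g) for p, g in zip(predictions, ground_truth)]
    return [100.0 * sum(e <= t for e in errors) / len(errors)
            for t in thresholds_km]

# Three images taken in central Paris, with predictions of varying quality.
truth = [(48.8566, 2.3522)] * 3
preds = [
    (48.8570, 2.3530),   # a few tens of metres away
    (48.8049, 2.1204),   # Versailles, under 25 km away
    (51.5074, -0.1278),  # London, a few hundred km away
]
acc = accuracy_at_thresholds(preds, truth)
```

Each entry of `acc` corresponds to one threshold, so the list lines up with five-number results such as those quoted for IM2GPS.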
Summary Table: Principal GeoSURGE Systems
| Domain/Problem Area | Core Approach | Notable Metrics/Outcomes |
|---|---|---|
| Vehicle Hydrodynamics Estimation | Geometry-aware per-surface neural surrogate | sMAPE (Fx): ≈13%; CPU latency <1 ms per sample |
| Global Coastal Surge & Waves | Coupled SCHISM+WWM-V, 2–4 km mesh, ERA5 forcing | SSH RMSE: 0.079 m (altimeter) |
| Planet-scale Visual Geo-localization | Hierarchical geo-embeddings + semantic fusion | IM2GPS@1km: 27.0%; 22/25 SOTA |
8. Future Directions and Known Limitations
In hydrodynamics, expansion to additional vehicle archetypes and coupling with more diverse environmental data is required for universal generalizability. For the global hindcast, future work prioritizes tidal–surge interaction, 3D baroclinicity, sub-kilometer coastal refinement, improved tropical cyclone winds, bias correction, and coupling to hydrological and urban-flood models. For geo-localization, there is scope in further leveraging hierarchical semantics, dataset scaling, and domain adaptation.
GeoSURGE, across its instantiations, exemplifies the integration of domain-specific geometry, physics, and hierarchical representation learning, providing computationally viable surrogates and enriched data products across robotics, coastal science, and vision–geospatial alignment contexts (Waheed et al., 18 May 2026, Mentaschi et al., 2023, Daruna et al., 1 Oct 2025).