A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling
Published 9 Jun 2025 in cs.LG and physics.flu-dyn | arXiv:2506.07969v1
Abstract: We consider the problem of modeling high-speed flows using machine learning methods. While most prior studies focus on low-speed fluid flows in which uniform time-stepping is practical, flows approaching and exceeding the speed of sound exhibit sudden changes such as shock waves. In such cases, it is essential to use adaptive time-stepping methods to allow a temporal resolution sufficient to resolve these phenomena while simultaneously balancing computational costs. Here, we propose a two-phase machine learning method, known as ShockCast, to model high-speed flows with adaptive time-stepping. In the first phase, we propose to employ a machine learning model to predict the timestep size. In the second phase, the predicted timestep is used as an input along with the current fluid fields to advance the system state by the predicted timestep. We explore several physically-motivated components for timestep prediction and introduce timestep conditioning strategies inspired by neural ODE and Mixture of Experts. As ShockCast is the first framework for learning high-speed flows, we evaluate our methods by generating two supersonic flow datasets, available at https://huggingface.co/datasets/divelab. Our code is publicly available as part of the AIRS library (https://github.com/divelab/AIRS).
The paper presents ShockCast, a two-phase framework combining a neural CFL model for timestep prediction with a neural solver for advancing fluid states.
It leverages physically-motivated features and innovative conditioning strategies to capture shock wave dynamics and improve simulation fidelity.
Experimental results on coal dust explosion and circular blast datasets demonstrate enhanced accuracy and speed in high-speed flow predictions.
This paper introduces ShockCast, a two-phase deep learning framework designed for modeling high-speed flows, such as those approaching or exceeding the speed of sound, which necessitate adaptive time-stepping to capture phenomena like shock waves efficiently. Traditional numerical solvers for such flows use adaptive time-stepping based on the Courant–Friedrichs–Lewy (CFL) condition, but these methods are not directly applicable to neural solvers that operate on coarsened space-time meshes and often model only a subset of physical variables.
ShockCast Framework
The core of ShockCast is a two-phase approach executed autoregressively during inference:
Neural CFL Phase: A machine learning model, termed the "neural CFL model" (ψ), predicts the appropriate timestep size (Δt) given the current fluid state (u(t)).
Neural Solver Phase: A second machine learning model, the "neural solver" (ϕ), takes the current fluid state (u(t)) and the predicted timestep (Δt) as input to advance the system to the next state (u(t+Δt)).
The inference pipeline (Figure 1 in the paper) alternates between these two phases:
$\hat{\Delta t} = \psi(\hat{u}(t))$
$\hat{u}(t + \hat{\Delta t}) = \phi\big(\hat{u}(t),\, \hat{\Delta t}\big)$
This process repeats until a predefined simulation end time is reached.
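Concretely, the alternating rollout can be sketched in a few lines of PyTorch-style Python; the callables `neural_cfl` and `neural_solver` and their signatures are illustrative stand-ins for the trained models, not the released API:

```python
import torch

def shockcast_rollout(neural_cfl, neural_solver, u0: torch.Tensor, t_end: float):
    """Autoregressive two-phase inference: alternate timestep prediction
    (neural CFL) and state advancement (neural solver) until t_end."""
    u, t = u0, 0.0
    states, times = [u0], [0.0]
    while t < t_end:
        dt = float(neural_cfl(u))      # Phase 1: predict timestep size
        u = neural_solver(u, dt)       # Phase 2: advance state by dt
        t += dt
        states.append(u)
        times.append(t)
    return states, times
```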
Phase 1: Neural CFL Model Implementation
The neural CFL model is trained to emulate the timestep selection process of the classical solver used for data generation, but on the coarsened grids used by the neural solver. Its design incorporates physically-motivated features:
Input Features:
Current flow state u(t).
Spatial gradients ∇u(t) computed using finite differences, as adaptive timesteps depend on gradient sharpness.
Optional "CFL features": local wave speed λ(x,y), velocity magnitudes ∣u(x,y)∣ and ∣v(x,y)∣, and local sound speed a(x,y). These are derived from the classical CFL condition:
Spatial Downsampling: Max pooling is explored as the spatial downsampling function, motivated by the CFL condition's dependence on the maximum wave speed $\lambda_{\max}$ across the domain (these inputs are sketched below, after the training objective).
Training: The model ψ is trained to minimize a loss function (e.g., Mean Absolute Error - MAE) between its predicted timestep ψ(uj) and the actual timestep Δj=tj+1−tj from the training data.
$\mathcal{L}_c = \mathbb{E}_{j \sim \mathcal{D},\, u \sim \mathcal{U}}\big[\mathcal{L}_c\big(\psi(u_j),\, \Delta_j\big)\big]$
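A minimal sketch of the gradient and CFL-feature inputs and the max-pooling downsampling described above, assuming 2D fields of shape (batch, H, W), an ideal-gas sound speed, and periodic-wrap finite differences; the constants and helper names are illustrative:

```python
import torch
import torch.nn.functional as F

def spatial_gradients(field: torch.Tensor, dx: float):
    """Central finite differences (periodic wrap for simplicity)."""
    gx = (torch.roll(field, -1, dims=-1) - torch.roll(field, 1, dims=-1)) / (2 * dx)
    gy = (torch.roll(field, -1, dims=-2) - torch.roll(field, 1, dims=-2)) / (2 * dx)
    return gx, gy

def cfl_features(u: torch.Tensor, v: torch.Tensor, T: torch.Tensor,
                 gamma: float = 1.4, R: float = 287.0):
    """Local wave speed, velocity magnitudes, and sound speed, stacked as
    channels; sound speed assumes an ideal gas, a = sqrt(gamma * R * T)."""
    a = torch.sqrt(gamma * R * T)          # local sound speed a(x, y)
    lam = torch.sqrt(u**2 + v**2) + a      # local wave speed lambda(x, y)
    return torch.stack([lam, u.abs(), v.abs(), a], dim=1)

def downsample(features: torch.Tensor, factor: int = 2):
    """Max pooling preserves the maxima that the CFL condition depends on."""
    return F.max_pool2d(features, kernel_size=factor)
```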
Phase 2: Timestep Conditioning for Neural Solvers
The neural solver ϕ must effectively incorporate the variable timestep Δt. The paper explores several conditioning strategies:
Time-Conditioned Layer Norm: Originally from diffusion models, this method embeds Δt into scale (β) and shift (γ) vectors. These are applied after each Layer Normalization (LN) layer:
$h_{\text{out}} = \mathrm{LN}(h_{\text{in}})\,\big(1 + \beta(\Delta t)\big) + \gamma(\Delta t)$
This is the "Base" conditioning for U-Net, CNO, and Transolver in the experiments.
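A minimal PyTorch sketch of this conditioning; the embedding MLP's width and activation are assumptions:

```python
import torch
import torch.nn as nn

class TimeConditionedLayerNorm(nn.Module):
    """h_out = LN(h_in) * (1 + beta(dt)) + gamma(dt), with beta and gamma
    predicted from the timestep by a small MLP (architecture assumed)."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.embed = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, 2 * dim))

    def forward(self, h: torch.Tensor, dt: torch.Tensor) -> torch.Tensor:
        beta, gamma = self.embed(dt[:, None]).chunk(2, dim=-1)  # (B, dim) each
        while beta.dim() < h.dim():        # broadcast over spatial/token axes
            beta, gamma = beta.unsqueeze(1), gamma.unsqueeze(1)
        return self.norm(h) * (1 + beta) + gamma
```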
Spatial-Spectral Conditioning: For Fourier-based models like FNO that might not use layer normalization, conditioning occurs in the frequency domain. The Fourier transform of the feature map F(h) is point-wise multiplied by a complex-valued embedding ξ of Δt:
$\mathcal{F}(h)_{\text{out}} = \mathcal{F}(h)_{\text{in}} \odot \xi(\Delta t)$
ξ(Δt) has different entries for each frequency and is shared across channels for parameter efficiency. This is the "Base" conditioning for F-FNO.
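A sketch of this spectral conditioning, assuming 2D feature maps of shape (batch, channels, H, W) and a real-valued MLP that emits the real and imaginary parts of ξ(Δt) for each retained mode:

```python
import torch
import torch.nn as nn

class SpectralConditioning(nn.Module):
    """Point-wise multiply the low-frequency Fourier modes of the feature
    map by a complex embedding of dt, shared across channels."""
    def __init__(self, h_modes: int, w_modes: int, hidden: int = 128):
        super().__init__()
        self.h_modes, self.w_modes = h_modes, w_modes
        self.embed = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(),
            nn.Linear(hidden, 2 * h_modes * w_modes))

    def forward(self, h: torch.Tensor, dt: torch.Tensor) -> torch.Tensor:
        B, _, H, W = h.shape
        xi = self.embed(dt[:, None]).view(B, 1, self.h_modes, self.w_modes, 2)
        xi = torch.view_as_complex(xi.contiguous())   # (B, 1, hm, wm)
        h_hat = torch.fft.rfft2(h)                    # (B, C, H, W//2 + 1)
        out = h_hat.clone()
        out[..., :self.h_modes, :self.w_modes] = (
            h_hat[..., :self.h_modes, :self.w_modes] * xi)   # shared over C
        return torch.fft.irfft2(out, s=(H, W))
```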
Euler Residuals: Inspired by the connection between residual connections and Euler integration, $u(t + \Delta t) \approx u(t) + \Delta t\, \partial_t u(t)$, this strategy scales the output of a residual block $F_l(h_l)$ by an affine transformation of Δt:
$h_{l+1} = h_l + \alpha(\Delta t)\, F_l(h_l)$
where $\alpha(\Delta t) = w\, \Delta t + b$ and $w, b$ are learnable parameters.
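A sketch of an Euler-residual block, where `inner` stands in for any residual branch $F_l$:

```python
import torch
import torch.nn as nn

class EulerResidualBlock(nn.Module):
    """h_{l+1} = h_l + alpha(dt) * F_l(h_l), with alpha(dt) = w * dt + b;
    w and b are learnable scalars, echoing an explicit Euler step."""
    def __init__(self, inner: nn.Module):
        super().__init__()
        self.inner = inner
        self.w = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, h: torch.Tensor, dt: torch.Tensor) -> torch.Tensor:
        # Reshape dt from (B,) to (B, 1, 1, ...) so alpha broadcasts over h.
        alpha = self.w * dt.view(-1, *([1] * (h.dim() - 1))) + self.b
        return h + alpha * self.inner(h)
```

For example, `EulerResidualBlock(nn.Conv2d(16, 16, 3, padding=1))` wraps a convolutional branch (the branch architecture is an assumption).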
Mixture of Experts (MoE): The gating network Gl in an MoE layer is conditioned only on the timestep Δt. The outputs of the K experts Fl,k are weighted by the gate and scaled by an expert-specific affine transformation of Δt:
$h_{l+1} = h_l + \sum_{k=1}^{K} G_l(\Delta t)_k\, \big(\alpha_k(\Delta t)\, F_{l,k}(h_l)\big)$
This allows different experts to specialize in different integration period lengths (e.g., short steps for sharp gradients, long steps for smooth dynamics).
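A dense (all-experts-evaluated) sketch of the timestep-conditioned MoE layer; the expert modules themselves are assumed:

```python
import torch
import torch.nn as nn

class TimestepMoE(nn.Module):
    """h_{l+1} = h_l + sum_k G(dt)_k * alpha_k(dt) * F_k(h_l), where the
    gate G depends only on dt and alpha_k(dt) = w_k * dt + b_k."""
    def __init__(self, experts: nn.ModuleList):
        super().__init__()
        self.experts = experts
        K = len(experts)
        self.gate = nn.Linear(1, K)            # gating network G_l
        self.w = nn.Parameter(torch.ones(K))   # per-expert slopes
        self.b = nn.Parameter(torch.zeros(K))  # per-expert offsets

    def forward(self, h: torch.Tensor, dt: torch.Tensor) -> torch.Tensor:
        g = torch.softmax(self.gate(dt[:, None]), dim=-1)   # (B, K)
        shape = (-1, *([1] * (h.dim() - 1)))                # broadcast shape
        out = h
        for k, expert in enumerate(self.experts):
            alpha_k = (self.w[k] * dt + self.b[k]).view(shape)
            out = out + g[:, k].view(shape) * alpha_k * expert(h)
        return out
```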
The neural solver ϕ is trained to minimize a one-step prediction error (e.g., relative error averaged over fields):
$\mathcal{L}_s = \mathbb{E}_{j \sim \mathcal{D},\, u \sim \mathcal{U}}\big[\mathcal{L}_s\big(\phi(u_j, \Delta_j),\, u_{j+1}\big)\big]$
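One plausible instantiation of the per-field relative error, assuming predictions of shape (batch, fields, H, W); the exact loss used in the paper may differ:

```python
import torch

def relative_l2(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Relative L2 error per field, averaged over fields and batch."""
    err = torch.linalg.vector_norm(pred - target, dim=(-2, -1))
    ref = torch.linalg.vector_norm(target, dim=(-2, -1)).clamp_min(1e-12)
    return (err / ref).mean()
```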
Datasets and Experiments
Two new datasets of supersonic flows were generated for evaluation using the HyBurn CFD code, which employs adaptive mesh refinement (AMR):
Coal Dust Explosion: A multiphase problem simulating a shock wave interacting with a layer of coal dust in a channel. Parameters varied include initial shock Mach number (1.2 to 2.1) and particle diameter. Models predict gas velocity, gas temperature, and coal dust volume fraction. The training data is coarsened 500× in time relative to the CFD solver and uses a 104×104 spatial grid.
Circular Blast: A 2D version of Sod's shock tube problem, with a circular high-pressure region expanding outwards. The initial pressure ratio is varied. Models predict velocity, temperature, and density fields. Data is coarsened 100× in time and uses a 128×128 spatial grid (downsampled from 256×256).
Neural Solver Architectures: U-Net, F-FNO, CNO, and Transolver (attention-based with learnable soft pooling) are evaluated as neural solver backbones.
Metrics:
One-step MAE for Neural CFL: mean absolute error of the predicted timestep Δt.
Correlation Time Proportion: The fraction of simulation time for which the Pearson correlation between prediction and ground truth stays above 0.9 (see the sketch after this list).
Mean Flow Relative Error: Error in time-averaged flow fields.
Turbulence Kinetic Energy (TKE) Relative Error: Error in the TKE field, calculated from velocity fluctuations.
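The two rollout metrics above can be computed roughly as follows, assuming rollouts of shape (steps, ...) on matching time grids, with `times` a 1D tensor of physical times; shapes and the numerical guard are assumptions:

```python
import torch

def correlation_time_proportion(pred, true, times, threshold=0.9):
    """Fraction of simulated time before the Pearson correlation between
    prediction and ground truth first drops below the threshold."""
    p = pred.flatten(1) - pred.flatten(1).mean(1, keepdim=True)
    q = true.flatten(1) - true.flatten(1).mean(1, keepdim=True)
    corr = (p * q).sum(1) / (p.norm(dim=1) * q.norm(dim=1)).clamp_min(1e-12)
    below = (corr < threshold).nonzero()
    t_decorr = times[below[0, 0]] if len(below) > 0 else times[-1]
    return (t_decorr - times[0]) / (times[-1] - times[0])

def tke(u, v):
    """Turbulence kinetic energy from velocity fluctuations about the
    time mean: k = 0.5 * mean_t(u'^2 + v'^2), for 2D velocity rollouts."""
    u_p = u - u.mean(0, keepdim=True)
    v_p = v - v.mean(0, keepdim=True)
    return 0.5 * (u_p**2 + v_p**2).mean(0)
```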
Results:
Neural CFL:
For the simpler circular blast (single-phase), the base ConvNeXt model performed best.
For the complex multiphase coal dust explosion, adding spatial gradients, CFL features, and using max-pooling significantly improved Δt prediction accuracy. The best neural CFL model closely matched ground truth Δt in autoregressive rollouts.
ShockCast (End-to-End):
Correlation Time: U-Net with time-conditioned layer norm (Base) achieved the best results on both datasets.
TKE Error:
Coal Dust: U-Net with MoE and Euler conditioning performed best.
Circular Blast: F-FNO with Euler and MoE conditioning performed best.
Mean Flow Error:
Coal Dust: U-Net with time-conditioned layer norm (Base) performed best.
Circular Blast: F-FNO with MoE conditioning performed best.
Visualizations (e.g., Figure 2 for circular blast density) show ShockCast capturing key flow features like shock propagation and reflection.
Implementation Considerations:
Data Coarsening: Neural solvers achieve speedup by learning on significantly coarsened temporal and spatial grids compared to classical solvers. The paper uses coarsening factors of 100× to 500× in time.
Computational Resources: Training was performed on A100 or RTX 2080 GPUs. Model parameter counts, FLOPs, and peak training memory are provided in Appendix Table 2, ranging from ~10M to ~174M parameters for neural solvers.
Training: Adam optimizer with cosine learning rate scheduling. Neural solvers were trained for 400 epochs, and neural CFL models for 800 epochs with noise injection (a minimal setup sketch follows this list).
Code Availability: The code is part of the AIRS library, and datasets are on Hugging Face.
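A minimal training-setup sketch matching the stated recipe; the learning rate, noise scale, and stand-in model are assumptions:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(4, 4, kernel_size=3, padding=1)   # stand-in network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr assumed
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=400)

for epoch in range(400):      # 400 epochs for neural solvers (800 for CFL)
    # ... one optimization pass over the training set goes here ...
    # Neural CFL training additionally perturbs inputs with noise, e.g.:
    # u_noisy = u + noise_std * torch.randn_like(u)   # noise_std assumed
    scheduler.step()
```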
Conclusion:
ShockCast is presented as the first machine learning framework for adaptive time-stepping in high-speed flow modeling. It effectively learns to predict appropriate timesteps and evolve fluid fields. The physically-motivated components in the neural CFL model and the novel timestep conditioning strategies for the neural solver contribute to its performance. The work is a step towards accelerating computationally intensive high-speed flow simulations using neural networks. Potential future work includes learning timestep adaptation policies that balance accuracy and cost, rather than just mimicking classical solver timesteps.