General Physics Transformer (GPₕyT)

Updated 18 September 2025
  • GPₕyT is a deep learning architecture that fuses transformer-based neural differentiation with classical numerical integration to simulate a wide range of physical phenomena without explicit equations.
  • The model processes simulation frames using unified spatiotemporal self-attention and incorporates local gradient information to accurately capture sharp features like shock fronts.
  • Benchmark evaluations show up to a 29-fold reduction in mean squared error compared to Fourier Neural Operator baselines, highlighting its zero-shot generalization across multiple physics domains.

The General Physics Transformer (GPₕyT) is a hybrid deep learning architecture that establishes a foundation model paradigm for computational physics. GPₕyT leverages the unified context-aware sequence modeling power of transformers, in combination with classical numerical integration schemes, to simulate a vast range of physical phenomena—including fluid-solid interactions, shock waves, thermal convection, and multi-phase flows—in a zero-shot setting, without explicit knowledge of the governing equations (Wiesner et al., 17 Sep 2025).

1. Architectural Foundations

GPₕyT is constructed as a dual-component system consisting of a transformer-based neural differentiator and an explicit numerical integrator. Physical state sequences, typically fields such as pressure, velocity, or temperature, are ingested as a stack of simulation frames. The model applies a linear transformation over both spatial and temporal axes, yielding non-overlapping tubelet-like patches which are augmented with absolute positional encodings. Stacks of these tokens are processed by transformer blocks with unified spatiotemporal self-attention, producing a tokenized representation suitable for context-aware sequence analysis.
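A minimal PyTorch sketch of this tokenization and attention pipeline is shown below. The patch sizes, field count, and embedding dimension are illustrative assumptions rather than the published configuration, and `TubeletEmbedding` is a hypothetical name:

```python
import torch
import torch.nn as nn

class TubeletEmbedding(nn.Module):
    """Linear projection over non-overlapping spatiotemporal patches."""
    def __init__(self, fields=4, t_patch=2, s_patch=8, dim=256):
        super().__init__()
        # A 3D convolution with stride equal to kernel size is exactly
        # a linear map over non-overlapping tubelet patches.
        self.proj = nn.Conv3d(fields, dim,
                              kernel_size=(t_patch, s_patch, s_patch),
                              stride=(t_patch, s_patch, s_patch))

    def forward(self, x):                          # x: (B, C, T, H, W)
        tokens = self.proj(x)                      # (B, dim, T', H', W')
        return tokens.flatten(2).transpose(1, 2)   # (B, N_tokens, dim)

embed = TubeletEmbedding()
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=6)

frames = torch.randn(1, 4, 4, 64, 64)              # stacked simulation frames
tokens = embed(frames)                             # (1, 128, 256)
# Absolute positional encodings (learned in the real model; zeros here).
pos = torch.zeros(1, tokens.shape[1], 256)
out = encoder(tokens + pos)                        # unified spatiotemporal attention
```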

The model concatenates first-order spatial and temporal derivatives—computed via central differences—along the channel dimension. This explicit inclusion of local gradient information facilitates the accurate handling of sharp solution features, such as shock fronts. The transformer then predicts the time derivative of the physical state, ∂X/∂t, over the prescribed spatial domain, and the subsequent state is computed using a classical time integration scheme, most commonly the Forward Euler method:

X_{t+1} = X_t + \Delta t\, \left. \frac{\partial X}{\partial t} \right|_t

Despite the simplicity of this first-order integrator, empirical ablations indicate no significant advantage in moving to higher-order schemes (e.g., RK4) for the bulk of tasks addressed.
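A hedged sketch of this differentiate-then-integrate step follows. Here `model` stands in for the transformer (any network mapping the derivative-augmented sequence to ∂X/∂t of the latest frame), the function names are hypothetical, and boundaries are treated as periodic purely for brevity:

```python
import torch

def central_diff(x, dim, h=1.0):
    # Second-order central difference; torch.roll wraps around, i.e.
    # periodic boundaries are assumed here purely for brevity.
    return (torch.roll(x, -1, dims=dim) - torch.roll(x, 1, dims=dim)) / (2.0 * h)

def euler_step(model, x_seq, dt):
    # x_seq: (B, C, T, H, W) stack of recent simulation frames
    dx = central_diff(x_seq, dim=-1)               # ∂X/∂x
    dy = central_diff(x_seq, dim=-2)               # ∂X/∂y
    dtau = central_diff(x_seq, dim=2, h=dt)        # ∂X/∂t from the history
    inp = torch.cat([x_seq, dx, dy, dtau], dim=1)  # derivatives on channel axis
    dX_dt = model(inp)                             # predicted ∂X/∂t at time t
    return x_seq[:, :, -1] + dt * dX_dt            # X_{t+1} = X_t + Δt·∂X/∂t
```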

2. Data Regime and Training Strategies

GPₕyT's training data regime spans over 1.8 TB of simulation data extracted from eight diverse public and private datasets, amounting to more than 2.4 million unique snapshots. This corpus includes incompressible and compressible flow, shock dynamics (Euler equations), obstacle flow, Rayleigh–Bénard convection, and multiphase flow through porous media.

Each training example consists of a sequence of consecutive state snapshots—between four and sixteen, depending on the dataset—which serve as the model's “prompt,” conditioning it on the recent trajectory and enabling in-context learning of underlying dynamics. Crucially, the time increments Δt are randomized, compelling the model to infer temporal scales from context rather than relying on fixed step size, and each dataset is individually normalized to accentuate learning of relative, rather than absolute, physical magnitudes.
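The sampling logic might look like the following sketch, where `trajectory` (the stored snapshot sequence) and `stats` (per-dataset normalization statistics) are assumed names and the trajectory is assumed long enough for the chosen window:

```python
import torch

def sample_prompt(trajectory, stats, min_len=4, max_len=16, max_stride=4):
    # trajectory: (T_total, C, H, W) tensor of stored snapshots
    stride = torch.randint(1, max_stride + 1, (1,)).item()    # randomized Δt
    length = torch.randint(min_len, max_len + 1, (1,)).item()
    start = torch.randint(0, trajectory.shape[0] - stride * length, (1,)).item()
    # `length` prompt frames plus one target frame, all `stride` steps apart
    window = trajectory[start : start + stride * (length + 1) : stride]
    # per-dataset normalization: learn relative, not absolute, magnitudes
    window = (window - stats["mean"]) / stats["std"]
    return window[:-1], window[-1], stride          # prompt, target, Δt index
```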

A standard mean squared error loss between the predicted and ground-truth physical state at the next step is employed:

\mathcal{L}_{\text{MSE}} = \mathbb{E}\left[ \left\| X_{t+1}^{\text{(pred)}} - X_{t+1}^{\text{(true)}} \right\|_2^2 \right]
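Put together with the earlier sketches, a single optimization step under this objective might look like the following (again an assumption-laden sketch, not the published training code):

```python
import torch.nn.functional as F

def train_step(model, optimizer, prompt, target, dt):
    # prompt: (B, C, T, H, W); target: (B, C, H, W); euler_step as sketched above
    pred = euler_step(model, prompt, dt)       # predicted X_{t+1}
    loss = F.mse_loss(pred, target)            # L_MSE against ground truth
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```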

3. Unified Multi-Domain Performance and Comparative Analysis

GPₕyT demonstrates the ability to infer and apply governing physical dynamics across markedly heterogeneous physics domains, including:

  • Incompressible shear and obstacle flows (Navier–Stokes)
  • Compressible shock phenomena (Euler equations)
  • Buoyancy-driven convection
  • Multiphase flows (e.g., drainage and imbibition in porous substrates)

When benchmarked on one-step prediction tasks, GPₕyT achieves up to a 5-fold reduction in median MSE compared to standard UNet baselines and up to a 29-fold improvement over Fourier Neural Operator (FNO) models of similar scale. For both smooth and discontinuous systems, GPₕyT maintains sharp interfaces (e.g., shock fronts) and fine-scale coherent structures, exhibiting resilience against over-smoothing and loss of high-frequency detail over long prediction horizons.

4. Generalization, Zero-shot and In-Context Learning

A defining characteristic of GPₕyT is its foundation-model ability for zero-shot generalization. The transformer architecture enables in-context learning: previous state sequences act as prompts, allowing the model to infer system-specific dynamics at inference time without explicit access to governing equations.

Zero-shot experiments demonstrate that GPₕyT:

  • Accurately simulates systems with novel boundary conditions (e.g., open rather than periodic/symmetric)
  • Produces physical rollouts (e.g., bow shock formation) for types of flows (supersonic shock, turbulent radiative layer) absent from the training corpus
  • Maintains global invariances and produces plausible field evolution, despite increases in local error on tasks furthest from the training distribution

Notably, even with error accumulation in high-frequency components over 50-timestep rollouts, global flow structures and vortex coherence are preserved.
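An autoregressive rollout of this kind can be sketched as below, reusing the hypothetical `euler_step` from Section 1; each prediction is appended to the context window, which then becomes the next prompt:

```python
import torch

@torch.no_grad()
def rollout(model, prompt, dt, steps=50):
    # prompt: (B, C, T, H, W) initial context window
    window, frames = prompt, []
    for _ in range(steps):
        nxt = euler_step(model, window, dt)            # (B, C, H, W)
        frames.append(nxt)
        # slide the window: drop the oldest frame, append the newest
        window = torch.cat([window[:, :, 1:], nxt.unsqueeze(2)], dim=2)
    return torch.stack(frames, dim=2)                  # (B, C, steps, H, W)
```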

5. Governing Equations, Input Targeting, and Implicit Modeling

The GPₕyT paradigm departs from classical simulation in that explicit knowledge or imposition of partial differential equations (PDEs) is unnecessary. Instead, the model infers temporal evolution via data-driven approximation of the time derivative. For reference, training data include systems governed by equations such as:

  • Incompressible Navier–Stokes:

\frac{\partial \mathbf{u}}{\partial t} - \nu \Delta \mathbf{u} + \nabla p = -(\mathbf{u} \cdot \nabla)\mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0

  • Compressible Euler equations for shocks:

\mathbf{U} = \begin{bmatrix} \rho \\ \rho u \\ \rho v \\ \rho E \end{bmatrix}, \qquad \mathbf{F},\ \mathbf{G} = \text{conserved fluxes}

The model is agnostic to these equations: only state sequences are provided.

6. Implications and the Path Toward Universal Physics Foundation Models

GPₕyT represents a paradigm shift relative to narrow, equation-specific neural surrogates. By training a single model on vast, multi-domain, multi-scale data, and relying on context-aware neural architectures, GPₕyT demonstrates that foundation model behavior—train once, deploy across domains—is achievable in physics.

This has implications for:

  • Democratizing access to scientific simulation by reducing the need for domain-specific solver development and tuning
  • The possibility of universal physics foundation models (PFMs) that can be extended to 3D, additional physical domains (e.g., mechanics, chemistry), and arbitrary boundary conditions
  • The acceleration of computational science via direct, high-fidelity surrogate modeling deployable in diverse scientific and engineering environments

Key directions identified include improving stability for long-horizon rollouts, extending to higher-dimensional and multi-resolution systems, and further scaling model and data.

GPₕyT advances on approaches constrained to a single physics domain or equation family (e.g., PINNsFormer (Zhao et al., 2023), PDE-Transformer (Holzschuh et al., 30 May 2025)) by demonstrating that transformer models can infer physical processes directly from data, including local gradient features, across highly heterogeneous phenomena. In contrast to approaches that require equation knowledge and tailored model architectures, GPₕyT achieves accurate, physically plausible extrapolation, establishing the feasibility of physics foundation models analogous to those proven effective in language and vision (Wiesner et al., 17 Sep 2025).
