LagrangeBench: Benchmarking Lagrangian Surrogates

Updated 28 January 2026
  • LagrangeBench is an end-to-end benchmarking suite for evaluating learned Lagrangian fluid solvers with a focus on temporal coarse-graining.
  • It provides rigorous datasets, a modular JAX-based software stack, and implementations of various graph neural network surrogate architectures.
  • Its evaluation framework uses classical and physics-informed metrics to ensure reproducibility and practical relevance in fluid simulation tasks.

LagrangeBench is an end-to-end benchmarking suite designed for learned Lagrangian (particle-based) fluid solvers, with a particular emphasis on temporal coarse-graining in surrogate modeling. It focuses on machine learning for Lagrangian particle problems, providing a rigorously constructed set of datasets, a modular JAX-based software stack, implementations of several graph neural network (GNN) surrogate architectures, and a suite of evaluation metrics, including physical metrics beyond classical position errors. LagrangeBench addresses the underrepresentation of Lagrangian approaches in existing learned partial differential equation (PDE) surrogate research, where Eulerian, grid-based methods are dominant. Through its integrative approach, it aims to catalyze advances in the development, comparison, and reproducibility of neural PDE solvers operating on irregular point clouds, facilitating progress in scientific computing, computational physics, and machine learning for fluid mechanics (Toshev et al., 2023).

1. Motivation and Scope

Most machine learning surrogates for PDEs have targeted Eulerian discretizations, where structured grids allow the deployment of convolutional neural networks and spectral operators. However, Lagrangian particle methods, particularly Smoothed Particle Hydrodynamics (SPH), are often preferred for phenomena involving free surfaces, complex boundary conditions, or multiphase and large-deformation flows. Despite the natural suitability of graph neural networks for representing unstructured particle systems, learned PDE solvers based on Lagrangian discretizations have been comparatively underexplored. LagrangeBench specifically addresses this gap.

A defining feature is its focus on temporal coarse-graining. High-fidelity SPH solvers typically require extremely small time steps ($\Delta t \lesssim h/c_0$, where $h$ is the smoothing length and $c_0$ the artificial sound speed), resulting in millions of steps per simulation. LagrangeBench subsamples every 100th SPH step for dataset generation, increasing the difficulty and relevance of the surrogate modeling task by requiring accurate prediction at large effective time steps, a practically essential regime for accelerating scientific workflows (Toshev et al., 2023).
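Concretely, the coarse-graining amounts to strided subsampling of the solver trajectory; a minimal sketch (array sizes illustrative, not taken from the actual datasets):

```python
import numpy as np

# Illustrative fine-grained SPH trajectory: (n_steps, n_particles, dim) positions.
fine_traj = np.random.rand(10_000, 100, 2)

# Keep every 100th frame: a surrogate trained on this data must predict
# one *coarse* step, i.e. an effective time step 100x the SPH time step.
coarse_traj = fine_traj[::100]

print(coarse_traj.shape)  # (100, 100, 2)
```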

2. Datasets

LagrangeBench introduces seven new benchmark datasets generated using the weakly compressible SPH scheme of Adami et al. (2012). The SPH solver integrates the following governing equations:

  • Barotropic equation of state: $p(\rho) = c_0^2(\rho - \rho_0) + p_{bg}$
  • Density summation: $\rho_i = \sum_j m_j W(|r_i - r_j|, h)$, with $W$ a quintic-spline kernel
  • Momentum equation: $\frac{dv_i}{dt} = -\sum_j m_j \left[\frac{p_i}{\rho_i^2} + \frac{p_j}{\rho_j^2} + \Pi_{ij}\right] \nabla_i W(|r_i - r_j|, h) + \frac{F_i}{\rho_i}$

In practice, $m_j = \rho_0\,\Delta x^d$ and $h \approx 1.5\,\Delta x$. All datasets employ a CFL number of $0.25$ and $c_0 \approx 10 \max|v|$. For each simulation, every 100th frame is stored, producing temporally coarse-grained trajectory data. The time axis is split $2{:}1{:}1$ into training, validation, and test sets. The datasets consist of four 2D and three 3D classical fluid-mechanical test cases characterized by diverse physics, including periodic boundaries, no-slip walls, free surfaces, and moving lids.
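These relations follow directly from the particle spacing; a short sanity-check sketch with illustrative values (not tied to any particular dataset):

```python
# Illustrative SPH dataset parameters (values are examples, not from the paper).
rho_0, dx, d = 1.0, 0.02, 2        # reference density, particle spacing, dimension
m_j = rho_0 * dx**d                # particle mass: m_j = rho_0 * dx^d
h = 1.5 * dx                       # smoothing length: h ~ 1.5 dx
c_0 = 10.0 * 1.0                   # artificial sound speed: c_0 ~ 10 max|v|
dt_sph = 0.25 * h / c_0            # CFL-limited SPH step (CFL number 0.25)
dt_eff = 100 * dt_sph              # effective step after storing every 100th frame

# 2:1:1 split of the stored frames along the time axis
n_frames = 400
n_train, n_val, n_test = n_frames // 2, n_frames // 4, n_frames // 4
```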

| Case | Dim. | $N_{particles}$ | Domain | Features |
|---|---|---|---|---|
| Taylor–Green vortex | 2D | 2500 | $[0,1]^2$, periodic | Pure fluid |
| Reverse Poiseuille | 2D | 3200 | $[0,1]\times[0,2]$, periodic | Body force |
| Lid-Driven Cavity | 2D | 2708 | $1.12\times1.12$, no-slip | Moving top wall |
| Dam Break | 2D | 5740 | $5.486\times2.12$ | Free surface, solid walls |
| Taylor–Green vortex | 3D | 8000 | $[2\pi]^3$, periodic | Pure fluid |
| Reverse Poiseuille | 3D | 8000 | $1\times2\times0.5$, periodic | Body force |
| Lid-Driven Cavity | 3D | 8160 | $1.25\times1.25\times0.5$ | Side walls, moving lid |

These datasets enable benchmarking ML surrogates over canonical flows with diverse boundary conditions and physics (Toshev et al., 2023).

3. Architecture and Software Stack

LagrangeBench is implemented in JAX, featuring full just-in-time (jit) compilation and auto-vectorization. Configuration utilizes human-readable YAML files specifying model structure, optimizer, data paths, batch sizes, and training tricks (such as noise injection and push-forward loss). The data-loading pipeline adopts the PyTorch DataLoader interface, supporting standard batching, shuffling, and MPI-based parallelism.
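A configuration in this style might look as follows; the key names below are illustrative, not the exact LagrangeBench schema:

```yaml
# Hypothetical YAML config sketch (key names illustrative).
model: gns
num_mp_steps: 10
latent_dim: 128
data_dir: datasets/<dataset_name>
batch_size: 2
optimizer:
  lr: 1.0e-4
noise_std: 3.0e-4          # Gaussian noise injection
pushforward:
  steps: [0, 1, 2, 3]
  probs: [0.8, 0.1, 0.05, 0.05]
```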

Neighbor search, crucial for SPH and message-passing GNNs, is abstracted via three interchangeable back-ends:

  • jaxmd_vmap: The default cell-list + vmap implementation from JAX-MD; optimized for speed but memory-intensive ($\mathcal{O}(N \cdot \mathrm{cand})$ allocations, with cand the per-particle candidate count).
  • jaxmd_scan: Serializes jaxmd_vmap with chunking via lax.scan, achieving an approximately $5\times$ lower memory footprint and enabling up to 3 million particles on a 48 GB GPU.
  • matscipy: CPU-based neighbor search, supporting dynamic particle counts and batching at the cost of GPU–CPU copy overhead (about 50% extra runtime for 10k particles) (Toshev et al., 2023).
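All three back-ends answer the same query: which particles lie within the cutoff radius of each particle. The cell-list idea they share can be sketched in pure NumPy (a toy non-periodic version, not LagrangeBench's actual code):

```python
import numpy as np

def cell_list_neighbors(pos, cutoff, box):
    """Return pairs (i, j), i < j, with |pos_i - pos_j| < cutoff.

    Bins particles into cells with edge length >= cutoff, so only
    particles in the same or adjacent cells need be compared.
    """
    dim = pos.shape[1]
    n_cells = np.maximum((box / cutoff).astype(int), 1)
    cell_size = box / n_cells
    cell_of = np.minimum((pos // cell_size).astype(int), n_cells - 1)

    # Map each occupied cell to the particle indices it contains.
    cells = {}
    for i, c in enumerate(map(tuple, cell_of)):
        cells.setdefault(c, []).append(i)

    # Compare each cell against itself and its adjacent cells only.
    offsets = np.array(np.meshgrid(*[[-1, 0, 1]] * dim)).T.reshape(-1, dim)
    pairs = []
    for c, members in cells.items():
        for off in offsets:
            nb = tuple(np.array(c) + off)
            for i in members:
                for j in cells.get(nb, []):
                    if i < j and np.linalg.norm(pos[i] - pos[j]) < cutoff:
                        pairs.append((i, j))
    return pairs
```

The trade-offs between the back-ends come from how this candidate set is materialized: dense padded arrays (fast, memory-hungry) versus chunked or CPU-side construction.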

The suite supports multi-device training and exploits full JAX parallelization.

4. Surrogate Model Classes

Implemented surrogate models all adhere to a message-passing (MP) abstraction, whereby each network layer updates node features based on local neighborhoods. At layer ll:

  • Edge update: $e_{ij}^{(l)} = \phi_e\big(h_i^{(l)}, h_j^{(l)}, \|p_i - p_j\|\big)$
  • Node update: $h_i^{(l+1)} = \phi_n\big(h_i^{(l)}, \sum_{j \in N(i)} e_{ij}^{(l)}\big)$
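The two updates above can be sketched in a few lines of NumPy, with the MLPs $\phi_e$ and $\phi_n$ replaced by single tanh layers for brevity (shapes and weights illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N, F = 5, 8                               # particles, feature width
h = rng.normal(size=(N, F))               # node features h_i
p = rng.normal(size=(N, 2))               # particle positions
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # neighbor pairs (i, j)

W_e = rng.normal(size=(2 * F + 1, F))     # stands in for phi_e
W_n = rng.normal(size=(2 * F, F))         # stands in for phi_n

# Edge update: e_ij = phi_e(h_i, h_j, |p_i - p_j|), summed per receiver i.
msgs = np.zeros((N, F))
for i, j in edges:
    dist = np.linalg.norm(p[i] - p[j])
    e_ij = np.tanh(np.concatenate([h[i], h[j], [dist]]) @ W_e)
    msgs[i] += e_ij

# Node update: h_i' = phi_n(h_i, sum_j e_ij)
h_new = np.tanh(np.concatenate([h, msgs], axis=1) @ W_n)
print(h_new.shape)  # (5, 8)
```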

After several layers, output heads decode predicted accelerations or velocities, which are then integrated to yield positions. The following GNN variants are benchmarked:

  • Graph Network Simulator (GNS): Encoder–processor–decoder stack of fully connected layers with layer norm. Input noise injection via Gaussian perturbations. Offers flexibility to predict either accelerations or velocities. Sensitive to noise (Toshev et al., 2023).
  • E(n)-Equivariant GNN (EGNN): Maintains scalar and vector features separately; position update uses distance and direction information. Implements the simplest form of equivariance (translation–rotation–reflection) without Clebsch–Gordan couplings.
  • Steerable E(3)-GNN (SEGNN): Utilizes steerable MLPs with learnable linear Clebsch–Gordan tensor products and spherical-harmonics-based descriptors. Ensures full rotational, reflectional, and translational equivariance (for spherical-harmonic orders $L$ up to 1).
  • PaiNN: An extension of a molecular-dynamics architecture, adapted to accept nonzero initial vector node features. Radial embedding using a Gaussian basis and cosine cutoff; gated read-out for vectorial variables (Toshev et al., 2023).

The following table summarizes key hyperparameters:

| Model | Layers | Latent Dim | Special Characteristics |
|---|---|---|---|
| GNS-10-128 | 10 | 128 | 2 MLP blocks, LayerNorm |
| SEGNN-10-64 ($L=1$) | 10 | 64 | Spherical-harmonics order $L=1$ |
| EGNN-5-128 | 5 | 128 | Simple equivariance |
| PaiNN-5-128 | 5 | 128 | 3× vector feature uplift |

Training employs Gaussian random-walk noise ($\sigma \in [3\cdot10^{-4},\, 10^{-3}]$) and a push-forward (PF) multi-step loss, rolling out the model for $k \in \{0, 1, 2, 3\}$ steps with respective probabilities $[0.8, 0.1, 0.05, 0.05]$ (Toshev et al., 2023).
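Both tricks are cheap to implement; a hedged sketch of how they might be drawn each training step (illustrative, not LagrangeBench's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pushforward_steps():
    # Draw the number of extra rollout steps k with the stated probabilities.
    return rng.choice([0, 1, 2, 3], p=[0.8, 0.1, 0.05, 0.05])

def random_walk_noise(shape, n_input_steps, sigma=3e-4):
    # Gaussian random-walk noise: accumulate i.i.d. increments over the
    # input history so consecutive noisy input frames remain consistent.
    increments = rng.normal(scale=sigma / np.sqrt(n_input_steps),
                            size=(n_input_steps,) + shape)
    return np.cumsum(increments, axis=0)

noise = random_walk_noise((2500, 2), n_input_steps=5)
print(noise.shape)  # (5, 2500, 2)
```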

5. Evaluation Metrics and Baselines

LagrangeBench quantifies surrogate performance with a set of both classical and physics-informed metrics:

  • Position MSE: $MSE_n = \frac{1}{Nn} \sum_{t=1}^{n} \sum_{i=1}^{N} \|\mathbf{p}_i^{t,pred} - \mathbf{p}_i^{t,true}\|^2$
  • Kinetic energy MSE: $E_{kin}(t) = \frac{1}{2}\sum_i m_i \|\mathbf{v}_i(t)\|^2$, with $MSE_{E_{kin}} = \frac{1}{T}\sum_t \big(E_{kin}^{pred}(t) - E_{kin}^{true}(t)\big)^2$
  • Sinkhorn distance: $W_\epsilon(\mu_{pred}, \mu_{true})$, an entropy-regularized optimal-transport metric approximating the Earth Mover's distance between predicted and reference particle distributions, computed on a common grid (Toshev et al., 2023).
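The first two metrics reduce to a few lines of NumPy (the Sinkhorn distance additionally requires an optimal-transport solver and is omitted here):

```python
import numpy as np

def position_mse(p_pred, p_true):
    # p_*: (n_steps, N, dim). Mean over steps and particles of the
    # squared position error, i.e. (1/(N n)) * sum_t sum_i ||dp||^2.
    return np.mean(np.sum((p_pred - p_true) ** 2, axis=-1))

def kinetic_energy_mse(v_pred, v_true, m):
    # v_*: (n_steps, N, dim), m: (N,) particle masses.
    e_pred = 0.5 * np.sum(m[None, :] * np.sum(v_pred ** 2, axis=-1), axis=1)
    e_true = 0.5 * np.sum(m[None, :] * np.sum(v_true ** 2, axis=-1), axis=1)
    return np.mean((e_pred - e_true) ** 2)
```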

Empirical results indicate that the non-equivariant GNS performs best on boundary-driven problems (e.g., Lid-Driven Cavity, Dam Break), while SEGNN excels on flows in periodic domains (e.g., Taylor–Green vortex, Reverse Poiseuille). EGNN is unstable on larger systems, and PaiNN, although successful in other domains, underperforms in terms of position errors.

Representative 20-step rollout results (best single-seed average):

| Case | Model | $MSE_{20}$ | Sinkhorn | $MSE_{E_{kin}}$ |
|---|---|---|---|---|
| 2D TGV | SEGNN-10-64 | $\approx 4.4\times10^{-6}$ | $2.1\times10^{-7}$ | $4.5\times10^{-7}$ |
| 2D RPF | GNS-10-128 | $\approx 3.3\times10^{-6}$ | $1.4\times10^{-7}$ | $1.7\times10^{-5}$ |
| 2D LDC | GNS-10-128 | $\approx 1.4\times10^{-5}$ | $1.0\times10^{-6}$ | $3.7\times10^{-3}$ |
| 2D DAM | GNS-10-128 | $\approx 3.3\times10^{-5}$ | $1.4\times10^{-5}$ | $1.3\times10^{-4}$ |
| 3D TGV | SEGNN-10-64 | $\approx 5.2\times10^{-3}$ | $6.4\times10^{-6}$ | $2.7\times10^{-2}$ |
| 3D RPF | SEGNN-10-128 | $\approx 1.8\times10^{-5}$ | $2.9\times10^{-7}$ | $3.5\times10^{-6}$ |
| 3D LDC | GNS-10-128 | $\approx 4.0\times10^{-5}$ | $6.0\times10^{-7}$ | $2.6\times10^{-8}$ |

(Toshev et al., 2023)

6. Reproducibility, Workflow, and Extensibility

LagrangeBench is released under the open-source MIT license and provides its datasets (HDF5 + JSON) via both GitHub and Zenodo (doi:10.5281/zenodo.10021925). Installation requires standard Python packages:

pip install --upgrade jax jaxlib matscipy e3nn jax-md
git clone https://github.com/tumaer/lagrangebench.git

A canonical workflow is as follows:

  1. Download and unpack datasets.
  2. Adjust a YAML configuration file to specify dataset paths, model, and hyperparameters.
  3. Launch training via python train.py --config ....
  4. Evaluate and compute rollouts via python eval.py --config ... --split=test --n_steps=20.
  5. Post-process and visualize results using provided notebooks.

The code structure comprises modules for data preprocessing, model definitions (GNS, EGNN, SEGNN, PaiNN), training and evaluation logic, and physics-informed loss functions. All operations are fully JIT-compiled and support multi-device training (via JAX pmap) as well as modular extension.

Planned extensions include datasets for multi-phase flows (e.g., Rayleigh–Taylor), surface tension, and scalability mechanisms such as domain decomposition for simulations involving tens of millions of particles. LagrangeBench's modularity and rigorous physical grounding position it as a reference point for benchmarking and advancing learned Lagrangian surrogate models in computational science (Toshev et al., 2023).
