Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
80 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Navigating the Noise: Bringing Clarity to ML Parameterization Design with O(100) Ensembles (2309.16177v3)

Published 28 Sep 2023 in physics.ao-ph and cs.LG

Abstract: Machine-learning (ML) parameterizations of subgrid processes (here of turbulence, convection, and radiation) may one day replace conventional parameterizations by emulating high-resolution physics without the cost of explicit simulation. However, uncertainty about the relationship between offline and online performance (i.e., when integrated with a large-scale general circulation model (GCM)) hinders their development. Much of this uncertainty stems from limited sampling of the noisy, emergent effects of upstream ML design decisions on downstream online hybrid simulation. Our work rectifies the sampling issue via the construction of a semi-automated, end-to-end pipeline for $\mathcal{O}(100)$ size ensembles of hybrid simulations, revealing important nuances in how systematic reductions in offline error manifest in changes to online error and online stability. For example, removing dropout and switching from a Mean Squared Error (MSE) to a Mean Absolute Error (MAE) loss both reduce offline error, but they have opposite effects on online error and online stability. Other design decisions, like incorporating memory, converting moisture input from specific humidity to relative humidity, using batch normalization, and training on multiple climates do not come with any such compromises. Finally, we show that ensemble sizes of $\mathcal{O}(100)$ may be necessary to reliably detect causally relevant differences online. By enabling rapid online experimentation at scale, we can empirically settle debates regarding subgrid ML parameterization design that would have otherwise remained unresolved in the noise.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (9)
  1. Jerry Lin (9 papers)
  2. Sungduk Yu (16 papers)
  3. Tom Beucler (31 papers)
  4. Pierre Gentine (51 papers)
  5. Mike Pritchard (12 papers)
  6. Liran Peng (6 papers)
  7. Eliot Wong-Toi (4 papers)
  8. Zeyuan Hu (8 papers)
  9. Margarita Geleta (7 papers)
Citations (3)