NCAR CESM2 Large Ensemble Project

Updated 3 January 2026
  • NCAR CESM2 Large Ensemble is a comprehensive climate modeling initiative that uses 100 coupled simulations with perturbed initial conditions to isolate internal variability from forced responses.
  • It applies consistent external forcings from CMIP6 historical (1850–2014) and SSP3-7.0 future scenarios to support studies in predictability, uncertainty quantification, and ENSO dynamics.
  • The project uses upgraded model components (CAM6, POP2, CLM5), and studies built on it apply machine learning techniques to assess predictive performance and error scaling under climate change.

The NCAR Large Ensemble Project, specifically the CESM2 Large Ensemble (CESM2-LE), is a climate modeling initiative developed to robustly characterize internal climate variability and forced responses under past and future greenhouse-gas scenarios. By systematically generating a large set of perturbed-initial-condition simulations with identical external forcings, CESM2-LE enables precise statistical estimates—including means, variances, and extremes—of key climate variables, supporting studies in predictability, uncertainty quantification (UQ), and the response of the climate system to anthropogenic forcing (McAfee et al., 19 Dec 2025).

1. Scientific Motivation and Conceptual Overview

CESM2-LE is designed to separate internal variability from the externally forced climate response. Each ensemble member is a fully coupled atmosphere–ocean simulation run with the Community Earth System Model 2 (CESM2). All members receive identical time-varying external forcings (greenhouse gases, aerosols, solar, and volcanic) drawn from CMIP6 historical trajectories (1850–2014) and a high-emission scenario (SSP3-7.0, 2015–2100), but are initialized in 1850 with infinitesimal atmospheric perturbations. Because the forcings are shared, the spread across members reflects internal variability alone: the ensemble mean estimates the forced response, and subtracting it from each member leaves that member's internal variability.
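The decomposition described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not CESM2-LE output: the array layout (member, time), the linear trend, and the noise level are all assumptions chosen only to show the arithmetic.

```python
import numpy as np

def decompose_ensemble(x):
    """Split an array of shape (member, time) into the forced response
    (ensemble mean) and internal variability (deviation of each member)."""
    forced = x.mean(axis=0)        # shape (time,)
    internal = x - forced          # shape (member, time)
    return forced, internal

# Synthetic stand-in (assumption): 100 members share one trend and differ
# only by independent noise, mimicking identical forcing with perturbed
# initial conditions.
rng = np.random.default_rng(0)
n_members, n_years = 100, 251      # years 1850..2100 inclusive
trend = np.linspace(0.0, 4.0, n_years)
members = trend + rng.normal(0.0, 0.5, size=(n_members, n_years))

forced, internal = decompose_ensemble(members)
```

With 100 members the sampling noise in the ensemble mean is about one tenth of the single-member noise, which is why a large member count sharpens the forced-response estimate.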

Novel aspects of CESM2-LE compared to previous large ensembles include its scale (100 simulations), its upgraded physical parameterizations (CAM6 for clouds and aerosols, POP2 for the ocean and its biogeochemistry, and CLM5 for land and snow), and its extension into the SSP3-7.0 high-emission future. This configuration improves the fidelity of simulated climate modes, such as the El Niño–Southern Oscillation (ENSO), and allows robust estimation of variability and extremes across multiple climate fields.

2. Ensemble Design, Model Configuration, and Forcing

CESM2-LE comprises 100 ensemble members (M_model = 100), each integrating the coupled atmosphere–ocean–sea-ice–land components of CESM2 at nominal 1° horizontal resolution. Initialization is performed in 1850 by perturbing atmospheric states infinitesimally while keeping ocean, land, and sea-ice initial conditions identical. All members then evolve under identical external forcings. The atmospheric component utilizes CAM6 with improved cloud–aerosol interactions, the ocean uses POP2 with refined biogeochemistry, and land processes are represented by CLM5. External boundary conditions follow CMIP6 protocols: historical forcings (1850–2014) are succeeded by SSP3-7.0 scenario forcing (2015–2100).

This large member count (more than double that of earlier efforts such as the 40-member CESM1-LE) and the consistent external forcing protocol yield an ensemble large enough to estimate the forced response from the ensemble mean and to isolate internal variability by subtracting that mean from each member. The model domain covers global coupled dynamics, with marine climate variable analysis focused on latitudes equatorward of 45°.

3. Analyzed Variables, Standardization, and Anomaly Definition

Studies leveraging CESM2-LE frequently analyze anomalies in three monthly marine fields: sea-surface temperature (SST, denoted ΔT(φ, λ, t)), upper-ocean heat content (OHC, ΔH(φ, λ, t); defined as the vertical integral of ocean temperature to 300 m), and zonal surface wind stress (τₓ(φ, λ, t)). Each variable V ∈ {T, H, τₓ} is preprocessed as follows:

  1. The ensemble mean μ_V(φ, λ, m) and standard deviation σ_V(φ, λ, m) are computed across the base period 1850–1949 for each calendar month m.
  2. Standardized anomalies are then derived by:

$$\Delta V(\phi, \lambda, t) = \frac{V(\phi, \lambda, t) - \mu_V(\phi, \lambda, \text{month}(t))}{\sigma_V(\phi, \lambda, \text{month}(t))}$$

This procedure produces monthly standardized anomaly fields suitable for studying internal variability and distributional changes under anthropogenic forcing.
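The standardization can be sketched as follows. The array layout (member, year, month, lat, lon) and the choice to pool the mean and standard deviation over all members and all base-period years are assumptions made for illustration; the function name is hypothetical and the data are synthetic.

```python
import numpy as np

def standardized_anomaly(v, years, base=(1850, 1949)):
    """v: array of shape (member, year, month, lat, lon).
    years: 1-D array of calendar years matching axis 1.
    Returns (v - mu) / sigma, with mu and sigma computed per calendar
    month and grid cell over members and base-period years (assumption)."""
    in_base = (years >= base[0]) & (years <= base[1])
    ref = v[:, in_base]                           # base-period subset
    mu = ref.mean(axis=(0, 1), keepdims=True)     # (1, 1, month, lat, lon)
    sigma = ref.std(axis=(0, 1), keepdims=True)
    return (v - mu) / sigma

# Synthetic example: a strong seasonal cycle plus noise.
rng = np.random.default_rng(0)
years = np.arange(1940, 1960)
cycle = 10.0 * np.sin(2 * np.pi * np.arange(12) / 12)
v = cycle[None, None, :, None, None] + rng.normal(size=(5, years.size, 12, 2, 3))
anom = standardized_anomaly(v, years)
```

Because the statistics are fixed to the base period, anomalies outside it keep any forced drift in mean or variance, which is what makes later distributional changes visible.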

4. Application in Predictability, Machine Learning, and Uncertainty Quantification

CESM2-LE provides the foundation for modern machine learning–based climate prediction and uncertainty quantification. In recent work on ENSO predictability and UQ, data are split by ensemble member, so that temporally autocorrelated samples from a single simulation never appear in more than one subset: 20 members (1850–1949) for training, 5 for validation, and 25 (entire period 1850–2098) for testing (McAfee et al., 19 Dec 2025).
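A member-level split of this kind might look like the sketch below. The source gives only the subset sizes; the random assignment, the seed, and the pool of 50 member indices are assumptions, and `split_members` is a hypothetical helper.

```python
import numpy as np

def split_members(n_train=20, n_val=5, n_test=25, seed=0):
    """Assign whole ensemble members to disjoint train/val/test subsets.
    Random assignment with a fixed seed is an assumption, not the paper's rule."""
    total = n_train + n_val + n_test
    ids = np.random.default_rng(seed).permutation(total)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test

train_ids, val_ids, test_ids = split_members()
```

Splitting by member rather than by time step keeps neighbouring, highly correlated months of one simulation from leaking between subsets.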

ENSO phase is categorized using the first principal component of tropical Pacific SST anomalies (calculated over the training period), with quartile binning for La Niña, Cold Neutral, Warm Neutral, and El Niño. The input to learning models consists of nine channels (three variables over three sequential months), predicting categorical distributions of ENSO class at leads from t+1 to t+24.
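The labelling and input construction can be sketched as below. Several details are assumptions: the first principal component is obtained by SVD of the centered anomaly matrix, its sign is fixed so that basin-wide warmth gives positive values, and the three input months are taken as t-2, t-1, and t. All function names are hypothetical and the data are synthetic.

```python
import numpy as np

CLASSES = ("La Niña", "Cold Neutral", "Warm Neutral", "El Niño")

def fit_enso_bins(train_sst):
    """train_sst: (time, space) tropical Pacific SST anomalies.
    Returns the training mean, leading EOF, and quartile edges of PC1."""
    mean = train_sst.mean(axis=0)
    _, _, vt = np.linalg.svd(train_sst - mean, full_matrices=False)
    eof1 = vt[0]
    if eof1.sum() < 0:               # sign convention (assumption)
        eof1 = -eof1
    pc1 = (train_sst - mean) @ eof1
    edges = np.quantile(pc1, [0.25, 0.5, 0.75])
    return mean, eof1, edges

def classify_enso(sst, mean, eof1, edges):
    """Project onto the training EOF and bin into classes 0..3."""
    return np.digitize((sst - mean) @ eof1, edges)

def stack_inputs(sst, ohc, taux, t):
    """Each field: (time, lat, lon). Returns 9 channels: three variables
    over months t-2, t-1, t (the month choice is an assumption)."""
    months = slice(t - 2, t + 1)
    return np.concatenate([sst[months], ohc[months], taux[months]], axis=0)

# Synthetic training data: one spatial pattern scaled by a random amplitude.
rng = np.random.default_rng(0)
amp = rng.normal(size=400)
pattern = np.linspace(0.5, 1.5, 30)
train_sst = amp[:, None] * pattern + 0.05 * rng.normal(size=(400, 30))
mean, eof1, edges = fit_enso_bins(train_sst)
labels = classify_enso(train_sst, mean, eof1, edges)
```

Fitting the EOF and quartile edges on the training period only, then reusing them for later data, keeps the class definitions fixed while the climate shifts.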

Models adopt a modified ResNet-18 architecture with circular padding and a learnable channel-reduction layer. Deep ensembles are constructed from M=100 independent components (distinct initialization and data order seeds), each trained via L2-regularized negative log-likelihood (NLL) loss. Three ensembles are trained on inputs through different cutoffs (1949, 2024, and 2098), allowing evaluation of generalization and covariate shift effects across the historical and SSP3-7.0 periods.
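Circular padding can be illustrated without a deep-learning framework. The sketch below assumes the wrap is applied along longitude only, with zero padding in latitude; the source states only that circular padding is used, so the axis choice and the helper name are assumptions.

```python
import numpy as np

def pad_circular_lon(x, pad):
    """x: (channel, lat, lon). Wrap around in longitude so a convolution
    sees the domain as periodic; zero-pad in latitude (assumption)."""
    x = np.pad(x, ((0, 0), (0, 0), (pad, pad)), mode="wrap")
    return np.pad(x, ((0, 0), (pad, pad), (0, 0)), mode="constant")

x = np.arange(9 * 4 * 6, dtype=float).reshape(9, 4, 6)
padded = pad_circular_lon(x, pad=1)
```

After padding, the first longitude column of the interior rows is a copy of the last original column, so a kernel at the western edge sees its eastern neighbour.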

Distributional shift is quantified by differences in variable variance between a premodern baseline (1850–1949) and a “shifted” late 21st-century period (2040–2098). Changes in predictive performance and UQ metrics are traced over time using spatial variance maps and time series of component-mean NLL and ensemble disagreement.
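A variance-difference map of this kind could be computed as in the sketch below. The array layout (member, time, lat, lon), pooling the variance over members and years, and the synthetic data are assumptions; the function names are hypothetical.

```python
import numpy as np

def period_variance(anom, years, start, end):
    """Variance per grid cell over all members and years in [start, end]."""
    sel = (years >= start) & (years <= end)
    return anom[:, sel].var(axis=(0, 1))

def variance_shift(anom, years, baseline=(1850, 1949), shifted=(2040, 2098)):
    """Shifted-period variance minus baseline variance, per grid cell."""
    return (period_variance(anom, years, *shifted)
            - period_variance(anom, years, *baseline))

# Synthetic anomalies whose standard deviation doubles from 2040 onward.
rng = np.random.default_rng(0)
years = np.arange(1850, 2099)
scale = np.where(years >= 2040, 2.0, 1.0)[None, :, None, None]
anom = scale * rng.normal(size=(20, years.size, 2, 2))
shift_map = variance_shift(anom, years)
```

In this synthetic case the expected difference is about 3 (variance 4 minus variance 1) at every grid cell.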

5. Uncertainty Quantification: Aleatoric vs. Epistemic Uncertainty

Within deep ensemble methodology, CESM2-LE supports rigorous separation of aleatoric and epistemic uncertainty. For an ensemble of $M$ probabilistic neural networks parametrized by weights $\{w_i\}_{i=1}^{M}$, class probabilities $p_{w_i}(y \mid x)$ over $K$ ENSO categories are used to compute:

Aleatoric Uncertainty (AU):

The average predictive entropy of the members approximates $H(Y \mid W)$:

$$AU(x) = -\frac{1}{M} \sum_{i=1}^{M} \sum_{k=1}^{K} p_{w_i}(y=k \mid x) \log p_{w_i}(y=k \mid x).$$

Epistemic Uncertainty (EU):

Ensemble disagreement (class-probability variance),

$$\bar{p}(y=k \mid x) = \frac{1}{M} \sum_{i=1}^{M} p_{w_i}(y=k \mid x)$$

$$EU(x) = \frac{1}{M} \sum_{i=1}^{M} \sum_{k=1}^{K} \left[ p_{w_i}(y=k \mid x) - \bar{p}(y=k \mid x) \right]^2.$$

AU quantifies the irreducible unpredictability of the target given the input. EU measures uncertainty in the model weights, that is, how strongly the members disagree because of model, data, or distributional limitations.
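Both quantities follow directly from the member probabilities. The sketch below assumes an array `probs` of shape (M, N, K) holding each member's class probabilities for N inputs; the small `eps` guarding the logarithm is an implementation assumption, and the function names are hypothetical.

```python
import numpy as np

def aleatoric_uncertainty(probs, eps=1e-12):
    """probs: (M, N, K). Mean over members of each member's entropy."""
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1)   # (M, N)
    return entropy.mean(axis=0)                             # (N,)

def epistemic_uncertainty(probs):
    """Mean over members of the squared distance to the ensemble-mean
    probability vector, summed over classes."""
    p_bar = probs.mean(axis=0, keepdims=True)               # (1, N, K)
    return ((probs - p_bar) ** 2).sum(axis=-1).mean(axis=0) # (N,)

# Two inputs, two members, four classes.
# Input 0: both members predict the uniform distribution.
# Input 1: the members are confident but pick different classes.
probs = np.array([
    [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]],
    [[0.25, 0.25, 0.25, 0.25], [0.0, 1.0, 0.0, 0.0]],
])
au = aleatoric_uncertainty(probs)
eu = epistemic_uncertainty(probs)
```

The two inputs show the contrast: agreement on a uniform prediction gives AU = log 4 with EU = 0, while confident disagreement gives AU near 0 with EU = 0.5.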

6. Key Findings and Implications for Climate Prediction

Leveraging CESM2-LE, several principal results emerge regarding prediction and uncertainty quantification under climate change (McAfee et al., 19 Dec 2025):

  • Ensemble Disagreement and Predictive Error: Epistemic uncertainty, as measured by EU, closely tracks deterioration in mean NLL—both increasing sharply beginning near 2040 under SSP3-7.0 forcing, signaling predictive error associated with climate-change-induced distributional shift. Ensembles trained only on pre-shift data (to 1949 or 2024) exhibit far greater increases in both EU and NLL than those with training extending through 2098.
  • Aleatoric Uncertainty Under Shift: Unlike EU, mean AU paradoxically decreases at long leads (>10 months) during shifted periods, even as model accuracy worsens. AU is meant to capture only irreducible data uncertainty, not model confidence, so it becomes less informative when the inputs come from an unfamiliar (shifted) distribution.
  • Scaling of Ensemble Improvement: The performance gain of ensembles relative to single models (i.e., component-mean NLL minus ensemble-NLL) scales proportionally to EU during covariate shift periods. Thus, deep ensembles provide maximal error mitigation when epistemic uncertainty, as signaled by EU, is greatest.
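The ensemble improvement named in the last bullet can be sketched as follows, assuming the ensemble prediction is the arithmetic mean of the member probabilities (the usual deep-ensemble choice, not confirmed by the source). `probs` has shape (M, N, K) and `labels` holds the N true class indices; both names are hypothetical.

```python
import numpy as np

def ensemble_gain(probs, labels):
    """Component-mean NLL minus ensemble NLL, per sample.
    probs: (M, N, K) member probabilities; labels: (N,) integer classes."""
    p_true = probs[:, np.arange(labels.size), labels]   # (M, N)
    component_nll = -np.log(p_true).mean(axis=0)        # average of member NLLs
    ensemble_nll = -np.log(p_true.mean(axis=0))         # NLL of averaged prediction
    return component_nll - ensemble_nll

labels = np.array([0])
agree = np.array([[[0.7, 0.3]], [[0.7, 0.3]]])      # members identical
disagree = np.array([[[0.9, 0.1]], [[0.1, 0.9]]])   # members conflict
gain_agree = ensemble_gain(agree, labels)
gain_disagree = ensemble_gain(disagree, labels)
```

By Jensen's inequality this gain is never negative, and it is zero when all members agree, which is consistent with the gain growing as ensemble disagreement grows.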

The large member count in CESM2-LE enables robust cross-validation, clarity in isolating internal climate variability from forced response, and the statistical power necessary for meaningful machine learning–based UQ studies. The ensemble’s structure, variable formulation, and demonstrated capacity to diagnose the effects of warming on predictive skill and UQ are foundational for ongoing research in climate prediction and ENSO dynamics.

References (1)
