Exo-MIP: CUISINES Exoplanet Climate Model Evaluation
- The research framework systematically standardizes experimental designs and protocols to enable reproducible evaluation of diverse exoplanet climate models.
- It organizes modular exo-MIPs across modeling hierarchies (EBMs, intermediate GCMs, full GCMs) with common data formats for robust model–model and model–observation comparisons.
- The project implements FAIR data management and quantitative benchmarks, strengthening model intercomparisons and the interpretation of both model predictions and observations.
The CUISINES Exoplanet Model Intercomparison Project (exo-MIP) is a research framework designed to systematically evaluate and compare the performance of climate, atmospheric, and retrieval models relevant to exoplanetary science. Developed under the auspices of NASA’s Nexus for Exoplanet System Science (NExSS), CUISINES (Climates Using Interactive Suites of Intercomparisons Nested for Exoplanet Studies) provides standardized experimental designs, protocols, and data management principles to ensure robust, reproducible, and community-writable model intercomparisons across a diversity of planetary regimes and modeling hierarchies (Sohl et al., 2024).
1. Historical Development and Rationale
The formalization of CUISINES emerged from a perceived need within the exoplanet science community to adapt established Earth climate intercomparison strategies (e.g., CMIP, PMIP) for the unique challenges posed by exoplanet modeling. Early efforts, such as THAI (TRAPPIST-1 Habitable Atmosphere Intercomparison), revealed that nominally similar general circulation models (GCMs) could diverge substantially in predictions due to differences in radiative transfer schemes, turbulence parameterizations, or planetary boundary conditions (Turbet et al., 2021). In 2021, the BUFFET workshops consolidated these directions, establishing CUISINES with five guiding principles: upfront definition of science drivers, inclusive experimental design, realistic project timelines, standardized data products for model–model and model–observation comparisons, and scalable data management (Sohl et al., 2024).
2. Framework Organization and Protocol Principles
CUISINES organizes model intercomparisons as exo-MIPs—modular, protocol-driven projects that span the modeling hierarchy:
- Science Question Definition: Exo-MIPs must articulate explicit research objectives, e.g., the position of the inner habitable zone edge, the effect of clouds on tidally locked climates, or the spread in synthetic observables for JWST targets.
- Experimental Design Maximizing Participation: Hierarchical protocols specify core (mandatory, low-complexity) and optional experiments, so that models of very different complexity can contribute: for instance, energy balance models (EBMs) in the FILLET MIP, radiative-convective equilibrium (RCE) models in COD-ACCRA, and full GCMs in CAMEMBERT and THAI (Sohl et al., 2024; Christie et al., 2022; Deitrick et al., 2023).
- Community Engagement and Timeline: Extensive outreach, planning workshops, and public preprints enable broad participation across research groups, supporting flexible milestones and publication strategies.
- Standardized Data Products and Comparison: All exo-MIPs require climate diagnostics in CF-compliant NetCDF formats, instrument-convolved spectra, and metadata-driven summary files to facilitate model–model and model–data analysis.
- Scalable FAIR Data Management: CUISINES employs open repositories (CKAN, Zenodo, GitHub) and rich metadata to ensure that datasets are findable, accessible, interoperable, and reusable (Sohl et al., 2024).
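As an illustration of what a metadata-driven summary file might contain, the following minimal sketch assembles and validates a small record in Python. The record layout, the `build_summary` helper, and the model name are hypothetical and are not a CUISINES schema; only the attribute names (`standard_name`, `units`, `long_name`, `Conventions`) and the two standard names are taken from the CF conventions.

```python
import json

# CF attribute names that every variable must carry in this sketch.
REQUIRED_VAR_ATTRS = ("standard_name", "units", "long_name")


def build_summary(mip, model, experiment, variables):
    """Assemble a summary record (hypothetical layout, not a CUISINES schema)."""
    missing = {
        name: [attr for attr in REQUIRED_VAR_ATTRS if attr not in attrs]
        for name, attrs in variables.items()
    }
    missing = {name: gaps for name, gaps in missing.items() if gaps}
    if missing:
        raise ValueError(f"variables missing required attributes: {missing}")
    return {
        "mip": mip,
        "model": model,
        "experiment": experiment,
        "Conventions": "CF-1.8",
        "variables": variables,
    }


summary = build_summary(
    mip="THAI",
    model="example-gcm",  # placeholder model name
    experiment="Ben1",
    variables={
        "ts": {
            "standard_name": "surface_temperature",
            "units": "K",
            "long_name": "Surface temperature",
        },
        "olr": {
            "standard_name": "toa_outgoing_longwave_flux",
            "units": "W m-2",
            "long_name": "Outgoing longwave radiation at top of atmosphere",
        },
    },
)

# Serialize with sorted keys so that summaries from different groups diff cleanly.
summary_json = json.dumps(summary, indent=2, sort_keys=True)
```

Rejecting incomplete records at write time, rather than at analysis time, is one way to keep model-to-model comparisons from silently mixing units or variable definitions.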
3. Exo-MIP Methodologies and Experimental Protocols
Each exo-MIP under the CUISINES umbrella defines protocols tailored to its science goals and modeling domain. Examples include:
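- FILLET v1.1 (Latitudinal EBM Intercomparison): Benchmarks Earth-like climates, then sweeps obliquity, instellation, and CO₂, mapping bifurcation structure. The governing equation, written here in the standard one-dimensional diffusive form, is
$$C \frac{\partial T}{\partial t} = S(x,t)\,[1 - \alpha(T)] - I(T) + \frac{\partial}{\partial x}\left[ D\,(1 - x^2)\,\frac{\partial T}{\partial x} \right],$$
where $x$ is the sine of latitude, $T$ the surface temperature, $C$ the effective heat capacity, $S$ the instellation, $\alpha$ the albedo, $I$ the outgoing longwave radiation (OLR), and $D$ the meridional diffusion coefficient. Protocol benchmarks specify increments and outputs; ensemble diagnostics yield mean and standard deviation for global temperature, OLR, and ice edges (Barnes et al., 2025).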
- SAMOSA (Sparse GCM Parameter Space Sampling): Employs a Sobol-sequence sparse grid in instellation–surface pressure space for a synchronous rotator, comparing intermediate ExoPlaSim runs against ExoCAM to delineate stable, snowball, and runaway greenhouse regimes (Haqq-Misra et al., 2022).
- CAMEMBERT (Mini-Neptune GCM Intercomparison): Uses three cases (Newtonian-forced, dual-grey RT, multi-band RT) applied to GJ 1214b and K2-18b, enforcing shared boundaries (composition, pressure grid, input spectra) and data formats for eight GCMs; outputs include TOA fluxes, 3D fields, and synthetic JWST spectra (Christie et al., 2022).
- THAI (TRAPPIST-1e GCM Spread): Conducts dry and moist atmospheric simulations with four GCMs, fixing planetary parameters and RT setups, quantifying variance in surface temperature, radiative flux, and wind fields (Turbet et al., 2021).
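To make the latitudinal EBM class used in FILLET more concrete, here is a minimal sketch of a diffusive EBM in x = sin(latitude), stepped to equilibrium with explicit Euler. It assumes a linear OLR law, a step-function albedo, and an annual-mean insolation profile; all parameter values are illustrative North-type numbers, not FILLET protocol settings, and time is left in arbitrary units because only the equilibrium state is used.

```python
# Illustrative parameters (assumptions, not FILLET protocol values).
S0 = 1361.0        # stellar flux at the planet, W m^-2
A_OLR = 203.3      # OLR intercept, W m^-2 (temperature in deg C)
B_OLR = 2.09       # OLR slope, W m^-2 K^-1
D = 0.6            # meridional diffusion coefficient, W m^-2 K^-1
C = 1.0            # heat capacity in arbitrary units (only equilibrium is used)
T_ICE = -10.0      # temperature below which a band is treated as ice covered
ALBEDO_ICE, ALBEDO_FREE = 0.6, 0.3


def insolation(x):
    """Annual-mean insolation at x = sin(latitude), second-Legendre approximation."""
    return 0.25 * S0 * (1.0 - 0.482 * 0.5 * (3.0 * x * x - 1.0))


def albedo(temp):
    return ALBEDO_ICE if temp < T_ICE else ALBEDO_FREE


def tendency(temps, sol, dx):
    """dT/dt for every band: absorbed stellar flux - OLR + diffusive convergence."""
    n = len(temps)
    # Diffusive flux at interior cell edges; it stays zero at the poles,
    # where the (1 - x^2) factor vanishes.
    flux = [0.0] * (n + 1)
    for e in range(1, n):
        x_edge = -1.0 + e * dx
        flux[e] = D * (1.0 - x_edge * x_edge) * (temps[e] - temps[e - 1]) / dx
    return [
        (sol[i] * (1.0 - albedo(temps[i])) - (A_OLR + B_OLR * temps[i])
         + (flux[i + 1] - flux[i]) / dx) / C
        for i in range(n)
    ]


def run_to_equilibrium(n=30, dt=0.002, t_end=30.0, t_init=10.0):
    dx = 2.0 / n
    x = [-1.0 + (i + 0.5) * dx for i in range(n)]  # cell centres, equal area in x
    sol = [insolation(xi) for xi in x]
    temps = [t_init] * n
    for _ in range(int(round(t_end / dt))):
        rates = tendency(temps, sol, dx)
        temps = [t + dt * r for t, r in zip(temps, rates)]
    return x, temps


x, temps = run_to_equilibrium()
asr = [insolation(xi) * (1.0 - albedo(ti)) for xi, ti in zip(x, temps)]
olr = [A_OLR + B_OLR * ti for ti in temps]
global_mean_temp = sum(temps) / len(temps)      # bands are equal-area in x
ice_covered_x = [xi for xi, ti in zip(x, temps) if ti < T_ICE]
```

Because the flux form conserves energy exactly, the global means of `asr` and `olr` should agree at equilibrium, which gives a cheap self-check before any parameter sweep. A sweep over instellation or obliquity would wrap `run_to_equilibrium` in a loop and record the ice-covered bands for each case.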
Standardized diagnostics encompass global means, pointwise differences, RMSE, bias, and correlation, with reproducible scripts shared among participants. Measurement of inter-model spread and identification of outlier models are core analysis tasks.
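The diagnostics listed above can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the shared CUISINES scripts: the cosine-latitude weighting, the median-based outlier rule, and every number below are assumptions chosen for the example, and the temperature values are synthetic.

```python
import math
import statistics


def area_weights(lats_deg):
    """Normalized cos(latitude) weights for zonal-mean fields."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(w)
    return [wi / total for wi in w]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights))


def bias(a, b, weights):
    """Area-weighted mean of a - b."""
    return weighted_mean([ai - bi for ai, bi in zip(a, b)], weights)


def rmse(a, b, weights):
    return math.sqrt(weighted_mean([(ai - bi) ** 2 for ai, bi in zip(a, b)], weights))


def correlation(a, b, weights):
    """Area-weighted Pearson correlation between two fields."""
    ma, mb = weighted_mean(a, weights), weighted_mean(b, weights)
    cov = weighted_mean([(ai - ma) * (bi - mb) for ai, bi in zip(a, b)], weights)
    var_a = weighted_mean([(ai - ma) ** 2 for ai in a], weights)
    var_b = weighted_mean([(bi - mb) ** 2 for bi in b], weights)
    return cov / math.sqrt(var_a * var_b)


def flag_outliers(global_means, threshold=3.0):
    """Flag models farther than threshold * MAD from the ensemble median."""
    med = statistics.median(global_means.values())
    mad = statistics.median(abs(v - med) for v in global_means.values())
    return [m for m, v in global_means.items()
            if mad > 0 and abs(v - med) > threshold * mad]


# Synthetic zonal-mean surface temperatures (K) on a coarse latitude grid.
lats = [-75.0, -45.0, -15.0, 15.0, 45.0, 75.0]
model_a = [215.0, 240.0, 268.0, 268.0, 240.0, 215.0]
model_b = [t + 2.0 for t in model_a]   # same pattern, uniform 2 K offset
weights = area_weights(lats)

pair_bias = bias(model_b, model_a, weights)
pair_rmse = rmse(model_b, model_a, weights)
pair_corr = correlation(model_b, model_a, weights)

# Synthetic global means (K) for four hypothetical models.
ensemble = {"model_a": 250.0, "model_b": 252.0, "model_c": 251.0, "model_d": 290.0}
spread = max(ensemble.values()) - min(ensemble.values())
outliers = flag_outliers(ensemble)
```

A uniform offset between two models shows up as bias and RMSE while leaving the pattern correlation at one, which is why the three metrics are usually reported together.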
4. Model Hierarchy, Computational Strategy, and Statistical Synthesis
CUISINES accommodates the entire hierarchy of climate and atmospheric models:
- Energy Balance Models (EBMs): Fast, tractable for large parameter sweeps, used for ensemble means and bifurcation mapping (FILLET).
- Intermediate Complexity GCMs: ExoPlaSim combines spectral dynamics with gray (multi-band) radiative transfer, validated against high-fidelity models for qualitative trends (Paradise et al., 2021).
- Full GCMs: High-resolution spectral radiative transfer (e.g., ExoCAM with HITRAN, LMD-G with sub-Lorentzian CO₂ wings), detailed cloud microphysics, non-hydrostatic dynamical cores (Yang et al., 2019, Sergeev et al., 2023).
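Protocols include cross-model benchmarks and, where computational cost prohibits dense grids, sparse sampling and geostatistical synthesis. Ordinary and universal kriging, with variogram-based covariances, enable interpolation and surrogate construction for expensive simulations. For ordinary kriging, the weights $\lambda_i$ and the predictor $\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$ at an unsampled point $x_0$ are found by solving
$$\sum_{j=1}^{n} \lambda_j\, \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0), \quad i = 1, \dots, n, \qquad \sum_{j=1}^{n} \lambda_j = 1,$$
where $\gamma$ is the variogram function and $\mu$ is the Lagrange multiplier enforcing the unbiasedness constraint (Haqq-Misra et al., 2024). Universal kriging incorporates fast model outputs as drift terms, providing co-synthesized emulators for sparse high-fidelity and dense low-fidelity runs.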
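The ordinary kriging step can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the exponential variogram and its sill and range are not fitted to any data, the sample coordinates are hypothetical normalized (instellation, surface pressure) pairs, and the temperatures are synthetic rather than GCM output.

```python
import numpy as np


def variogram(h, sill=1.0, corr_range=0.5):
    """Exponential variogram model; sill and range are illustrative, not fitted."""
    return sill * (1.0 - np.exp(-h / corr_range))


def ordinary_kriging(points, values, target):
    """Return (estimate, kriging variance, weights) at a single target point."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(points)
    # Pairwise distances between samples, and from each sample to the target.
    d_pp = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d_pt = np.linalg.norm(points - target, axis=-1)
    # Augmented system: variogram block plus the constraint sum(weights) = 1.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(d_pp)
    lhs[n, n] = 0.0
    rhs = np.append(variogram(d_pt), 1.0)
    solution = np.linalg.solve(lhs, rhs)
    weights, mu = solution[:n], solution[n]
    estimate = float(weights @ values)
    variance = float(weights @ rhs[:n] + mu)
    return estimate, variance, weights


# Hypothetical sparse samples in normalized (instellation, surface pressure) space.
samples = [[0.1, 0.2], [0.4, 0.9], [0.8, 0.3], [0.6, 0.6], [0.9, 0.9]]
temps_k = [231.0, 262.0, 287.0, 274.0, 301.0]   # synthetic global-mean temperatures

estimate, variance, weights = ordinary_kriging(samples, temps_k, [0.5, 0.5])
```

With a variogram that has no nugget, the predictor reproduces the sampled values exactly and the kriging variance grows with distance from the samples, which makes the variance a natural guide for where to place the next expensive simulation. Universal kriging would extend the same linear system with additional drift columns, for example outputs from a fast model.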
5. Quantitative Findings, Intermodel Spread, and Implications
CUISINES exo-MIPs document both robust and model-dependent outcomes across climate metrics and observables:
- THAI Ben 1/Ben 2: Four GCMs yield global mean surface temperatures for TRAPPIST-1e that agree to within 7 K (N₂ case) and 6 K (CO₂ case), with radiative fluxes agreeing to within 5% (Turbet et al., 2021).
- FILLET: Ensemble approach exposes sensitivity (e.g., ice edge locations, hysteresis width) to parameterizations of OLR, meridional diffusion, and albedo, advising caution for single-EBM habitability estimates (Barnes et al., 2025).
- SAMOSA: ExoPlaSim and ExoCAM agree for snowball and icy regimes; full-physics GCMs are required for precise diagnosis of moist-greenhouse thresholds and runaway greenhouse, where intermediate GCMs fail numerically (Haqq-Misra et al., 2022).
- Water Vapor/Cloud MIP: Spread in mean surface temperature is as small as 8 K for rapidly rotating, G-star cases but up to 26 K for tidally locked, M-star planets, with differences driven by radiative transfer spectral resolution, cloud treatments, and moisture feedback (Yang et al., 2019).
Such variance in predicted climates and observables maps to uncertainty in synthetic JWST transit and emission signatures. Intercomparison results propagate through retrieval ensembles (RISOTTO) to quantify errors in inferred planetary properties.
6. Best Practices, Data Management, and Pathways for Future Integration
All CUISINES exo-MIPs enforce FAIR data management policies—open repositories (CKAN, Zenodo), standardized NetCDF outputs, rich metadata, and persistent DOIs (Sohl et al., 2024). Diagnostics and experiment summaries are cataloged for long-term accessibility and reproducibility.
Best practice recommendations from cumulative exo-MIP experience include: rigid definition of boundary conditions, use of common line lists (e.g., HITRAN, ExoMol) and spectral files, flexible experiment opt-ins for new model entrants, and integration of synthetic observables for translating climate predictions into observational consequences.
Prospective exo-MIPs are encouraged to draft explicit science drivers, convene planning workshops, publish protocols, and contribute to ensemble statistics for key planetary metrics. By implementing a transparent, scalable framework grounded in rigorous protocol and multilevel model synthesis, CUISINES provides the technical and organizational infrastructure necessary for robust, community-wide evaluation of exoplanet model predictions—positioning the field for rapid progress as direct planetary observations accelerate.