Chempy: Galactic Chemical Evolution Model

Updated 23 January 2026

Chempy is a parametrized, one-zone model that simulates star formation, chemical enrichment, and ISM dynamics in galaxies.
It leverages Bayesian inference and neural network emulators for rapid and rigorous statistical fitting of stellar abundance data.
The framework flexibly incorporates diverse nucleosynthetic yield tables and key parameters like the high-mass IMF slope and SN Ia rates.

Chempy is a parametrized, open-source, one-zone Galactic Chemical Evolution (GCE) model that is widely used as a benchmark tool to interpret and statistically fit galactic elemental abundance data in stellar archaeology. Chempy enables forward and inverse modeling of the chemical enrichment history of galaxies, emphasizing computational efficiency, rigorous Bayesian inference, and flexibility in the choice of nucleosynthetic yield tables, star formation histories, and ISM physics. By combining rapid integrators with neural network emulators, Chempy forms the backbone for scalable Bayesian or simulation-based inference (SBI) pipelines, supporting quantitative constraints on fundamental parameters such as the high-mass slope of the initial mass function (IMF) and the normalization of Type Ia supernova (SN Ia) rates. This makes Chempy a central framework for analyses leveraging large spectroscopic and astrometric surveys (Rybizki et al., 2017, Philcox et al., 2019, Buck et al., 4 Mar 2025).

1. Chemical Evolution Model: Structure and Governing Equations

Chempy adopts a classical one-zone "leaky box" model in which the total ISM gas mass $M_\mathrm{gas}(t)$ and the mass fraction $X_i(t)$ of each tracked element $i$ evolve through star formation, mass return from dying stars, inflow, and outflow (Rybizki et al., 2017, Rybizki, 2018, Philcox et al., 2019). For each element $i$, the main evolution equation is:

$\frac{d[M_\mathrm{gas} X_i]}{dt} = -\psi(t) X_i + E_i(t) + \Gamma_\mathrm{in}(t) X_{i,\mathrm{in}} - \Gamma_\mathrm{out}(t) X_i$

where:

$\psi(t)$: star formation rate (SFR)
$E_i(t)$: mass return in element $i$ from stellar ejecta (includes CC SNe, SNe Ia, AGB stars)
$\Gamma_\mathrm{in}(t)$: inflow rate (e.g., primordial or enriched gas), with $X_{i,\mathrm{in}}$ the mass fraction of element $i$ in the inflowing gas
$\Gamma_\mathrm{out}(t)$: outflow rate (e.g., driven by stellar feedback)

Stellar returns are computed as convolutions over the IMF $\phi(m)$ and stellar lifetimes $\tau(m)$, split into core-collapse SN, SN Ia (with a delay-time distribution, DTD), and AGB channels. The SN Ia rate is:

$R_\mathrm{Ia}(t) = N_\mathrm{Ia} \int_{\tau_\mathrm{min}}^t \psi(t - \tau) \mathrm{DTD}(\tau) d\tau$

with typical DTD $\propto t^{-1}$ (Buck et al., 4 Mar 2025).
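
The convolution above can be evaluated numerically on a uniform time grid. The following is a minimal sketch, not Chempy's own implementation: the function name `snia_rate`, the minimum delay `tau_min = 0.04` Gyr, the normalization of the DTD over 15 Gyr, and the illustrative SFR shape $\psi(t) \propto t\,e^{-t/t_\mathrm{peak}}$ are all assumptions made for this example.

```python
import numpy as np

def snia_rate(t_grid, sfr, n_ia, tau_min=0.04, tau_max=15.0):
    """R_Ia(t) = N_Ia * int_{tau_min}^{t} psi(t - tau) DTD(tau) dtau.

    t_grid must be uniform and start at 0 (times in Gyr).
    tau_min and tau_max are assumed values, not taken from Chempy.
    """
    dt = t_grid[1] - t_grid[0]
    # DTD ~ tau^-1 above tau_min, zero below it; np.maximum avoids 1/0 at tau = 0.
    dtd = np.where(t_grid >= tau_min, 1.0 / np.maximum(t_grid, tau_min), 0.0)
    # Analytic normalisation so the DTD integrates to 1 on [tau_min, tau_max].
    dtd /= np.log(tau_max / tau_min)
    # Discrete convolution, truncated to the length of the time grid.
    return n_ia * np.convolve(sfr, dtd)[: len(t_grid)] * dt

t = np.linspace(0.0, 13.8, 1381)       # Gyr
t_peak = 3.5                           # Gyr, assumed for illustration
sfr = t * np.exp(-t / t_peak)          # illustrative SFR shape
sfr /= sfr.sum() * (t[1] - t[0])       # normalise to 1 Msun formed in total
rate = snia_rate(t, sfr, n_ia=10 ** -2.89)   # SNe Ia per Gyr per Msun formed
```

Because the DTD is sampled on the same grid as the SFR, the accuracy near $\tau_\mathrm{min}$ depends on the time step; a finer grid or a logarithmic delay grid would be needed for precise work.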

Star formation is governed by a Schmidt law,

$\psi(t) = \mathrm{SFE} \times M_\mathrm{gas}(t)$

and the SFR history can be parametrized as a gamma distribution in time with shape $k=2$, peaking at the epoch $\mathrm{SFR}_{\rm peak}$ (Philcox et al., 2017, Horta et al., 2021).
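
To make the structure of the evolution equation concrete, here is a minimal forward-Euler sketch for a single element. It is not Chempy's integrator. It assumes, purely for illustration, instantaneous recycling (a return fraction `r_return` and a net yield `y_net`), a constant inflow rate, and an outflow proportional to the SFR with a hypothetical mass-loading factor `eta`; all parameter names and default values are invented for this example.

```python
def evolve_one_zone(sfe=0.5, inflow=1.0, eta=0.5, r_return=0.4, y_net=0.01,
                    x_in=0.0, m_gas0=1.0, t_end=13.8, n_steps=13800):
    """Forward-Euler integration of a toy one-zone model (times in Gyr).

    Returns a list of (time, gas mass, element mass fraction) tuples.
    """
    dt = t_end / n_steps
    m_gas, m_elem = m_gas0, 0.0          # gas mass and mass of element i
    history = []
    for step in range(n_steps):
        x = m_elem / m_gas               # current mass fraction X_i
        psi = sfe * m_gas                # linear Schmidt law
        e_tot = r_return * psi           # total mass returned by stars
        # Returned element mass: recycled material plus newly synthesised yield.
        e_i = r_return * psi * x + y_net * (1.0 - r_return) * psi
        gamma_out = eta * psi            # assumed outflow law
        d_gas = -psi + e_tot + inflow - gamma_out
        d_elem = -psi * x + e_i + inflow * x_in - gamma_out * x
        m_gas += d_gas * dt
        m_elem += d_elem * dt
        history.append(((step + 1) * dt, m_gas, m_elem / m_gas))
    return history

history = evolve_one_zone()
t_final, m_final, x_final = history[-1]
```

With these assumptions the mass fraction relaxes toward $X_i = y\,(1-R)/(1-R+\eta)$, which provides a simple analytic check on the integration.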

2. Parameter Space: SSP, ISM, and Priors

Chempy parametrizes both stellar and ISM physics for inference flexibility. The key free parameters fall into two groups (Rybizki et al., 2017, Buck et al., 4 Mar 2025, Philcox et al., 2019):

Global Simple Stellar Population (SSP) Parameters:
- $\alpha_\mathrm{IMF}$: high-mass IMF slope (Chabrier/Kroupa form; prior $\mathcal{N}(-2.3, 0.3)$)
- $\log_{10}(N_\mathrm{Ia})$: SN Ia normalization, i.e., the number of SNe Ia per $M_\odot$ formed in 15 Gyr (prior $\mathcal{N}(-2.89, 0.3)$)
Local ISM Parameters (may vary star-to-star, bin-to-bin, or zone-to-zone):
- $\log_{10}(\mathrm{SFE})$: star formation efficiency (prior $\mathcal{N}(-0.3, 0.3)$)
- $\log_{10}(\mathrm{SFR}_\mathrm{peak})$: peak epoch of SFR (prior $\mathcal{N}(0.55, 0.1)$)
- $x_\mathrm{out}$: outflow fraction of stellar ejecta (prior $\mathcal{N}(0.5, 0.1)$)
- $T_i$: stellar formation time (e.g., uniform $[1, 13.8]$ Gyr)

The influence of these parameters is highly interpretable: a steeper (more negative) $\alpha_\mathrm{IMF}$ gives a bottom-heavy IMF (fewer massive stars, less $\alpha$-enrichment), $N_\mathrm{Ia}$ tunes iron production and the decline of $[\alpha/\mathrm{Fe}]$, a high SFE drives rapid enrichment, and $x_\mathrm{out}$ controls metal retention (Rybizki et al., 2017, Philcox et al., 2017).

Bayesian or hierarchical frameworks allow each parameter to float independently, with Gaussian priors (derived from literature or population synthesis) and hard bounds for physicality (Rybizki et al., 2017, Rybizki, 2018, Buck et al., 4 Mar 2025).
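
As an illustration of how such priors can be drawn in practice, the sketch below samples the six parameters listed above using only the Python standard library. The Gaussian means and widths are the ones quoted in this section; the hard bounds in `BOUNDS` and the dictionary keys are hypothetical choices for this example, and rejection sampling is just one simple way to enforce bounds.

```python
import random

# (mean, sigma) of the Gaussian priors quoted above.
PRIORS = {
    "alpha_imf": (-2.3, 0.3),
    "log10_n_ia": (-2.89, 0.3),
    "log10_sfe": (-0.3, 0.3),
    "log10_sfr_peak": (0.55, 0.1),
    "x_out": (0.5, 0.1),
}
# Hypothetical hard bounds enforcing physical values (assumed, not from Chempy).
BOUNDS = {"alpha_imf": (-4.0, -1.0), "x_out": (0.0, 1.0)}
T_RANGE = (1.0, 13.8)  # Gyr, uniform prior on the stellar formation time

def sample_prior(rng):
    """Draw one parameter vector; out-of-bounds Gaussian draws are rejected."""
    theta = {}
    for name, (mu, sigma) in PRIORS.items():
        lo, hi = BOUNDS.get(name, (float("-inf"), float("inf")))
        while True:
            value = rng.gauss(mu, sigma)
            if lo <= value <= hi:
                break
        theta[name] = value
    theta["t_i"] = rng.uniform(*T_RANGE)
    return theta

rng = random.Random(0)
draws = [sample_prior(rng) for _ in range(1000)]
```

Draws of this kind are what a training set for an emulator or a neural posterior estimator would be built from, with one forward-model evaluation per draw.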

3. Nucleosynthetic Yields and Physical Inputs

Chempy’s accuracy is underpinned by explicit choices of nucleosynthetic yields for each enrichment channel (Rybizki et al., 2017, Philcox et al., 2017, Blancato et al., 2019). Supported yield tables include:

Core-Collapse SNe: Nomoto et al. (2013), Chieffi & Limongi (2004), Prantzos et al. (2018, rotating stars), Kobayashi et al. (2006), Ritter et al. (2017).
Type Ia SNe: Seitenzahl et al. (2013), Thielemann et al. (2003), Iwamoto et al. (1999; W7 model), matched to a DTD (commonly $t^{-1.1}$ or $t^{-1}$ for $t > \tau_\mathrm{min}$).
AGB Winds: Karakas (2010), Ventura (2013), Pignatari (2013).

Yield tables are pre-tabulated over mass and metallicity, and Chempy applies linear interpolation and, when supported, switches between net and gross yields (Philcox et al., 2017, Rybizki, 2018). Yield choice is treated as a hyperparameter, and the model can marginalize or optimize over competing tables, quantifying the impact of yield uncertainties directly in the posterior (Philcox et al., 2017). Statistical scoring systems (Bayesian evidence, LOO-CV) enable yield table selection based on proto-solar (or other) abundance reproduction.
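
The interpolation step can be sketched as follows. This is an illustrative example only: the grid points and Mg yields in the toy table are invented numbers, not values from any published table, and the helper names are hypothetical. The net-to-gross conversion uses the usual convention that the gross yield equals the newly synthesized (net) mass plus the initial abundance times the total ejected mass.

```python
import numpy as np

# Toy grid; all numbers below are invented for illustration.
masses = np.array([13.0, 20.0, 40.0])        # stellar mass in Msun
metallicities = np.array([0.001, 0.02])      # metal mass fraction Z
# mg_yield[j, k]: ejected Mg mass (Msun) at metallicities[j], masses[k]
mg_yield = np.array([[0.05, 0.15, 0.30],
                     [0.07, 0.20, 0.35]])

def interpolate_yield(table, mass, z):
    """Bilinear interpolation: first along mass, then along metallicity.

    np.interp clamps to the edge values, so nothing is extrapolated
    beyond the tabulated grid.
    """
    at_mass = np.array([np.interp(mass, masses, row) for row in table])
    return float(np.interp(z, metallicities, at_mass))

def net_to_gross(net_yield, x_initial, m_ejected):
    """Gross yield = net yield + initial mass fraction * total ejected mass."""
    return net_yield + x_initial * m_ejected

mg_16 = interpolate_yield(mg_yield, mass=16.5, z=0.001)
```

Clamping at the grid edges is a conservative choice; whether to extrapolate in mass or metallicity instead is a modeling decision that affects the predicted abundances.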

4. Inference Machinery: Bayesian Methods and Neural Emulation

Chempy was designed for full Bayesian parameter inference, utilizing Gaussian likelihoods for observed vs. modeled abundances, ages, or other constraints, augmented by model error parameters to capture systematic deficiencies (Rybizki et al., 2017, Philcox et al., 2019). The log-posterior is:

$\log P(\theta|O, \lambda) = \log P(\theta) + \log L(O|d(\theta, \lambda))$

where $\theta$ represents all free parameters, $\lambda$ specifies the yield table, and $O$ and $d(\theta, \lambda)$ denote the observed and modeled data, respectively. MCMC sampling uses an affine-invariant ensemble sampler (emcee) or, for higher efficiency, Hamiltonian Monte Carlo (HMC; e.g., PyMC3) (Philcox et al., 2019).
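
A minimal sketch of this log-posterior, up to the constant evidence term, is given below. It assumes independent Gaussian priors and a Gaussian likelihood in which a single model error `sigma_model` is added in quadrature to the observational errors; the function names and the `model` callable (standing in for a Chempy run or an emulator with a fixed yield table) are hypothetical.

```python
import math

def log_gauss(x, mu, sigma):
    """Log-density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(observed, sigma_obs, predicted, sigma_model=0.0):
    """Gaussian log-likelihood; model error is added in quadrature (assumption)."""
    total = 0.0
    for obs, err, pred in zip(observed, sigma_obs, predicted):
        total += log_gauss(obs, pred, math.hypot(err, sigma_model))
    return total

def log_posterior(theta, priors, observed, sigma_obs, model, sigma_model=0.0):
    """log P(theta | O) = log P(theta) + log L(O | d(theta)), without the evidence."""
    log_prior = sum(log_gauss(theta[name], mu, sig) for name, (mu, sig) in priors.items())
    return log_prior + log_likelihood(observed, sigma_obs, model(theta), sigma_model)

# Toy usage with a stand-in forward model (not Chempy).
toy_priors = {"a": (0.0, 1.0)}
toy_model = lambda th: [th["a"], 2.0 * th["a"]]
value = log_posterior({"a": 0.0}, toy_priors, [0.0, 0.0], [1.0, 1.0], toy_model)
```

A function of this form can be passed directly to a sampler that expects a log-probability callable, which is how ensemble samplers such as emcee are typically driven.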

Due to the cost of repeated Chempy ODE integrations, neural network emulators are trained to replace forward model evaluations, reducing run times from seconds per call to well under a millisecond. Architectures typically take a 6-dimensional input (the model parameters), pass it through two hidden layers, and output the predicted set of $[\mathrm{X/Fe}]$ and $[\mathrm{Fe/H}]$ values, achieving median errors well below the observational scatter (e.g., $\sim 0.005$ dex) (Buck et al., 4 Mar 2025, Philcox et al., 2019). Emulators are trained on $10^5$–$10^6$ Chempy runs sampled across the prior.
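
The shape of such an emulator can be illustrated with a small forward pass written in NumPy. This sketch only shows the data flow from six parameters to a vector of abundances: the hidden-layer widths (100 and 40), the tanh activation, the nine outputs, and the random untrained weights are all assumptions for this example. A real emulator would be trained on Chempy runs, typically in a framework such as PyTorch.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def emulate(params, theta):
    """Forward pass: tanh on the hidden layers, linear output layer."""
    h = np.atleast_2d(theta)
    for weights, bias in params[:-1]:
        h = np.tanh(h @ weights + bias)
    weights, bias = params[-1]
    return h @ weights + bias

rng = np.random.default_rng(0)
# 6 inputs -> two hidden layers -> 9 abundances (layer sizes are assumed).
params = init_mlp([6, 100, 40, 9], rng)
batch = rng.normal(size=(1000, 6))     # 1000 parameter vectors at once
abundances = emulate(params, batch)    # shape (1000, 9)
```

Evaluating a whole batch in a single matrix product is what makes emulator calls so much cheaper than running the forward model once per parameter vector.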

Recent advances embed Chempy within simulation-based inference (SBI) frameworks using normalizing flows (e.g., Masked Autoregressive Flows, MAF) as Neural Posterior Estimators (NPEs), enabling amortized inference for $\gtrsim 10^3$ stars in seconds (Buck et al., 4 Mar 2025). The SBI approach enables a $\sim75,600\times$ speed-up compared to HMC for a comparable inference task, with posterior errors at the $\lesssim0.05\%$ level (Buck et al., 4 Mar 2025).

5. Applications and Empirical Results

Chempy supports diverse applications in Galactic archaeology and model validation:

Solar and Local Constraints: Fitting only protosolar abundances and a few additional constraints tightly shrinks the posteriors on $\alpha_\mathrm{IMF}$ and $N_\mathrm{Ia}$ ($\alpha_\mathrm{IMF} \simeq -2.42 \pm 0.06$; $N_\mathrm{Ia} = 0.5$–$1.4$ per $10^3 M_\odot$ formed), with multi-zone extensions enabling joint fits to the Sun, Arcturus, and local B-stars (Rybizki et al., 2017).
Survey-Scale Fitting: Analysis of up to 1,000 stars, using mock data or catalog stars (e.g., APOGEE, GALAH), achieves precise recovery of input parameters, robust to moderate yield mis-specification. As an illustration, for Chempy-generated and IllustrisTNG-simulated data, HMC or SBI recovers $\alpha_\mathrm{IMF}$ and $\log_{10}(N_\mathrm{Ia})$ within $<0.05\%$ of the ground truth (Buck et al., 4 Mar 2025, Philcox et al., 2019).
Spatial Gradients: Fitting in bins of $[\mathrm{Fe/H}]$, $[\mathrm{Mg/Fe}]$, and age across the disk reveals systematic variations in the high-mass IMF slope (steeper in the outer disk), as well as mild variation in $N_\mathrm{Ia}$ and SFE (Horta et al., 2021).
Yield Table Discrimination: Bayesian evidence and LOO-CV scores rank Prantzos et al. (2018, with rotation) and Chieffi & Limongi (2004) as best fitting proto-solar abundances among modern CC SN tables (Philcox et al., 2017).
Hydrodynamical Simulation Calibration: Plug-in of Chempy-inferred SSP parameters into cosmological hydrodynamical simulations (e.g., AREPO) dramatically improves alignment of simulated abundance distributions with real data, correcting systematic offsets in $[\alpha/\mathrm{Fe}]$ tracks (Philcox et al., 2017).

6. Limitations and Systematic Uncertainties

Several sources of systematic uncertainty and model limitation are recurrently identified:

Yield Table Sensitivity: Inaccurate or incomplete yields (e.g., for Si, K, Ti, V) cannot be compensated by parameter variation alone; systematic element-level discrepancies persist at the $\gtrsim 0.1$ dex level across all parameter settings and yield choices (Rybizki et al., 2017, Blancato et al., 2019).
One-Zone Assumption: Radial flows, multi-zone ISM structure, and migration effects are not modeled natively; fits are generally performed in small abundance-age bins to minimize this limitation, but model misspecification remains (Horta et al., 2021).
High-Dimensional Degeneracies: Abundance data for only a few elements and stars leave substantial degeneracy, especially for $\mathrm{SFE}$, $\mathrm{SFR}_{\rm peak}$, and the outflow fraction, while $\alpha_\mathrm{IMF}$ and $N_\mathrm{Ia}$ are best constrained (Philcox et al., 2019).
Model Error Term: Explicit modeling of element-dependent and global model errors (e.g., $\sigma_\mathrm{model}^j$ or $\sigma_{\rm m}$) is necessary, both to prevent mis-calibration from overconfident posteriors and to correctly handle yield/data discrepancies (Philcox et al., 2019, Philcox et al., 2017).
Calls for Enhanced Frameworks: Persistent discrepancies between model and spectroscopic survey data motivate the development of data-driven yield calibration frameworks and multi-zone chemical evolution models for next-generation inference (Blancato et al., 2019).

7. Practical Usage, Workflow, and Reproducibility

Chempy is modular, open source, and can be rapidly adapted to different scientific use cases (Philcox et al., 2017):

Typical Workflow:

Define yield tables and elemental set.
Specify priors for global and local parameters (SSP, ISM, times).
Generate training/simulation sets ($10^5$–$10^6$ samples) for the emulator or NPE (add observational noise).
Train neural emulator (unless public weights suffice).
Train NPE on (data, parameter) pairs; validate via simulation-based calibration (SBC, TARP).
For observed data: evaluate NPE to obtain the posterior $p(\theta|D)$ per star/bin.
Combine posteriors (with appropriate correction for prior "over-counting") or refit collective posteriors as a multivariate Gaussian (Buck et al., 4 Mar 2025, Philcox et al., 2019).
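
The prior "over-counting" correction in the last step can be made explicit for the Gaussian case. If each of $N$ stars yields a posterior $p(\theta|D_i) \propto p(\theta)\,p(D_i|\theta)$ for a shared parameter, multiplying them counts the prior $N$ times, so the joint posterior is $p(\theta|D_{1..N}) \propto p(\theta)^{1-N} \prod_i p(\theta|D_i)$. The sketch below applies this to one-dimensional Gaussian posteriors; the function name and the example numbers are hypothetical, and real per-star posteriors need not be Gaussian.

```python
import math

def combine_gaussian_posteriors(means, sigmas, prior_mean, prior_sigma):
    """Combine per-star Gaussian posteriors on a shared parameter.

    Implements p(theta | all) ~ p(theta)^(1 - N) * prod_i p(theta | D_i),
    which removes the N - 1 extra copies of the Gaussian prior.
    """
    n = len(means)
    precision = sum(1.0 / s ** 2 for s in sigmas) - (n - 1) / prior_sigma ** 2
    if precision <= 0.0:
        raise ValueError("per-star posteriors are not narrower than the prior")
    weighted = (sum(m / s ** 2 for m, s in zip(means, sigmas))
                - (n - 1) * prior_mean / prior_sigma ** 2)
    return weighted / precision, 1.0 / math.sqrt(precision)

# Three hypothetical per-star posteriors on alpha_IMF, with the N(-2.3, 0.3) prior.
mean, sigma = combine_gaussian_posteriors(
    means=[-2.25, -2.35, -2.30], sigmas=[0.15, 0.15, 0.15],
    prior_mean=-2.3, prior_sigma=0.3)
```

For these inputs the combined estimate is centered on $-2.3$ with a width of about $0.095$, narrower than any single-star posterior, as expected when independent constraints on a shared parameter are merged.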

Dependencies and Implementation: Python 3, numpy, scipy, PyTorch, "sbi" package (for SBI/flows), emcee or PyMC3 for MCMC/HMC, scikit-monaco for evidence. Tutorials and public codebases (e.g., https://github.com/oliverphilcox/ChempyScoring) provide detailed practical guidance (Philcox et al., 2017).
Best Practices and Recommendations:
- Careful selection of yield tables and elements; ensure elements are robustly predicted by existing yields.
- Always perform calibration checks (SBC, TARP) before inference on real data.
- Monitor and quantify degeneracies, especially $\alpha_\mathrm{IMF}$ vs. $N_\mathrm{Ia}$ anti-correlations.
- Use sequential methods or hierarchical modeling to scale to extensive photometric/kinematic samples (Buck et al., 4 Mar 2025).

Chempy, in combination with emulator-based SBI, forms a foundational architecture for Milky Way chemical evolution studies, allowing large-scale, statistically rigorous parameter inference, yield calibration, and even feedback into cosmological simulation subgrid prescriptions (Buck et al., 4 Mar 2025, Philcox et al., 2017).