Quijote Suite: Cosmological N-body Simulations
- Quijote is a comprehensive suite of cosmological N-body simulations that samples a seven-dimensional parameter space for large-scale structure studies.
- It employs finite-difference derivative sets, Latin-hypercube ensembles, and primordial non-Gaussianity variants to rigorously probe cosmological observables.
- The suite delivers diverse data products—power spectra, bispectra, covariances, and catalogs—supporting Fisher forecasting and machine learning-based analysis.
The Quijote suite of N-body simulations is the largest ensemble of full cosmological N-body simulations created to date, designed explicitly to map the sensitivity of large-scale structure observables with high fidelity and to support data-hungry machine learning tools for cosmological inference. Encompassing 44,100 simulations across more than 7,000 cosmological models, Quijote probes the $\{\Omega_m, \Omega_b, h, n_s, \sigma_8, M_\nu, w\}$ hyperplane with two aims: quantifying the information content of observables precisely and enabling the training of likelihood-free inference techniques. The Quijote-PNG sub-suite adds variants with controlled primordial non-Gaussianity, supporting joint studies of cosmological and early-universe parameters. All simulation outputs, including particle data, derived catalogs, and summary statistics, are publicly released and accessible for broad computational research (Villaescusa-Navarro et al., 2019, Coulton et al., 2022).
1. Cosmological Parameter Coverage and Suite Composition
Quijote systematically samples a seven-dimensional cosmological parameter space: total matter density ($\Omega_m$), baryon density ($\Omega_b$), reduced Hubble parameter ($h$), scalar spectral index ($n_s$), amplitude of linear fluctuations ($\sigma_8$), sum of neutrino masses ($M_\nu$), and dark energy equation of state ($w$). Simulations are organized into several categories:
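- Fiducial Model: Planck 2018 best-fit parameters; $\Omega_m = 0.3175$, $\Omega_b = 0.049$, $h = 0.6711$, $n_s = 0.9624$, $\sigma_8 = 0.834$, $M_\nu = 0$ eV, $w = -1$.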
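- Finite-Difference Derivative Sets: Pairs of simulations with $+\delta\theta$ and $-\delta\theta$ perturbations for each parameter $\theta$, with typical fractional shifts of 2–5% depending on the parameter. For $M_\nu$ derivatives, where the fiducial value is zero and a negative step is unphysical, one-sided differences at $M_\nu = 0.1$, $0.2$, and $0.4$ eV are employed with higher-order finite-difference formulas.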
- Latin-Hypercube Ensembles:
- LH: 6,000 realizations sampling 6D space with , ;
- LH : 5,000 realizations sampling the full 7D range, eV, .
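The derivative sets described in the list above can be illustrated with a short sketch. It compares a two-sided central difference with two one-sided (forward) formulas of the kind needed when the fiducial value sits at a physical boundary, as for the neutrino mass. The function names and the toy statistic are hypothetical; the stencils are standard finite-difference formulas and are not claimed to be the exact ones used in the Quijote pipeline.

```python
def central_difference(s_plus, s_minus, step):
    """Two-sided derivative from a +step / -step simulation pair."""
    return [(p - m) / (2.0 * step) for p, m in zip(s_plus, s_minus)]


def forward_three_point(s0, s1, s2, step):
    """One-sided derivative from S(0), S(step), S(2*step); error O(step^2)."""
    return [(-3.0 * a + 4.0 * b - c) / (2.0 * step)
            for a, b, c in zip(s0, s1, s2)]


def forward_four_point(s0, s1, s2, s4, step):
    """One-sided derivative from S(0), S(step), S(2*step), S(4*step).

    The weights cancel the quadratic and cubic Taylor terms, so the
    result is exact for any statistic that is cubic in the parameter.
    """
    return [(-21.0 * a + 32.0 * b - 12.0 * c + d) / (12.0 * step)
            for a, b, c, d in zip(s0, s1, s2, s4)]


def toy_statistic(x):
    """Hypothetical two-bin summary statistic; true derivative at 0 is [2, -1]."""
    return [1.0 + 2.0 * x - 0.5 * x**2 + 0.3 * x**3,
            4.0 - 1.0 * x + 0.2 * x**2]


step = 0.1
d_central = central_difference(toy_statistic(step), toy_statistic(-step), step)
d_forward3 = forward_three_point(toy_statistic(0.0), toy_statistic(step),
                                 toy_statistic(2 * step), step)
d_forward4 = forward_four_point(toy_statistic(0.0), toy_statistic(step),
                                toy_statistic(2 * step), toy_statistic(4 * step),
                                step)
print("central      :", d_central)
print("forward 3-pt :", d_forward3)
print("forward 4-pt :", d_forward4)
```

In a real analysis each input list would be a statistic averaged over many realizations that share initial phases across the perturbed cosmologies, so that sample variance largely cancels in the differences.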
The Quijote-PNG suite appends 4,000 simulations with primordial non-Gaussianity (PNG) in four shapes (local, equilateral, and two orthogonal variants), with $f_{\rm NL} = \pm 100$, supplementing the fiducial cosmology for joint PNG and $\Lambda$CDM analysis (Coulton et al., 2022).
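For the Latin-hypercube ensembles listed earlier in this section, the sampling idea can be sketched in a few lines: each parameter range is cut into as many equal strata as there are simulations, and every stratum is used exactly once per parameter. This is a minimal standard-library sketch; the function name and the parameter ranges are illustrative assumptions, not the suite's documented priors or its actual sampling code.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so each dimension has one point per stratum.

    bounds is a list of (low, high) tuples, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent stratum order per dimension
        columns.append([low + (high - low) * (s + rng.random()) / n_samples
                        for s in strata])
    # Transpose: one row per simulation, one column per parameter.
    return [list(row) for row in zip(*columns)]


# Illustrative ranges only (assumptions made for this example).
ranges = {"Omega_m": (0.1, 0.5), "h": (0.5, 0.9), "sigma_8": (0.6, 1.0)}
samples = latin_hypercube(5, list(ranges.values()), seed=42)
for row in samples:
    print(dict(zip(ranges, (round(v, 4) for v in row))))
```

Compared with a regular grid, this design covers each one-dimensional range evenly with far fewer simulations, which is why it suits emulator and neural-network training sets.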
2. Simulation Specifications
Each Quijote simulation evolves $N$-body particles in a periodic, cubic volume of side $1\,h^{-1}\,\mathrm{Gpc}$, at resolutions of $256^3$, $512^3$, or $1024^3$ particles per box. The total simulated volume exceeds $44{,}000\,(h^{-1}\,\mathrm{Gpc})^3$, containing over $8.5\times10^{12}$ particles at a single redshift. The majority of simulations use $512^3$ particles per species, with neutrino particles included for nonzero $M_\nu$ scenarios. The computation required more than 35 million core hours on supercomputing facilities (Villaescusa-Navarro et al., 2019).
The Quijote-PNG extension uses $512^3$ particles per box, corresponding to a particle mass of about $6.6\times10^{11}\,h^{-1}M_\odot$. Initial conditions are generated at $z = 127$ via 2LPT (second-order Lagrangian perturbation theory) and evolved with the Gadget-3 TreePM gravity code (Coulton et al., 2022).
Snapshots are recorded at , preserving position, velocity, and unique particle IDs.
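The mass resolution of a run follows directly from the box size, the particle count, and the matter density. The sketch below assumes a box of side 1000 Mpc/h, $\Omega_m = 0.3175$, and a critical density of $2.775\times10^{11}\,h^2 M_\odot\,\mathrm{Mpc}^{-3}$; these inputs and the function name are assumptions made for illustration. It treats all matter as one particle species, so runs with separate neutrino particles would need the CDM plus baryon density instead.

```python
RHO_CRIT = 2.775e11  # critical density today, in h^2 Msun / Mpc^3


def particle_mass(omega_m, box_size, n_per_side):
    """Mass per particle in Msun/h.

    box_size is in Mpc/h and the box holds n_per_side**3 particles,
    so the total matter mass in the box is shared equally among them.
    """
    total_mass = omega_m * RHO_CRIT * box_size**3
    return total_mass / n_per_side**3


OMEGA_M = 0.3175   # assumed fiducial matter density
BOX_SIZE = 1000.0  # assumed box side in Mpc/h

for n in (256, 512, 1024):
    print(f"{n}^3 particles: {particle_mass(OMEGA_M, BOX_SIZE, n):.3e} Msun/h")
```

Doubling the number of particles per side lowers the particle mass by a factor of eight, which is the trade-off between the low-, fiducial-, and high-resolution runs.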
3. Data Products and Analysis Outputs
As summarized below, the Quijote and Quijote-PNG suites provide a comprehensive set of data products for both direct simulation analysis and summary-level inference.
| Product Type | Description | Format |
|---|---|---|
| Particle Snapshots | Positions/velocities/IDs at 6 redshifts per sim (Gadget-II/HDF5) | Gadget-binary/HDF5 |
| Halo Catalogs | FoF halos (), , | Custom binary/HDF5 |
| Void Catalogs | Centers, radii, masses; Banerjee & Dalal algorithm | HDF5 |
| Power Spectra | $P(k)$ for matter/halos, real and redshift space (monopole/quadrupole/hexadecapole) | ASCII/HDF5 |
| Bispectra | FFT-based estimators for matter/halos | ASCII/HDF5 |
| PDFs | One-point PDFs for matter, CDM, halos; Gaussian smoothing | ASCII/HDF5 |
| Marked Statistics | 125 variants of marked power spectra, 2-pt functions, mark histograms | ASCII/HDF5 |
| Covariances | Full covariance of $P(k)$, $B(k_1, k_2, k_3)$, and joint probes | HDF5 |
Statistics are computed using the Pylians analysis libraries and custom FFT-based estimators, and the full measurement pipeline is documented. Additional outputs include super-sample responses, void size functions, and halo mass functions. Data for Quijote and Quijote-PNG are organized by parameter block, redshift, and statistic, and are openly accessible via the project's GitHub repository and Read the Docs documentation (Villaescusa-Navarro et al., 2019, Coulton et al., 2022).
4. Measurement and Covariance Methodology
Field assignment utilizes cloud-in-cell (CIC) methods on grids, ensuring negligible window function aliasing for . Power spectra estimators average over modes in thin -bins; bispectra are measured via brute-force summation over FFT triplets constrained to , with bin width , .
Covariance modeling includes:
- Gaussian Covariance (): Diagonal, calculated via modes counting.
- Non-Gaussian Connected Part (): Measured from 12,500 Quijote realizations, accounting for mode-coupling.
- Super-Sample Covariance (SSC, ): Determined by the separate-universe approach, estimating responses , , where is the mean density fluctuation across the simulated box. This is essential for , where off-diagonal and SSC terms can dominate the error budget (Coulton et al., 2022).
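The mode-counting estimate of the Gaussian term in the list above can be written down in a few lines. The sketch uses the standard thin-shell approximation, in which the number of Fourier modes in a bin is the shell volume divided by the volume of one fundamental cell, and the variance of the binned power spectrum is $2P(k)^2/N_k$. The function names, the bin width of one fundamental frequency, and the example value of $P(k)$ are assumptions for illustration.

```python
import math


def modes_in_shell(k, dk, box_size):
    """Approximate number of Fourier modes in a shell [k - dk/2, k + dk/2].

    Counts both k and -k; valid when the shell is thin and k is well
    above the fundamental frequency 2*pi/box_size.
    """
    k_fund = 2.0 * math.pi / box_size
    return 4.0 * math.pi * k**2 * dk / k_fund**3


def gaussian_variance(pk, k, dk, box_size):
    """Diagonal Gaussian covariance of the binned power spectrum."""
    return 2.0 * pk**2 / modes_in_shell(k, dk, box_size)


BOX_SIZE = 1000.0                    # assumed box side in Mpc/h
K_FUND = 2.0 * math.pi / BOX_SIZE    # fundamental frequency in h/Mpc
k, pk = 0.1, 5000.0                  # illustrative bin centre and power

n_modes = modes_in_shell(k, K_FUND, BOX_SIZE)
sigma = math.sqrt(gaussian_variance(pk, k, K_FUND, BOX_SIZE))
print(f"modes in bin: {n_modes:.1f}")
print(f"fractional error on P(k): {sigma / pk:.4f}")
```

This term is purely diagonal, so any off-diagonal structure in a covariance measured from the simulations comes from the connected and super-sample contributions.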
5. Applications and Analytical Utility
The Quijote suite serves as a foundation for several key research applications:
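- Fisher Matrix Forecasting: Numerical derivatives computed from simulations with phase-matched seeds minimize sample variance in the derivative estimates, which is essential for Fisher matrix analyses of cosmological parameter constraints:
  $$F_{ij} = \left(\frac{\partial \mathbf{O}}{\partial \theta_i}\right)^{T} C^{-1}\, \frac{\partial \mathbf{O}}{\partial \theta_j}$$
  Here $\mathbf{O}$ denotes the vector of observables, such as $P(k)$ and $B(k_1, k_2, k_3)$, $\theta_i$ are the cosmological parameters, and the covariance $C$ includes all measured and SSC terms. Notably, the information in $P(k)$ saturates on nonlinear scales because of nonlinear mode coupling and parameter degeneracies (Villaescusa-Navarro et al., 2019). PNG constraints show the same pattern: joint power spectrum and bispectrum analyses break degeneracies, but with diminishing returns toward smaller scales (Coulton et al., 2022).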
- Machine Learning and Emulators: Quijote provides sufficient volume for robust training of likelihood-free inference engines, such as random forests and information-maximizing neural nets (IMNN). Super-resolution and forward-modeling convolutional networks have been trained to bridge low- and high-resolution observables, and paired-fixed simulations accelerate such training by reducing cosmic variance (Villaescusa-Navarro et al., 2019).
- Non-Gaussian Statistics: Applications include wavelet scattering transforms (WST), phase harmonics, and higher-order field statistics to encode non-Gaussian information absent from traditional summary statistics.
- Primordial Non-Gaussianity and Joint Inference: Quijote-PNG supports the computation of response derivatives along multiple parameter directions, enabling the construction and validation of optimal estimators for joint $\Lambda$CDM and PNG parameter recovery at the accuracy required for upcoming LSS surveys (Coulton et al., 2022).
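The Fisher forecasting application in the list above combines derivatives of the observables with their covariance as $F_{ij} = (\partial\mathbf{O}/\partial\theta_i)^T C^{-1} (\partial\mathbf{O}/\partial\theta_j)$. The sketch below assembles that matrix and the marginalized errors $\sqrt{(F^{-1})_{ii}}$ for a toy problem with three data bins and two parameters. All numbers and function names are hypothetical, and a small Gaussian-elimination solver stands in for the linear-algebra library a real analysis would use.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fisher_matrix(derivatives, covariance):
    """F_ij = dO/dtheta_i . C^{-1} . dO/dtheta_j for a list of derivative vectors."""
    weighted = [solve(covariance, d) for d in derivatives]  # C^{-1} dO/dtheta_j
    return [[sum(a * b for a, b in zip(d_i, w_j)) for w_j in weighted]
            for d_i in derivatives]


def marginalized_errors(fisher):
    """Return sqrt of the diagonal of the inverse Fisher matrix."""
    n = len(fisher)
    errors = []
    for i in range(n):
        unit = [1.0 if j == i else 0.0 for j in range(n)]
        errors.append(math.sqrt(solve(fisher, unit)[i]))
    return errors


# Toy inputs (assumptions): covariance of 3 bins, derivatives for 2 parameters.
covariance = [[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 2.0]]
derivatives = [[1.0, 1.0, 0.0],
               [1.0, 0.0, 2.0]]

fisher = fisher_matrix(derivatives, covariance)
errors = marginalized_errors(fisher)
print("Fisher matrix:", fisher)
print("marginalized errors:", errors)
```

Solving against the covariance rather than inverting it explicitly keeps the arithmetic stable; with a covariance estimated from a finite number of simulations, a real forecast would also have to account for noise in that estimate.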
6. Data Access, Public Resources, and Analysis Tools
All primary snapshots, catalogs, summary statistics, and covariances from the Quijote and Quijote-PNG suites are available with detailed documentation and code notebooks. Data are browsable and downloadable via:
- https://github.com/franciscovillaescusa/Quijote-simulations
- https://quijote-simulations.readthedocs.io/en/latest/png.html
The Pylians3 Python library supports I/O and analysis, including sample usage for reading and computing summary statistics. For customized initial conditions with controlled PNG, the 2LPTPNG code is provided. In addition, public Jupyter notebooks facilitate rebuilding of spectra, bispectra, covariances, and Fisher analyses using the released data (Villaescusa-Navarro et al., 2019, Coulton et al., 2022).
7. Context within Large-Scale Structure Research
The scale, diversity, and openness of Quijote make it a fundamental resource for LSS precision cosmology, machine learning structure inference, and PNG detection methodology. The combination of derivative blocks, high-volume covariance sets, Latin-hypercube coverage, and controlled non-Gaussian ICs is unique among public N-body suites.
A plausible implication is that, given the convergence of power spectrum and bispectrum information at moderate $k$, future advances in cosmological parameter inference will depend critically on improved modeling of covariance, including SSC, and on the synthesis of non-Gaussian statistics and machine-learning-driven summary construction enabled by Quijote's scale and design (Villaescusa-Navarro et al., 2019, Coulton et al., 2022).