
COLA-f: Cosmological Simulations & Conformal Prediction

Updated 19 November 2025
  • COLA-f is a dual-purpose method: in cosmology, it simulates scale-dependent structure growth for modified gravity, while in statistics it provides exact predictive inference via full-conformal allocation.
  • In cosmological simulations, COLA-f combines k-dependent Lagrangian perturbation theory with an efficient screening solver to reproduce matter power spectra and halo mass functions within 1–2% accuracy at a fraction of full N-body computational cost.
  • In predictive inference, COLA-f optimally allocates miscoverage to ensure exact finite-sample coverage, though its steep computational scaling restricts its use to smaller datasets or few classification labels.

The term COLA-f designates two distinct, unrelated computational methods in the current research literature: (1) in cosmological simulations, it refers to the COmoving Lagrangian Acceleration method with scale-dependent growth and approximate modified gravity screening for efficient large-scale structure modeling; and (2) in conformal prediction, it refers to the full-conformal α-allocation variant in predictive inference. The two share only the label: each decomposes or allocates a key resource (particle displacements in one case, miscoverage budget in the other) to maximize accuracy and efficiency under practical constraints.

1. COLA-f in Cosmological Simulations: Scale-Dependent Growth with Screening

COLA-f, as introduced in "COLA with scale-dependent growth: applications to screened modified gravity models" (Winther et al., 2017), is a parallelized code that extends the COmoving Lagrangian Acceleration (COLA) formalism to simulations of cosmic structure formation in modified gravity models exhibiting scale-dependent linear and second-order growth. The primary innovation is the integration of $k$-dependent 1LPT/2LPT displacements with an approximate, efficient screening solver, making precision structure formation calculations tractable even for non-standard gravity scenarios such as $f(R)$ gravity and nDGP.

In the COLA framework, particle positions are split as

$$x(q, \tau) = q + \Psi_{\rm LPT}(q, \tau) + \delta x(q, \tau)$$

where $\Psi_{\rm LPT}$ contains 2nd-order Lagrangian perturbation theory (2LPT) displacements and $\delta x$ is the non-perturbative residual evolved by a low-cost N-body mesh integration. For models with scale-dependent linear growth, both 1LPT and 2LPT displacements are $k$-dependent; the corresponding ODEs are solved on a grid of wavenumbers, and the structure formation is advanced with few time-steps via FFT-based operations. For $f(R)$ gravity, screening enters through a chameleon-type field equation, with a fast screening factor applied to suppress the fifth force in high-density regions.
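To see why the residual is cheap to evolve, it can help to write out what the split implies for the equation of motion. The following is a schematic sketch only: $\mathcal{T}$ is notation assumed here for the linear time-derivative operator of the particle equation of motion (including any Hubble-friction term), and $\Phi$ stands for the total potential sourcing the force. Because the Lagrangian coordinate $q$ is constant in time, $\mathcal{T}[q] = 0$, so

$$\mathcal{T}[x] = -\nabla\Phi \quad\Longrightarrow\quad \mathcal{T}[\delta x] = -\nabla\Phi - \mathcal{T}[\Psi_{\rm LPT}]$$

The LPT term on the right is known analytically from the growth-factor ODEs, so the mesh integration only has to track the residual $\delta x$, which stays small on large scales where LPT is accurate. This is the usual motivation for taking few time steps.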

Numerical validation against full N-body simulations in $f(R)$ and nDGP shows that COLA-f reproduces key observables (the matter power spectrum and halo mass function boosts) at the $1$–$2\%$ level up to $k\sim3\,h/\mathrm{Mpc}$, for $\mathcal{O}(10\text{–}30)$ time steps, yet at $\sim 100\times$ lower computational cost than full-resolution N-body codes. This technical achievement enables routine execution of precision large ensemble simulations required for survey covariance, emulator construction, and mock galaxy catalog generation in modified gravity cosmologies (Winther et al., 2017).

2. Theoretical Foundations: Modified Gravity, LPT, and Screening

COLA-f is constructed for models where the linear growth function is $k$-dependent, typically due to scale-dependent modifications of gravity. Examples include $f(R)$ gravity, characterized in the Hu–Sawicki form by an extra scalaron degree of freedom, whose mass $m(a)$ determines the range of the fifth force:

$$\mu(k,a) = 1 + \frac{1}{3} \frac{k^2}{k^2 + a^2 m^2(a)}$$

Scale dependence propagates into both first- and second-order LPT displacements.
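As a quick check on the limits of $\mu(k,a)$, here is a minimal Python sketch of the formula above. The function name `mu_fR` is hypothetical, and the scalaron mass is passed in as a plain number `m` rather than computed from a specific Hu–Sawicki parameterization, which is an assumption made to keep the example self-contained.

```python
def mu_fR(k, a, m):
    """Effective coupling mu(k, a) = 1 + (1/3) * k^2 / (k^2 + a^2 m^2).

    k : comoving wavenumber
    a : scale factor
    m : scalaron mass at this scale factor (same units as k/a); supplied
        by the caller here, an assumption for illustration only
    """
    k2 = k * k
    return 1.0 + k2 / (3.0 * (k2 + (a * m) ** 2))


# Large scales (k << a*m): the fifth force is Yukawa-suppressed, mu -> 1.
print(mu_fR(k=1e-4, a=1.0, m=1.0))
# Small scales (k >> a*m): gravity is enhanced by a factor 4/3.
print(mu_fR(k=1e4, a=1.0, m=1.0))
```

The two limits, $\mu \to 1$ for $k \ll a\,m$ and $\mu \to 4/3$ for $k \gg a\,m$, follow directly from the formula and are what makes the linear growth scale-dependent.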

For $f(R)$, the scalaron field equation and screening are handled by the approximation:

$$\nabla_x^2 \phi = a^2 m^2(a)\, \phi + \frac{1}{3} \kappa\, \delta \cdot \epsilon_{\rm screen}(\Phi_N)$$

with $\epsilon_{\rm screen}(\Phi_N) = \min \left[1,\; \left|\frac{3 f_R(a)}{2 \Phi_N}\right| \right]$, yielding rapid suppression of the modified force where Newtonian potentials are deep, without requiring a full nonlinear multigrid solution.
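The screening factor is a pointwise function of the local Newtonian potential, which is why it is cheap to apply. A minimal Python sketch of $\epsilon_{\rm screen}$ as written above follows; the function name `screening_factor` and the handling of $\Phi_N = 0$ (treated as unscreened) are assumptions of this example, not details taken from the COLA-f code.

```python
def screening_factor(f_R, phi_N):
    """epsilon_screen = min(1, |3 f_R / (2 Phi_N)|).

    f_R   : background scalaron value f_R(a) (dimensionless)
    phi_N : local Newtonian potential in the same dimensionless units
    """
    if phi_N == 0.0:
        # Assumption: a vanishing potential means no screening at all.
        return 1.0
    return min(1.0, abs(3.0 * f_R / (2.0 * phi_N)))


# Shallow potential compared with |f_R|: factor saturates at 1 (unscreened).
shallow = screening_factor(f_R=-1e-5, phi_N=-1e-6)
# Deep potential compared with |f_R|: the source term is suppressed.
deep = screening_factor(f_R=-1e-6, phi_N=-1e-5)
print(shallow, deep)
```

In a mesh code this would be evaluated cell by cell on the potential grid before the scalar-field solve, multiplying the density source term.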

To render the approach efficient, COLA-f replaces the full 3D integrals of 2LPT with an ansatz for the 2LPT kernel and computes all displacements on FFT grids. As a result, most computational effort is shifted to FFTs, domain decompositions, and local PM operations (Winther et al., 2017).

3. Numerical Scheme, Scalability, and Performance

The COLA-f algorithm is designed for distributed parallelism, dividing the computational domain using MPI. Each time-step involves:

  • Solving ODEs for $D_1(k,\tau)$ and the approximate $\hat{D}_2(k,\tau)$,
  • FFT operations for displacement fields,
  • Density assignment and FFT-based gravity solves,
  • Application of the screening factor,
  • Leapfrog integration of the residual displacement and velocity.
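The first step in the list above, solving a growth ODE separately for each wavenumber, can be sketched in a few lines of Python. This is a toy illustration under loudly stated assumptions, not the COLA-f implementation: it uses the standard linear growth equation in $x = \ln a$ on an Einstein–de Sitter background (so $\Omega_m = 1$ and $d\ln H/d\ln a = -3/2$), a hand-written RK4 integrator, and a caller-supplied coupling `mu(k, a)`. The names `growth_factor` and `mu_toy` are hypothetical, and the constant scalaron mass in `mu_toy` is chosen only to make the scale dependence visible.

```python
import math


def growth_factor(k, mu, a_ini=1e-3, a_end=1.0, n_steps=2000):
    """Integrate D'' + 0.5 D' - 1.5 mu(k, a) D = 0 in x = ln(a) with RK4.

    Assumes an Einstein-de Sitter background (toy choice). Initial
    conditions D = a, dD/dx = a match the GR growing mode D ~ a.
    """
    def rhs(x, D, dD):
        a = math.exp(x)
        return dD, -0.5 * dD + 1.5 * mu(k, a) * D

    x = math.log(a_ini)
    h = (math.log(a_end) - x) / n_steps
    D, dD = a_ini, a_ini
    for _ in range(n_steps):
        k1D, k1v = rhs(x, D, dD)
        k2D, k2v = rhs(x + 0.5 * h, D + 0.5 * h * k1D, dD + 0.5 * h * k1v)
        k3D, k3v = rhs(x + 0.5 * h, D + 0.5 * h * k2D, dD + 0.5 * h * k2v)
        k4D, k4v = rhs(x + h, D + h * k3D, dD + h * k3v)
        D += h * (k1D + 2 * k2D + 2 * k3D + k4D) / 6.0
        dD += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        x += h
    return D


def mu_toy(k, a, m=1.0):
    """Toy coupling with a constant scalaron mass m (illustrative only)."""
    return 1.0 + k * k / (3.0 * (k * k + (a * m) ** 2))


# One ODE solve per wavenumber on a small grid of k values.
k_grid = [0.01, 0.1, 1.0, 10.0]
D1 = {k: growth_factor(k, mu_toy) for k in k_grid}
print(D1)
```

With `mu` fixed to 1 the integrator recovers the GR result $D = a$; with the toy coupling, modes with larger $k$ grow faster, which is the scale dependence the $k$-space displacement arrays have to carry.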

Additional memory overhead arises from storing multiple $k$-space displacement arrays and per-particle LPT derivatives. Particle exchanges occur when Lagrangian displacements traverse subdomain boundaries, tracked via home-CPU IDs and initial coordinates.

Empirically,

  • 10 time steps provide $P(k)$ and halo mass function accuracy within $2$–$5\%$ for $k\lesssim1\,h/\mathrm{Mpc}$.
  • Increasing to 20–30 steps yields percent-level accuracy to $k \sim 3\,h/\mathrm{Mpc}$ and across halo masses $10^{12}$–$10^{15}\,M_\odot/h$.

When compared to full N-body, COLA-f achieves $\sim 100\times$ speed-ups, with a $3$–$4\times$ slowdown when scale-dependent growth and screening are included relative to standard COLA (Winther et al., 2017).

4. Accuracy, Validation, and Scientific Applications

Benchmarking against high-resolution N-body datasets shows:

  • Matter power spectrum $P(k)$ boosts for $f(R)$: accuracy within $1$–$2\%$ up to $k \simeq 3\,h/\mathrm{Mpc}$;
  • Halo mass function ratios within $2\%$ (F5), slightly underestimating for lighter halos in F6 due to screening approximation limitations;
  • Velocity divergence spectra accurate at the $5$–$8\%$ level to $k \sim 2\,h/\mathrm{Mpc}$;
  • For nDGP, $P(k)$ and halo mass function boosts are reproduced within $2\%$.

COLA-f enables the construction of large ensembles of mock galaxy, halo, and dark matter catalogs under $f(R)$ or other scale-dependent gravity, crucial for covariance estimation, emulator calibration, and data analysis for large-scale structure experiments.

5. COLA-f in Conformal Prediction: Full-Conformal Aggregation

An unrelated, independent usage of COLA-f arises in predictive inference, denoting the full-conformal α-allocation method for constructing conformal prediction sets with exact finite-sample coverage (Xu et al., 15 Nov 2025). In this context, given $K$ nonconformity scores and a calibration dataset $\{(X_i, Y_i)\}_{i=1}^n$, COLA-f allocates the total miscoverage $\alpha$ across the $K$ sets so as to minimize average prediction set size, recalibrating the allocation vector $\alpha^{(y)}$ for each possible label $y$ using the augmented sample of $n+1$ points.

The COLA-f set for a new input $X_{n+1}$ is

$$\widehat C^{\mathrm f}(X_{n+1};\alpha) =\left\{ y : y \in \bigcap_{k=1}^K \widehat C_k^{(y)}\left(X_{n+1}; \alpha_k^{(y)}\right) \right\}$$

where each allocation $\alpha^{(y)}$ is chosen to minimize mean set size, treating the candidate label $y$ symmetrically with the observed calibration responses. This exact finite-sample symmetry restores marginal coverage $1-\alpha$, at the expense of $\mathcal{O}(|\mathcal{Y}| \, (\alpha n)^{K-1})$ computational cost, restricting practical use to small $n$ or small $K$.
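The intersection structure of the set above can be illustrated with a deliberately simplified Python sketch. It is not the full-conformal procedure: the allocation is fixed in advance rather than re-optimized for each candidate label on the augmented sample, and each component set uses an ordinary split-conformal threshold. For a fixed allocation with $\sum_k \alpha_k = \alpha$, a union bound gives coverage of at least $1-\alpha$. The names `conformal_quantile` and `intersect_sets` and the input layout are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score.

    Returns +inf when that rank exceeds n (the set is then all labels).
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def intersect_sets(cal_scores, test_scores, allocation):
    """Intersect K conformal sets, set k built at miscoverage allocation[k].

    cal_scores[k]  : calibration scores under nonconformity score k
    test_scores[k] : dict mapping each candidate label to its score k
    allocation     : fixed alpha_k values (an assumption of this sketch)
    """
    labels = set(test_scores[0])
    for scores_k, test_k, alpha_k in zip(cal_scores, test_scores, allocation):
        threshold = conformal_quantile(scores_k, alpha_k)
        labels &= {y for y, s in test_k.items() if s <= threshold}
    return labels


# Two scores (K = 2), ten calibration points each, total alpha = 0.25.
cal = [list(range(1, 11)), list(range(1, 11))]
new_point = [{"a": 3, "b": 11}, {"a": 5, "b": 2}]
print(intersect_sets(cal, new_point, allocation=[0.125, 0.125]))
```

Searching over allocations, and repeating that search for every candidate label, is what the full-conformal variant adds on top of this skeleton and is the source of its computational cost.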

Table: COLA-f Algorithmic and Empirical Characteristics (Xu et al., 15 Nov 2025)

| Feature | Implementation | Empirical Performance |
| --- | --- | --- |
| Coverage Guarantee | Finite-sample marginal ($\ge 1-\alpha$) | Achieved for all $n$ |
| Optimality Objective | Minimize average set size | Shorter sets for small $n$ |
| Computational Scaling | $\mathcal{O}(\lvert\mathcal{Y}\rvert\,(\alpha n)^{K-1})$ | 71–1793 s per test sample for $n=100$–$300$ |
| Recommended Use Case | $n\le200$, $K\le3$ | - |

COLA-f in this sense is most useful where sample size is small and exact validity is required; for large-scale problems, the sample-split (COLA-s) or asymptotic (COLA-e) methods are strongly preferred because COLA-f becomes computationally intractable (Xu et al., 15 Nov 2025).
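To get a feel for how quickly the stated scaling grows, the leading-order term $|\mathcal{Y}|\,(\alpha n)^{K-1}$ can be evaluated directly. The sketch below is only a proxy: it ignores constant factors, the function name `relative_cost` is hypothetical, and the example values ($|\mathcal{Y}| = 10$, $\alpha = 0.1$, $n = 200$) are chosen for illustration and do not come from the cited experiments.

```python
def relative_cost(n_labels, alpha, n, K):
    """Leading-order cost proxy |Y| * (alpha * n)^(K - 1); constants ignored."""
    return n_labels * (alpha * n) ** (K - 1)


# Each additional score multiplies the proxy by alpha * n (20 here).
for K in (2, 3, 4):
    print(K, relative_cost(n_labels=10, alpha=0.1, n=200, K=K))
```

With these values every extra nonconformity score multiplies the proxy by $\alpha n = 20$, and doubling $n$ at $K = 3$ multiplies it by 4, which is consistent with the recommendation to keep both $n$ and $K$ small.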

6. Limitations and Future Prospects

Cosmological COLA-f:

  • The reliance on linear screening and approximate 2LPT kernels implies that the modified gravity (MG) signal in higher-order statistics is underestimated, specifically in the deeply screened regime and in the reduced bispectrum of dark matter (Fiorini et al., 2022).
  • The method does not resolve non-spherical screening or small-scale halo substructure, requiring external calibrations or empirical fits for high-fidelity galaxy/halo population models (Fiorini et al., 2021).
  • Pushing to strongly non-linear or baryon-dominated ($k\gtrsim 1\,h/\mathrm{Mpc}$) scales requires higher mesh resolution, symplectic integrators, or field-level emulation approaches.

Conformal Prediction COLA-f:

  • Exact finite-sample guarantees come at steep computational cost, limiting feasibility to classification problems with few labels or regression problems with coarse label grids and small calibration sets.
  • No non-asymptotic efficiency bounds are currently available; theoretical and algorithmic advances for more efficient full-conformal set aggregation remain open (Xu et al., 15 Nov 2025).

7. Summary of Impact and Usage

In cosmology, COLA-f has enabled percent-level simulation of structure formation in modified gravity models at orders-of-magnitude reduced computational cost, directly supporting mock catalog production, power-spectrum emulation, and forecast analyses for Stage IV galaxy surveys (Winther et al., 2017). In statistics, COLA-f offers a conceptually optimal but computationally intensive solution for combining multiple scoring rules in predictive inference, providing exact marginal coverage (Xu et al., 15 Nov 2025). Future work in both fields aims to relax computational constraints while retaining optimality or accuracy guarantees.
