Radio Galaxy Foundation Model

Updated 22 September 2025

Radio Galaxy Foundation Model is a comprehensive framework integrating physical models, semi-analytic simulations, and deep learning techniques to analyze radio galaxy dynamics and morphology.
It employs self-supervised neural networks and simulation-based prescriptions to accurately capture jet dynamics, environmental interactions, and radiative evolution.
The model advances astrophysical research by unifying theoretical, computational, and statistical methods to enable scalable, automated classification in large survey data.

The Radio Galaxy Foundation Model encompasses the theoretical, empirical, and machine learning frameworks developed for the rigorous analysis, simulation, and automated classification of radio galaxies. Current research spans traditional physics-based dynamical models, semi-analytic simulation frameworks, and deep neural network foundation models pre-trained with self-supervised algorithms, all aimed at modeling and interpreting the diverse morphologies and evolution of radio galaxies in wide-area surveys. The model underpins both the astrophysical understanding of radio galaxy formation, environmental drivers, and feedback mechanisms and the construction of scalable computational tools for large-scale astronomical data processing.

1. Physical Frameworks for Radio Galaxy Dynamics and Morphology

Classic physical models capture the evolution of radio galaxy structures based on jet dynamics, accretion state transitions, environmental influences, and episodic feedback. Hydrodynamic and magnetohydrodynamic models (e.g., Hodges-Kluck et al., 2011; Lalakos et al., 2022) simulate the propagation of relativistic jets from an AGN into an external medium that may be anisotropic (elliptical) and inhomogeneous. These simulations solve the conservation laws for mass, momentum, and energy to follow lobe expansion, back-flow dynamics, and pressure-driven channel formation. For example, in hydrodynamic models:

The radio lobe/wing expansion is governed by pressure gradients in the ambient atmosphere:

$\frac{\partial p}{\partial x} = -\frac{3}{2} \frac{c_s^2}{\gamma} \rho_0 r_{0,\text{eff}}^{3/2} \frac{x}{(x^2 + r_{0,\text{eff}}^2)^{7/4}}$

where $r_{0,\text{eff}}$ is the effective core radius (a function of ellipticity), $c_s$ is the sound speed, $\gamma$ is the adiabatic index, and $\rho_0$ is the central density.

Jet intermittency and decaying jet power (e.g., $v_\text{jet} \sim 100c_s e^{-3t}$) can stall the lobe head, allowing buoyant wing expansion and naturally producing X-shaped morphologies in dense, highly elliptical environments.
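The pressure-gradient expression above is what one obtains by differentiating an isothermal pressure profile $p = (c_s^2/\gamma)\,\rho_0\,(1 + x^2/r_{0,\text{eff}}^2)^{-3/4}$. That density profile is an inference from the formula rather than something stated in the text, so the following is only a consistency sketch: it compares the analytic gradient with a finite-difference derivative of the assumed profile, using illustrative dimensionless values.

```python
import numpy as np

def pressure(x, rho0, r0_eff, cs, gamma=5.0 / 3.0):
    """Isothermal pressure for the ASSUMED profile rho = rho0 * (1 + x^2/r0^2)^(-3/4)."""
    return cs**2 / gamma * rho0 * (1.0 + (x / r0_eff) ** 2) ** (-0.75)

def pressure_gradient(x, rho0, r0_eff, cs, gamma=5.0 / 3.0):
    """Analytic dp/dx, term by term as written in the text."""
    return (-1.5 * cs**2 / gamma * rho0 * r0_eff**1.5
            * x / (x**2 + r0_eff**2) ** 1.75)

# Illustrative, dimensionless values (not taken from the cited simulations).
rho0, r0_eff, cs = 1.0, 2.0, 1.0
x = np.linspace(0.1, 10.0, 50)

# Central finite difference of the assumed pressure profile.
h = 1e-5
numeric = (pressure(x + h, rho0, r0_eff, cs)
           - pressure(x - h, rho0, r0_eff, cs)) / (2.0 * h)
analytic = pressure_gradient(x, rho0, r0_eff, cs)

max_abs_diff = float(np.max(np.abs(numeric - analytic)))
print(f"max |numeric - analytic| = {max_abs_diff:.2e}")
```

The gradient is negative for all $x > 0$, so the pressure always falls outward along the axis.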

Alternative dynamical models consider slingshot mechanisms (Muthumeenal et al., 2010), in which gravitational interactions in the AGN promote the ejection of massive objects that trace orbits described by:

$\frac{r}{r_0} = \sqrt{2.89 - 2.4\frac{t}{t_0} - \left(\frac{t}{t_0}\right)^2} - 0.7$

providing a deterministic relation between ejection properties and observable structures.
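As a quick arithmetic check, the relation above can be evaluated directly. The sketch below only evaluates the expression as written; the physical interpretation of $r_0$ and $t_0$ is left to the cited work, and the sampled time grid is an arbitrary choice.

```python
import numpy as np

def slingshot_radius(tau):
    """r/r0 as a function of tau = t/t0, as written in the text."""
    tau = np.asarray(tau, dtype=float)
    return np.sqrt(2.89 - 2.4 * tau - tau**2) - 0.7

# r/r0 = 0 when 2.89 - 2.4*tau - tau^2 = 0.49, i.e. tau^2 + 2.4*tau - 2.4 = 0.
tau_zero = (-2.4 + np.sqrt(2.4**2 + 4.0 * 2.4)) / 2.0

# Sample the expression between tau = 0 and the zero crossing.
taus = np.linspace(0.0, tau_zero, 5)
radii = slingshot_radius(taus)
for tau, r in zip(taus, radii):
    print(f"t/t0 = {tau:.3f}  ->  r/r0 = {r:.3f}")
```

The expression gives $r = r_0$ at $t = 0$ and decreases monotonically to zero at $t/t_0 \approx 0.76$.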

These frameworks unify many observed features (double-lobed structures, wings, morphological asymmetries) as emergent outcomes of jet-environment interplay.

2. Semi-Analytic Simulation-Based Models

The simulation-based analytic model (Hardcastle, 2018) offers a semi-analytic prescription for the dynamical and radiative evolution of powerful FRII-type radio galaxies, incorporating environmental effects and physical parameters derived from hydrodynamic simulations. The model eschews simplifying assumptions of self-similar growth, instead utilizing coupled ODEs for lobe expansion and energetics:

Internal lobe pressure: $p = \frac{(1+\zeta+\kappa)U_e}{3}$, where $U_e$ is the electron energy density.
Magnetic field strength: $B = \sqrt{\frac{6\mu_0 p\zeta}{1+\zeta+\kappa}}$, where $\zeta$ is the ratio of magnetic to electron energy density (so that $B^2/2\mu_0 = \zeta U_e$) and $\kappa$ is the ratio of non-radiating particle to electron energy density.
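Combining the two relations above gives $B^2/(2\mu_0) = \zeta U_e$. The following minimal sketch evaluates both expressions in SI units and checks that identity numerically. The function names and the input values are illustrative assumptions and are not taken from the public implementation.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in SI units, to good approximation

def lobe_pressure(u_e, zeta, kappa):
    """p = (1 + zeta + kappa) * U_e / 3, with U_e in J/m^3."""
    return (1.0 + zeta + kappa) * u_e / 3.0

def lobe_b_field(p, zeta, kappa):
    """B = sqrt(2 * mu_0 * 3 * p * zeta / (1 + zeta + kappa)), in tesla."""
    return math.sqrt(2.0 * MU_0 * 3.0 * p * zeta / (1.0 + zeta + kappa))

# Illustrative inputs only (hypothetical values, not calibrated to any source).
u_e = 1.0e-12           # electron energy density, J/m^3
zeta, kappa = 0.1, 0.0  # energy-density ratios relative to the electrons

p = lobe_pressure(u_e, zeta, kappa)
b = lobe_b_field(p, zeta, kappa)
u_b = b**2 / (2.0 * MU_0)  # magnetic energy density implied by B

print(f"p = {p:.3e} Pa, B = {b:.3e} T, U_B / U_e = {u_b / u_e:.3f}")
```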

Radiative losses (synchrotron, inverse-Compton) and the passage to "remnant" stages post-jet switch-off yield predictions for spectral index curvature, luminosity evolution, and jet power-radio luminosity correlations. The model is calibrated against simulation results and observational surveys, showing deviations of ~20% in key physical quantities late in evolution. The public Python implementation (https://github.com/mhardcastle/analytic) enables population synthesis and direct comparison with large survey data.

3. Environmental Drivers and Statistical Foundations

Environmental properties play a key role in regulating radio galaxy morphology, as established by statistical studies correlating radio morphology with cluster richness, dynamical states, and redshift distributions (Wing et al., 2010). Cluster environments, characterized by a high richness parameter $N_{1.0}^{-19}$, preferentially harbor FRI sources and bent-double morphologies:

Bent sources serve as efficient tracers of rich environments (~78% in clusters or groups).
Richness estimation utilizes background-corrected counts and a Schechter function correction for incompleteness:

$f_c = \frac{\phi(M_{r}=-19)}{\phi(M_{r,\text{lim}})}$
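To illustrate the correction factor, the sketch below evaluates $f_c$ under the assumption that $\phi$ is the differential Schechter function in magnitude form. The values of $M^*$ and $\alpha$ are placeholders rather than the parameters of the cited study. If the source intends cumulative counts, an integral of the same function would replace the point evaluations.

```python
import math

def schechter_mag(M, M_star, alpha, phi_star=1.0):
    """Differential Schechter function per unit absolute magnitude."""
    x = 10.0 ** (0.4 * (M_star - M))  # luminosity in units of L*
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def completeness_correction(M_lim, M_star, alpha, M_ref=-19.0):
    """f_c = phi(M_ref) / phi(M_lim); phi_star cancels in the ratio."""
    return schechter_mag(M_ref, M_star, alpha) / schechter_mag(M_lim, M_star, alpha)

# Placeholder Schechter parameters (hypothetical, for illustration only).
M_STAR, ALPHA = -21.0, -1.0

f_c_complete = completeness_correction(-19.0, M_STAR, ALPHA)  # limit reaches M_r = -19
f_c_shallow = completeness_correction(-20.0, M_STAR, ALPHA)   # limit one magnitude brighter
print(f"f_c at M_lim = -19: {f_c_complete:.3f}")
print(f"f_c at M_lim = -20: {f_c_shallow:.3f}")
```

With these placeholder parameters the factor equals one when the survey limit reaches $M_r = -19$ and exceeds one when the limit is brighter.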

Redshift trends indicate increasing prevalence of cluster-associated radio sources at $z \gtrsim 0.5$. Statistical matching and background subtraction techniques underpin robust identification of cluster associations, despite selection and projection effects. These environmental parameters, along with intrinsic AGN properties (jet power, accretion mode), must be integrated for any foundational model of radio galaxy formation.

4. Machine Learning Foundation Models for Automated Morphology Classification

Deep learning-based morphological classification leverages large foundation models, pre-trained on unlabeled radio images and fine-tuned with labeled subsets (Slijepcevic et al., 2023, Buatthaisong et al., 15 Sep 2025, Lastufka et al., 17 Sep 2024). Methodologies center on self-supervised learning paradigms such as Bootstrap Your Own Latent (BYOL), in which an online network and a slowly updated target network learn latent representations that are invariant to augmentations of the same image.

Backbone architectures are typically deep CNNs (e.g., ResNet variants) or Vision Transformers; projection and prediction heads enable latent feature compression.
Classification heads (MLP, linear classifiers, or whitening layers) are appended for supervised fine-tuning on small, expert-labeled datasets.
Reported results include test error rates half those of supervised baselines in label-scarce regimes and robust generalization across surveys and datasets (MIGHTEE, RGZ, MiraBest).
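The core of the BYOL objective mentioned above can be sketched in a few lines of NumPy. This is a simplified illustration, not the training code of any cited work: the arrays stand in for network outputs, the function names are hypothetical, and the decay rate `tau` is an assumed value. A full implementation would symmetrize the loss over the two augmented views and backpropagate only through the online network.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale each row to unit length."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def byol_loss(online_pred, target_proj):
    """Mean over the batch of 2 - 2*cos(online prediction, target projection)."""
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)  # treated as a constant (stop-gradient) in training
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1)))

def ema_update(target_params, online_params, tau=0.99):
    """Move target weights slowly toward the online weights (exponential moving average)."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]

# Stand-in outputs for a batch of 4 images with 8-dimensional latents.
rng = np.random.default_rng(0)
pred = rng.normal(size=(4, 8))

loss_same = byol_loss(pred, pred)       # identical directions
loss_opposite = byol_loss(pred, -pred)  # opposite directions
new_target = ema_update([np.zeros(3)], [np.ones(3)], tau=0.99)

print(f"loss (aligned)  = {loss_same:.3e}")
print(f"loss (opposite) = {loss_opposite:.3f}")
```

The loss ranges from 0 for perfectly aligned representations to 4 for opposite ones, so minimizing it pulls the two views of the same source together in latent space.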

The learned latent spaces can be visualized (PCA/UMAP), revealing clustering by angular source extent and morphological type, which has enabled similarity search and identification of rare or hybrid morphologies (Walmsley et al., 2023). Measures such as the vote fraction (VF) expose ambiguity in classification, especially where the classical FRI and FRII classes overlap.
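Similarity search in a learned latent space can be as simple as ranking embeddings by cosine similarity to a query. The sketch below uses synthetic embeddings with two artificial clusters. The function name, the dimensionality, and the data are assumptions for illustration, and real embeddings would come from a pre-trained encoder.

```python
import numpy as np

def top_k_similar(embeddings, query_index, k=3):
    """Return indices and cosine similarities of the k nearest neighbours of one row."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit[query_index]
    sims[query_index] = -np.inf  # exclude the query itself
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Synthetic latent vectors: rows 0-4 cluster around one direction, rows 5-9 around another.
rng = np.random.default_rng(0)
dim = 16
center_a = np.zeros(dim)
center_a[0] = 1.0
center_b = np.zeros(dim)
center_b[1] = 1.0
embeddings = np.vstack([
    center_a + 0.05 * rng.normal(size=(5, dim)),
    center_b + 0.05 * rng.normal(size=(5, dim)),
])

neighbours, scores = top_k_similar(embeddings, query_index=0, k=3)
print("nearest neighbours of source 0:", neighbours.tolist())
```

For large catalogues an approximate nearest-neighbour index would usually replace the dense matrix product, but the ranking principle is the same.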

5. Model Limitations, Sensitivity to Data Selection, and Interpretability

Research demonstrates that classification confidence and latent feature geometry are sensitive to foundational dataset choices and downstream fine-tuning (Buatthaisong et al., 15 Sep 2025, Tang et al., 2023). Ambiguity in overlapping luminosity-size regions leads to decreased confidence, and selection of unresolved/borderline sources in pre-training can propagate uncertainty. While pre-training data variety does not (within tested regimes) introduce systematic generalization biases, fine-tuning remains highly sensitive to label quality and distributional alignment.

Recent work has begun to address interpretability in deep models by employing explanation techniques such as LIME (Tang et al., 2023), which analyze the local impact of features for each classification decision. This allows for cross-validation against expert morphological criteria and opens avenues for hybrid machine-astronomer workflows.

6. Integration with Physical and Empirical Galaxy Models

Contemporary foundation models facilitate integration with multiwavelength empirical frameworks (Gao et al., 12 Dec 2024), assigning radio emission properties to synthetic galaxy populations using probabilistic mappings from SFRs, stellar masses, and AGN occurrence rates:

For SFGs, IR–radio luminosity ratios parameterize the assignment;
For AGNs, probability functions $p(L_{1.4\text{GHz}} \mid M_\star, z)$ modulate the host activity fraction.
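For the star-forming case, a minimal sketch of assigning a radio luminosity from an infrared luminosity is given below. It assumes the conventional definition $q = \log_{10}(L_\text{IR} / 3.75\times10^{12}\,\text{W}) - \log_{10}(L_{1.4\text{GHz}} / \text{W Hz}^{-1})$ and a fixed placeholder value of $q$. The cited framework may use a different parameterization (for example one that depends on mass or redshift, or includes scatter), so the names and numbers here are illustrative only.

```python
import math

NORM_FREQ_HZ = 3.75e12  # normalising frequency in the conventional q definition
L_SUN_W = 3.828e26      # nominal solar luminosity in watts

def radio_luminosity_from_ir(l_ir_watt, q_ir):
    """Invert q = log10(L_IR / 3.75e12 W) - log10(L_1.4GHz / (W/Hz))."""
    return l_ir_watt / (NORM_FREQ_HZ * 10.0 ** q_ir)

def q_from_luminosities(l_ir_watt, l_radio_w_hz):
    """Forward definition of q, used here as a round-trip check."""
    return math.log10(l_ir_watt / NORM_FREQ_HZ) - math.log10(l_radio_w_hz)

# Placeholder inputs: a 1e11 L_sun infrared luminosity and an assumed q value.
l_ir = 1.0e11 * L_SUN_W
q_assumed = 2.6

l_radio = radio_luminosity_from_ir(l_ir, q_assumed)
q_back = q_from_luminosities(l_ir, l_radio)
print(f"L_1.4GHz = {l_radio:.3e} W/Hz (q recovered = {q_back:.3f})")
```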

This enables the recovery of observational constraints (radio luminosity functions, source counts) and direct mapping between radio properties and physical galaxy parameters, supporting survey design and population studies in the SKA era.

7. Outstanding Questions and Future Directions

The Radio Galaxy Foundation Model, spanning analytic, simulation, and deep learning domains, continues to evolve. Key outstanding issues include:

Reconciling physical models (e.g., slingshot, hydrodynamic, MAD state) with population synthesis and machine learning–based morphology clustering in large surveys.
Quantifying the effects of data selection, target domain shift, and augmentations in radio-specific deep learning pipelines (Lastufka et al., 17 Sep 2024).
Extending automated classification to probabilistic or continuous morphologies (superseding the simple FRI/FRII dichotomy).
Integrating interpretable AI frameworks for transparent, scientifically valid model predictions at scale.

These avenues are critical for interpreting the diversity of radio-loud AGN, constraining feedback mechanisms, and constructing unified models of galaxy evolution across cosmic time.