Classification of Fermi-LAT unidentified gamma-ray sources using CatBoost gradient boosting decision trees
Abstract: The latest $\textit{Fermi}$-LAT gamma-ray catalog, 4FGL-DR3, contains a large fraction of sources without a clear association to known counterparts, i.e., unidentified sources (unIDs). In this paper, we aim to classify them using machine learning algorithms trained on the spectral characteristics of associated sources, and then use the trained models to predict the classes of the unID population. With the state-of-the-art $\texttt{CatBoost}$ algorithm, based on gradient boosting decision trees, we reach 67% accuracy on a 23-class dataset. Removing a single class, blazars of uncertain type, increases the accuracy to 81%. For a binary AGN/pulsar distinction only, the model accuracy rises to 99%. Additionally, we perform an unsupervised search across both the known and unID populations to estimate the number of clusters of similar sources, without prior knowledge of their classes. The full code used to perform all calculations is provided as an interactive Python notebook.
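To make the phrase "gradient boosting decision trees" concrete, the following is a minimal sketch of the general technique for a binary classification task, written in pure Python. It is not the authors' notebook and it does not use $\texttt{CatBoost}$: it boosts depth-1 trees (stumps) on the gradient of the log-loss. The two features (a spectral index and a curvature parameter), their numerical values, and the class labels 0 ("AGN-like") and 1 ("pulsar-like") are hypothetical toy data invented for illustration, not 4FGL-DR3 values.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return the single-feature split that minimises squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)  # midpoint between two distinct feature values
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = sum((r - lmean) ** 2 for r in left)
            sse += sum((r - rmean) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def fit_boosted_stumps(X, y, n_rounds=15, learning_rate=0.3):
    """Gradient boosting for log-loss: each stump fits the residuals y - p."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log-loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lmean, rmean = fit_stump(X, residuals)
        # Store leaf values already scaled by the learning rate.
        stump = (j, thr, learning_rate * lmean, learning_rate * rmean)
        stumps.append(stump)
        scores = [s + (stump[2] if row[j] <= thr else stump[3])
                  for row, s in zip(X, scores)]
    return stumps


def predict(stumps, row):
    """Sum the stump outputs and threshold the raw score at zero."""
    score = sum(lval if row[j] <= thr else rval for j, thr, lval, rval in stumps)
    return 1 if score > 0 else 0


# Hypothetical toy data: [spectral index, curvature] drawn from two Gaussian blobs.
rng = random.Random(0)
X, y = [], []
for _ in range(60):
    X.append([rng.gauss(2.2, 0.2), rng.gauss(0.05, 0.1)])  # class 0, "AGN-like"
    y.append(0)
    X.append([rng.gauss(1.5, 0.2), rng.gauss(0.6, 0.1)])   # class 1, "pulsar-like"
    y.append(1)

model = fit_boosted_stumps(X, y)
accuracy = sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

The accuracy printed here is measured on the same toy sample used for training, so it says nothing about the performance figures quoted in the abstract. Production libraries such as $\texttt{CatBoost}$ add deeper trees, regularisation, multi-class losses, and held-out evaluation on top of this basic loop.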