- The paper introduces an unsupervised k-means clustering approach that identifies coronal holes on full-disk AIA/SDO images with high median IoU and TSS scores.
- The study leverages a composite configuration (2CC using AIA 193 and 211 channels) to enhance detection precision and maintain consistency across varying solar cycles.
- The paper validates results using pixel-wise evaluation metrics against established databases, establishing a scalable benchmark for operational space weather forecasting.
Identification of Coronal Holes on AIA/SDO Images Using Unsupervised Machine Learning
Introduction
The paper identifies coronal holes (CHs) on the Sun with an unsupervised machine learning approach, the k-means clustering algorithm. Coronal holes are significant solar features because they are the source of high-speed solar wind streams that affect space weather near Earth. Traditional CH detection relies on manual identification or simplistic algorithms, which do not fully capture the dynamic nature of CHs and may lack precision across varying solar conditions. Using images from the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory (SDO), the paper applies pixel-wise k-means clustering to images taken at 171 Å, 193 Å, and 211 Å.
Methodology
Data Preprocessing
The paper uses full-disk solar images from AIA/SDO with an image size of 4096 × 4096 pixels (Figure 1).
Figure 1: Passband images of the Sun in 171 Å (left panel), 193 Å (middle panel), and 211 Å (right panel) taken by AIA/SDO on 8 December 2016 at 00:00 UT.
These images, each centered on a specific spectral emission, were preprocessed with degradation and pointing corrections, spatial alignment, and normalization to counts per pixel per second. A bimodal Gaussian fit to the intensity histogram of each dataset then identified the intensity thresholds used to enhance image contrast (Figure 2).
Figure 2: Probability densities of AIA/SDO 171 Å (top panel), 193 Å (middle panel), and 211 Å (bottom panel) intensities with calculated threshold values indicated.
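The thresholding step can be illustrated with a minimal sketch. It fits a two-component Gaussian mixture to a set of intensities with a few EM iterations and takes the minimum of the fitted density between the two means as the threshold. The function names, the EM fitting procedure, and the "valley" rule are assumptions made for illustration; the summary does not state how the paper performs the fit or derives the threshold from it.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=200, tol=1e-9):
    """Fit a two-component 1-D Gaussian mixture with EM (NumPy only).

    Hypothetical helper: the paper's actual fitting routine is not specified.
    Returns (weights, means, sigmas), sorted by increasing mean.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Initialise the two components from the lower and upper quartiles.
    mu = np.percentile(x, [25.0, 75.0])
    sigma = np.full(2, x.std() / 2.0 + 1e-12)
    weight = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        dens = weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
        dens /= sigma * np.sqrt(2.0 * np.pi)
        total = dens.sum(axis=1) + 1e-300
        resp = dens / total[:, None]
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-12
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol * abs(ll):
            break
        prev_ll = ll
    order = np.argsort(mu)
    return weight[order], mu[order], sigma[order]

def valley_threshold(weight, mu, sigma, n_grid=2001):
    """Assumed rule: threshold at the mixture-density minimum between the means."""
    grid = np.linspace(mu[0], mu[1], n_grid)
    dens = weight * np.exp(-0.5 * ((grid[:, None] - mu) / sigma) ** 2)
    dens /= sigma * np.sqrt(2.0 * np.pi)
    return float(grid[np.argmin(dens.sum(axis=1))])
```

For a full 4096 × 4096 image it is usually cheaper to fit on a random subsample of the on-disk pixels; whether the intensities should be log-scaled first is left open here, since the summary does not say.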
Clustering Analysis
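The k-means clustering technique was applied to the segmented datasets: AIA 193, AIA 211, and the composite images 2CC (combining the 193 and 211 Å channels) and 3CC. The number of clusters was determined with the scree-plot method, which looks for the "elbow" where adding another cluster no longer reduces the within-cluster sum of squared distances appreciably. The optimal number was found to be three, representing dark (CHs), bright (active regions), and intermediate (quiet Sun) regions (Figure 3).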
Figure 3: Sum of squared distances (SSD) as a function of the number of clusters for the 193 Å data, indicating the optimal number of clusters.
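The clustering and scree-plot steps can be sketched as follows. This is a plain NumPy implementation of Lloyd's k-means with k-means++ style seeding and a few restarts, returning the SSD needed for a scree plot. It is not the paper's code: the function names, the seeding scheme, the restart count, and the rule "the CH cluster is the one with the lowest mean intensity" are assumptions that follow from the description of the dark cluster above.

```python
import numpy as np

def _seed_centers(x, k, rng):
    """k-means++ style seeding: spread the initial centres apart."""
    centers = [x[rng.integers(len(x))]]
    for _ in range(1, k):
        d2 = ((x[:, None, :] - np.array(centers)[None]) ** 2).sum(axis=2).min(axis=1)
        centers.append(x[rng.choice(len(x), p=d2 / d2.sum())])
    return np.array(centers)

def kmeans(features, k, n_init=3, n_iter=100, seed=0):
    """Lloyd's k-means; returns (labels, centers, ssd) of the best restart."""
    x = np.asarray(features, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # single channel: one feature per pixel
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        centers = _seed_centers(x, k, rng)
        for _ in range(n_iter):
            # Assign every pixel to its nearest centre.
            d2 = ((x[:, None, :] - centers[None]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Move each centre to the mean of its pixels (keep empty ones fixed).
            updated = np.array([
                x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(updated, centers):
                break
            centers = updated
        d2 = ((x[:, None, :] - centers[None]) ** 2).sum(axis=2)
        ssd = float(d2.min(axis=1).sum())  # within-cluster sum of squared distances
        if best is None or ssd < best[2]:
            best = (d2.argmin(axis=1), centers, ssd)
    return best

def darkest_cluster_mask(labels, centers):
    """Boolean mask of pixels in the cluster with the lowest mean intensity."""
    return labels == int(np.argmin(centers.sum(axis=1)))
```

For a 2CC-style input one might stack the two channels as `np.column_stack([img193.ravel(), img211.ravel()])` (the array names are hypothetical), evaluate `kmeans(features, k)[2]` for k = 1 to 6 to draw the scree plot, and then reshape `darkest_cluster_mask(...)` for k = 3 back to the image shape.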
Evaluation Metrics
The clustering results were validated with pixel-wise evaluation metrics, namely the Intersection over Union (IoU) and the True Skill Statistic (TSS), calculated against reference binary maps from the CATCH database (Figure 4).
Figure 4: IoU and TSS distributions illustrating performance of various configurations compared to CATCH and HEK datasets.
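For reference, both metrics follow from the pixel-wise confusion counts between a predicted and a reference binary map: IoU = TP / (TP + FP + FN) and TSS = TP / (TP + FN) - FP / (FP + TN). The sketch below uses these standard definitions; the function name and the optional `disk_mask` argument, which restricts the comparison to on-disk pixels, are assumptions and not part of the paper as summarized here.

```python
import numpy as np

def iou_and_tss(pred, ref, disk_mask=None):
    """Pixel-wise IoU and TSS between two binary coronal-hole maps.

    pred, ref : boolean arrays of equal shape (True = coronal hole).
    disk_mask : optional boolean array; if given, only these pixels count.
                (Hypothetical convenience argument.)
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if disk_mask is not None:
        keep = np.asarray(disk_mask, dtype=bool)
        pred, ref = pred[keep], ref[keep]
    tp = int(np.sum(pred & ref))    # CH in both maps
    fp = int(np.sum(pred & ~ref))   # CH only in the prediction
    fn = int(np.sum(~pred & ref))   # CH only in the reference
    tn = int(np.sum(~pred & ~ref))  # non-CH in both maps
    union = tp + fp + fn
    iou = tp / union if union else float("nan")
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return iou, tpr - fpr
```

IoU ranges from 0 to 1, while TSS ranges from -1 to 1 and is insensitive to the ratio of CH to non-CH pixels, which is why the two are often reported together.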
Temporal Analysis
To assess robustness across solar cycles, a multi-year analysis compared the CH areas computed by the proposed methods, the HEK dataset, and the CATCH maps. The temporal correlations between these area series confirmed that the results are consistent (Figures 5 and 6).
Figure 5: Correlation coefficients of CH areas derived from this paper and HEK against CATCH data.
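The area comparison can be sketched in a few lines. The example computes a CH area as the fraction of on-disk pixels flagged as CH and correlates two area time series with the Pearson coefficient. Both choices are assumptions: the summary does not say whether areas are corrected for projection or which correlation coefficient the paper uses, and the function names are hypothetical.

```python
import numpy as np

def ch_area_fraction(ch_mask, disk_mask):
    """Fraction of on-disk pixels classified as coronal hole.

    No projection (foreshortening) correction is applied in this sketch.
    """
    ch_mask = np.asarray(ch_mask, dtype=bool)
    disk_mask = np.asarray(disk_mask, dtype=bool)
    return float(ch_mask[disk_mask].sum() / disk_mask.sum())

def area_correlation(areas_a, areas_b):
    """Pearson correlation between two CH-area time series of equal length."""
    a = np.asarray(areas_a, dtype=float)
    b = np.asarray(areas_b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])
```

In practice one would compute `ch_area_fraction` for each date and each source (this method, HEK, CATCH) and then call `area_correlation` on the resulting series, pairing each source with CATCH.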
Results
The paper found that the 2CC configuration, a composite of the 193 Å and 211 Å channels, gave the best overall performance in terms of overlap and classification accuracy, with median IoU and TSS values higher than those obtained for the HEK database. This suggests that the unsupervised k-means method provides an efficient and consistent CH detection framework that, over the same period, outperforms more complex CNN-based methods.
Comparison and Consistency
Binary maps from different phases of solar cycle 24 were compared. They show consistent detection under varying solar conditions and confirm that the 2CC method reliably captures the temporal dynamics of CHs (Figures 7 and 8).
Figure 7: CH binary maps across key dates, showcasing consistency between detection methods and time periods.
Figure 8: A sequence of CH binary maps highlighting temporal evolution over November 2015.
Discussion and Conclusion
This research underscores the suitability of k-means clustering for CH detection, emphasizing its adaptability to day-to-day variations in solar images and its robustness relative to traditional visual or CNN-based methods. Creating and using a consensus-driven coronal hole database is crucial for further refinement of predictive solar wind modeling and space weather forecasting. A reliable, agreed-upon "ground truth" for CH boundaries would also make supervised learning models considerably more effective and help connect operational forecasting with scientific analysis. Overall, the paper points to improved automation and precision for operational space weather applications and offers a scalable model that can be integrated into broader heliophysics studies.