Data fusion of complementary data sources using Machine Learning enables higher accuracy Solar Resource Maps (2501.04381v2)
Abstract: We collect solar irradiance and atmospheric condition data from several products, derived from both numerical models (ERA5 and NORA3) and satellite observations (CMSAF-SARAH3). We then train simple supervised Machine Learning (ML) data fusion models, using these products as predictors and direct in-situ Global Horizontal Irradiance (GHI) measurements over Norway as ground truth. We show that combining these products with our trained ML models yields a GHI estimate that is significantly more accurate than the estimate from any individual product. Using the trained models, we generate a 30-year ML-corrected map of GHI over Norway, which we release as a new open data product. Our ML-based data fusion methodology could be applied to any geographic area on Earth, given suitable training and input data selection.
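The fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or data: it assumes ordinary least squares as a stand-in for the "simple supervised ML" model, and it uses purely synthetic samples in which three hypothetical products (loosely standing in for ERA5, NORA3 and SARAH3) are noisy, biased versions of a synthetic ground-truth GHI. All function names, bias values and noise levels below are illustrative assumptions.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_fusion(predictors, target):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, predictors):
    """Apply the fitted intercept and weights to each sample."""
    return [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], p)) for p in predictors]


def rmse(estimate, truth):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


def make_synthetic_samples(n, rng):
    """Synthetic GHI (W/m^2) and three hypothetical products with made-up errors."""
    truth, products = [], []
    for _ in range(n):
        ghi = rng.uniform(0.0, 800.0)
        truth.append(ghi)
        products.append((
            0.9 * ghi + 30.0 + rng.gauss(0.0, 60.0),  # hypothetical model product A
            1.1 * ghi - 20.0 + rng.gauss(0.0, 50.0),  # hypothetical model product B
            ghi + rng.gauss(0.0, 40.0),               # hypothetical satellite product
        ))
    return products, truth


rng = random.Random(0)
train_x, train_y = make_synthetic_samples(2000, rng)
test_x, test_y = make_synthetic_samples(500, rng)

# Train on one set of "stations", evaluate on held-out samples.
coeffs = fit_linear_fusion(train_x, train_y)
fused_rmse = rmse(predict(coeffs, test_x), test_y)
single_rmse = [rmse([p[i] for p in test_x], test_y) for i in range(3)]

print("fused RMSE:", round(fused_rmse, 1))
print("single-product RMSE:", [round(v, 1) for v in single_rmse])
```

Because the synthetic products carry independent errors, the fitted combination both removes each product's bias and averages down the noise, so the held-out error of the fused estimate falls below that of the best single product. With real products the errors are correlated and the gain depends on how complementary the sources are.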