- The paper introduces a hybrid framework that integrates 2.9 million observations to deliver operational, uncertainty-aware risk assessments for Arctic permafrost.
- It employs a stacked ensemble combining Random Forest, Histogram Gradient Boosting, and Elastic Net with rigorous spatiotemporal cross-validation for enhanced prediction accuracy.
- The approach blends machine learning with physics-based constraints to generate reliable scenario projections and actionable infrastructure risk classifications.
Hybrid Physics-ML Framework for Pan-Arctic Permafrost Infrastructure Risk at Record 2.9-Million Observation Scale
Introduction and Motivation
Arctic permafrost degradation, driven by rapid regional warming, poses significant risks to infrastructure valued at over \$100 billion across Northern Russia. Traditional physically based permafrost models, while mechanistically grounded, are limited by sparse observations, high computational cost, and lack of uncertainty quantification. Purely data-driven ML approaches, though scalable, often fail to generalize under extrapolative climate scenarios and are prone to overfitting due to spatiotemporal autocorrelation. This work addresses these limitations by developing a hybrid physics-ML framework, leveraging a record-scale dataset of 2.9 million observations from 171,605 locations (2005–2021), to deliver operational, uncertainty-aware risk assessments for Arctic infrastructure.
Figure 1: Dataset Overview. The spatial and temporal coverage of the 2.9M-observation dataset across Arctic Russia.
Dataset and Exploratory Analysis
The dataset integrates annual permafrost fraction estimates with climate reanalysis variables, providing comprehensive coverage from 60° to 82°N and 30° to 180°E. The permafrost fraction distribution is bimodal, with peaks near 0% (southern margins) and 100% (continuous permafrost), and a long positive tail reflecting transitional zones. The data quality is high, with minimal missing values and physically plausible ranges for all variables.
Spatially, permafrost fraction increases sharply with latitude, especially between 62° and 68°N, corresponding to the discontinuous permafrost zone. Longitudinal gradients reflect physiographic and climatic heterogeneity, with the coldest permafrost in eastern Yakutia. Temporally, mean permafrost fraction declined from 76% in 2005 to 73% in 2021 (3 percentage points), with the decline accelerating after 2017. Temperature is the dominant control (correlation with permafrost fraction ≈ −0.85), with non-linear threshold behavior near 0°C mean annual temperature.
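As a quick worked check of the temporal trend, the two endpoint values quoted above imply an average rate of change. The sketch below assumes a simple linear average between 2005 and 2021; because the decline is reported to accelerate after 2017, the recent rate would be steeper than this average.

```python
# Endpoint values taken from the text; the linear average is an illustrative simplification.
start_year, end_year = 2005, 2021
start_pct, end_pct = 76.0, 73.0

total_change_pp = end_pct - start_pct                          # change in percentage points
mean_rate_pp_per_year = total_change_pp / (end_year - start_year)

print(f"total change: {total_change_pp:+.1f} pp")
print(f"mean rate: {mean_rate_pp_per_year:+.4f} pp/year")
```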
Figure 2: Temporal Trends. Annual evolution of permafrost fraction and key climate variables.
Figure 3: Feature Correlations. Correlation matrix highlighting the dominant role of temperature and non-linearities near critical thresholds.
Modeling Framework
Stacked Ensemble Architecture
The predictive core is a stacked ensemble comprising Random Forest (RF), Histogram Gradient Boosting (HGB), and Elastic Net (EN) regression. RF captures robust non-linear spatial patterns, HGB efficiently models large-scale non-linearities and categorical features, and EN provides interpretable linear baselines. The stacking meta-learner (ridge regression) combines out-of-fold predictions from spatially stratified five-fold cross-validation, ensuring no spatial or temporal leakage.
A stratified sample of 200,000 observations is used for out-of-fold training, balancing computational tractability and statistical representativeness. Final models are refit on the full training set for deployment.
Figure 4: Ensemble Analysis. Comparative performance of base learners and the stacked ensemble.
Feature Engineering
Thirty-eight features are engineered, including:
- Temporal lags and trends (e.g., lagged temperature, permafrost trend)
- Physics-informed indicators (e.g., above-freezing, risk temperature thresholds)
- Energy balance proxies (e.g., shortwave radiation, wind shear)
- Spatial coordinates and normalized gradients
These features encode domain knowledge, capture threshold effects, and enable the model to learn both instantaneous and lagged climate-permafrost relationships.
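A few of these feature types can be sketched in pandas. The column names (`site_id`, `temp_c`, `permafrost_pct`) and the ±1°C threshold band are hypothetical choices for illustration; the source does not specify the actual definitions.

```python
import pandas as pd

# Toy panel: two sites, three years each (values are made up for illustration).
df = pd.DataFrame({
    "site_id": [1, 1, 1, 2, 2, 2],
    "year": [2005, 2006, 2007, 2005, 2006, 2007],
    "temp_c": [-3.0, -2.5, -1.0, 0.5, 1.0, -0.5],
    "permafrost_pct": [90.0, 88.0, 85.0, 20.0, 15.0, 18.0],
})
df = df.sort_values(["site_id", "year"]).reset_index(drop=True)
by_site = df.groupby("site_id")

# Temporal lag: previous year's temperature at the same site.
df["temp_lag1"] = by_site["temp_c"].shift(1)

# Permafrost trend from past values only (year t-1 minus year t-2),
# so the current-year target never leaks into its own feature.
df["permafrost_trend"] = by_site["permafrost_pct"].shift(1) - by_site["permafrost_pct"].shift(2)

# Physics-informed indicators: above freezing, and within an assumed ±1°C band of 0°C.
df["above_freezing"] = (df["temp_c"] > 0.0).astype(int)
df["near_threshold"] = (df["temp_c"].abs() <= 1.0).astype(int)

print(df)
```

Lags and trends are computed within each site, so the first rows of every site are missing by construction and need to be dropped or imputed before training.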
Figure 5: Feature Importance. Relative contributions of engineered features to model predictions.
RF achieves R² = 0.980 (RMSE = 5.01 pp), HGB R² = 0.976 (RMSE = 5.52 pp), and EN R² = 0.827 (RMSE = 14.88 pp), where pp denotes percentage points of permafrost fraction. The ensemble outperforms individual models, particularly in transitional and extrapolative regimes. Notably, rigorous spatiotemporal cross-validation reveals that naive random splits would have substantially inflated performance metrics.
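For reference, the two reported metrics can be computed as in the sketch below. The observed and predicted values are toy numbers chosen for illustration, not results from the dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of the target (here percentage points)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy permafrost fractions (%): every prediction is off by exactly 5 pp.
y_obs = [100.0, 50.0, 0.0, 75.0]
y_hat = [95.0, 55.0, 5.0, 70.0]

print(f"RMSE = {rmse(y_obs, y_hat):.2f} pp, R² = {r_squared(y_obs, y_hat):.3f}")
```

Because the target spans nearly the full 0 to 100% range, an RMSE of about 5 pp is compatible with an R² close to 0.98, which matches the order of magnitude of the figures quoted above.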
Hybrid Physics-ML Approach for Scenario Projections
To address the extrapolation limitations of ML under future climate scenarios, a hybrid approach is adopted: 60% of the prediction is derived from the ML ensemble, while 40% is contributed by a physically based permafrost sensitivity model (−10 pp/°C). This constrains predictions to physically plausible responses, especially near critical temperature thresholds.
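The 60/40 blend can be illustrated as follows. The weights and the −10 pp/°C sensitivity come from the text; applying the sensitivity linearly to the baseline fraction and clipping the result to 0–100% are assumptions made for this sketch, and the function name is hypothetical.

```python
import numpy as np

ML_WEIGHT = 0.6                 # share of the ML ensemble (from the text)
PHYSICS_WEIGHT = 0.4            # share of the physical sensitivity model (from the text)
SENSITIVITY_PP_PER_C = -10.0    # permafrost response per °C of warming (from the text)

def hybrid_projection(baseline_pct, ml_pred_pct, delta_t_c):
    """Blend an ML projection with a linear physical response (assumed form)."""
    baseline = np.asarray(baseline_pct, float)
    physics = baseline + SENSITIVITY_PP_PER_C * np.asarray(delta_t_c, float)
    blended = ML_WEIGHT * np.asarray(ml_pred_pct, float) + PHYSICS_WEIGHT * physics
    # Assumed safeguard: keep the result a valid percentage.
    return np.clip(blended, 0.0, 100.0)

# Baseline 80%, ML projects 70%, warming of 2°C:
# physics = 80 - 20 = 60, blend = 0.6 * 70 + 0.4 * 60 = 66.
print(hybrid_projection(80.0, 70.0, 2.0))
```

With this form, the physical term pulls the projection toward a monotonic response to warming even where the ML ensemble extrapolates poorly.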
Three Representative Concentration Pathways (RCPs) are considered, including RCP8.5.
Infrastructure Risk Assessment
A quantile-based risk classification is implemented, integrating projected permafrost decline, baseline fraction, temperature proximity to thresholds, and model uncertainty. Locations are classified as low (60%), medium (25%), or high risk (15%). High-risk zones are concentrated between 59.5° and 62.5°N, corresponding to the southern permafrost margin and the zero-degree isotherm.
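The quantile cut-offs implied by the 60/25/15 split can be sketched as below. How the four inputs are combined into a single score is not given in the source, so the function simply takes a precomputed composite score (higher meaning riskier) as a hypothetical input.

```python
import numpy as np

def classify_risk(score):
    """Assign low/medium/high labels using the 60th and 85th percentiles of the score."""
    score = np.asarray(score, float)
    q60, q85 = np.quantile(score, [0.60, 0.85])
    labels = np.array(["low", "medium", "high"])
    # right=True: score <= q60 is low, q60 < score <= q85 is medium, score > q85 is high.
    return labels[np.digitize(score, [q60, q85], right=True)]

# Illustrative composite scores for 100 locations.
scores = np.arange(100)
classes = classify_risk(scores)
values, counts = np.unique(classes, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```

Because the thresholds are quantiles of the score distribution itself, the class shares are fixed by design. The labels therefore rank locations relative to one another rather than against an absolute damage threshold.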
Figure 7: Risk Distribution. Distribution of risk classes across the study domain.
Figure 8: Spatial Risk Maps. Spatial distribution of high, medium, and low risk zones for infrastructure.
Uncertainty Quantification
Uncertainty is quantified via the standard deviation of ensemble predictions. The highest uncertainty is observed near the zero-change boundary and in regions with complex terrain or sparse data. Median uncertainty is 4–5 pp, with the 95th percentile at 14–15 pp. RCP8.5 exhibits systematically higher uncertainty due to greater extrapolation.
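A minimal sketch of this spread-based uncertainty measure follows. It assumes one row of predictions per ensemble member and uses the population standard deviation (`ddof=0`); the source does not state which convention was used, and the numbers are toy values.

```python
import numpy as np

def ensemble_uncertainty(member_preds):
    """Per-location standard deviation across ensemble members.

    member_preds has shape (n_members, n_locations).
    """
    preds = np.asarray(member_preds, float)
    return preds.std(axis=0)   # ddof=0 is an assumption

# Toy predictions (%) from three members at three locations.
preds = np.array([
    [70.0, 40.0, 10.0],
    [72.0, 50.0, 10.0],
    [68.0, 30.0, 10.0],
])
sigma = ensemble_uncertainty(preds)

print("per-location uncertainty (pp):", np.round(sigma, 2))
print("median:", round(float(np.median(sigma)), 2))
print("95th percentile:", round(float(np.percentile(sigma, 95)), 2))
```

With only three base learners, this spread measures disagreement between model families rather than a calibrated predictive interval.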
Figure 9: Uncertainty Analysis. Relationship between projected change and ensemble prediction uncertainty.
Implications and Future Directions
This framework delivers the first operational, open-source, hybrid physics-ML system for pan-Arctic permafrost risk assessment at infrastructure-relevant scales. The approach is generalizable to other permafrost regions and Earth system applications requiring robust extrapolation beyond historical data. The explicit uncertainty quantification and scenario analysis enable risk-based engineering and adaptation planning.
Key implications include:
- Operational Decision Support: The system provides actionable, uncertainty-aware risk maps for infrastructure managers and policymakers.
- Methodological Rigor: The results demonstrate that rigorous spatiotemporal validation is essential for honest generalization, with naive splits leading to overconfident metrics.
- Hybrid Modeling: The results show that combining ML with physical constraints is necessary for reliable climate change projections, especially in threshold-dominated systems.
- Scalability: The framework is computationally efficient and can be extended to incorporate new data sources, process-based models, and finer temporal forecasting.
Future work should integrate regionalized climate model projections, process-based permafrost models, and explicit representation of feedbacks and delayed responses. Ensemble-based climate uncertainty quantification and extension to year-by-year projections would further enhance operational utility.
Conclusion
The presented hybrid physics-ML framework establishes a new standard for permafrost risk modeling, combining unprecedented data scale, methodological rigor, and operational relevance. By bridging the gap between data-driven and physically based approaches, it enables robust, uncertainty-aware infrastructure risk assessment under climate change, with immediate applicability to Arctic adaptation planning and broader Earth system modeling challenges.