Pan-Arctic LST Dataset: AVHRR Super-Resolved Archive

Updated 29 November 2025
  • Pan-Arctic LST Dataset is a four-decade high-resolution land surface temperature archive created to monitor Arctic climate dynamics.
  • It downscales coarse 4 km AVHRR data to a 1 km grid using a deep learning–driven anisotropic diffusion model with high-res topographic and land-cover guides.
  • The dataset facilitates applications such as permafrost modeling, glacier retreat analysis, and energy balance assessments in Arctic regions.

The Pan-Arctic Land Surface Temperature (LST) Dataset—formally designated as the AVHRR Super-Resolved (SR) LST product—constitutes a four-decade, twice-daily, 1 km gridded record of land surface temperature over the entire Northern Hemisphere north of 50° N. Extending from August 1981 to December 2023, it delivers the longest and most spatially detailed satellite-based pan-Arctic LST archive compiled to date. The dataset was generated by downscaling coarse-resolution Advanced Very High Resolution Radiometer (AVHRR) Global Area Coverage (GAC) LST data via a deep learning–driven super-resolution approach, anchored by high-resolution topographic and land-cover guides. This resource addresses the need for high-resolution climate products to analyze Arctic permafrost dynamics, surface energy balance, and climate variability prior to the MODIS era (Dupuis et al., 21 Nov 2025).

1. Scope, Temporal Span, and Spatial Characteristics

The dataset spans 42 years, from August 1981 through December 2023, and provides two clear-sky LST snapshots per day, corresponding to daytime and nighttime AVHRR overpasses for each satellite. Its spatial domain encompasses the entire terrestrial Arctic north of 50° N, covering longitudes –180° to +180° and latitudes 50° to 90°.

Native AVHRR GAC data have a nominal ground sampling distance (GSD) of 0.05° (approximately 4 km). The super-resolved product (AVHRR SR LST) refines this to 0.01° (approximately 1 km GSD), a fivefold increase in angular grid resolution along each axis, while preserving the native twice-daily temporal sampling. The following table summarizes core dataset properties:

| Property | Native AVHRR GAC | AVHRR SR LST (super-resolved) |
| --- | --- | --- |
| Spatial resolution | 0.05° (~4 km) GSD | 0.01° (~1 km) GSD |
| Temporal frequency | Twice daily | Twice daily |
| Spatial extent | North of 50° N | North of 50° N |
| Time period | 1981–2023 | 1981–2023 |
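The grid sizes implied by these properties can be worked out directly. The sketch below assumes a regular lat-lon grid covering exactly –180° to +180° and 50° to 90°; the function name `grid_shape` is illustrative and not part of the product.

```python
def grid_shape(lon_min, lon_max, lat_min, lat_max, step_deg):
    """Return (n_lat, n_lon) for a regular lat-lon grid with the given cell size."""
    n_lon = round((lon_max - lon_min) / step_deg)
    n_lat = round((lat_max - lat_min) / step_deg)
    return n_lat, n_lon

# Domain stated in the text: longitudes -180..180, latitudes 50..90
native = grid_shape(-180, 180, 50, 90, 0.05)   # native GAC grid
sr = grid_shape(-180, 180, 50, 90, 0.01)       # super-resolved grid

linear_factor = sr[0] // native[0]                           # refinement per axis
cell_factor = (sr[0] * sr[1]) // (native[0] * native[1])     # refinement in cell count

print(native, sr, linear_factor, cell_factor)
```

Under these assumptions, the fivefold refinement per axis corresponds to 25 times as many grid cells per scene.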

2. Source Data and Preprocessing

Fundamental data inputs consist of AVHRR GAC Level-1b brightness temperatures from the NOAA-7/9/11/12/14/16/17/18/19 and MetOp-A/B/C satellites. LST is derived from these data using the generalized split-window (GSW) algorithm, with updated snow cover corrections (SWE/FSC v3.1/4.0). The primary limitations of the native 4 km GSD product include heterogeneous subpixel mixing, unresolved small-scale permafrost and vegetation patterns, and data gaps caused by persistent cloud cover.

To guide the super-resolution process, several high-resolution static ancillary layers—regridded to 0.01°—are employed:

  • Copernicus DEM GLO-90 from TanDEM-X (topography)
  • ESA CCI Land Cover 2005 (22 classes, upscaled via majority voting)
  • GEDI + Sentinel-2 canopy height map

These guide images encode fine-scale surface heterogeneity and serve as constraints for spatial sharpening of the LST signal.
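As an illustration of how a categorical guide such as land cover can be brought to a coarser grid by majority voting, the following is a minimal NumPy sketch. It assumes an integer class map and an integer aggregation factor; the function name, the toy array, and the tie-breaking rule (lowest class index wins) are assumptions for illustration, not the authors' preprocessing code.

```python
import numpy as np

def majority_downsample(classes, factor, n_classes):
    """Aggregate an integer class map by block-wise majority vote."""
    h, w = classes.shape
    h_out, w_out = h // factor, w // factor
    # Crop so the map divides evenly into factor x factor blocks
    blocks = classes[:h_out * factor, :w_out * factor]
    # Rearrange to (h_out, w_out, factor * factor): one row of votes per output cell
    blocks = blocks.reshape(h_out, factor, w_out, factor).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(h_out, w_out, factor * factor)
    # Count votes for every class; argmax breaks ties toward the lowest class index
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_classes)], axis=-1)
    return counts.argmax(axis=-1)

# Toy 4 x 4 class map with 4 classes, aggregated by a factor of 2
fine = np.array([[1, 1, 2, 2],
                 [1, 0, 2, 3],
                 [0, 0, 3, 3],
                 [0, 2, 3, 1]])
coarse = majority_downsample(fine, factor=2, n_classes=4)
print(coarse)
```

Majority voting keeps the output categorical, which averaging class codes would not.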

3. Super-Resolution Methodology

The downscaling algorithm has two parts: an anisotropic diffusion model that provides the mathematical basis, and a deep learning implementation termed "DADA" (Deep Anisotropic Diffusion for LST, Editor's term), with an architecture derived from Metzger et al. (2023).

The core evolution equation,

$$\frac{\partial u}{\partial t} = \nabla \cdot \left( D(x) \nabla u \right),$$

steers diffusion of the LST field $u(x, t)$ according to a position-dependent tensor $D(x)$ derived from the gradient structure of the guide images. The Perona–Malik style function $g(s) = 1/(1 + (s/\kappa)^2)$ is employed to suppress diffusion across significant surface-type discontinuities.
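To make the role of the guide concrete, here is a minimal NumPy sketch of one explicit time step of guided diffusion. It is a simplification under stated assumptions: the tensor $D(x)$ is replaced by a scalar edge weight $g$ computed from differences of a single guide channel, and the function names, `kappa`, and `dt` are illustrative choices, not the DADA implementation.

```python
import numpy as np

def diffusivity(s, kappa):
    """Perona-Malik function g(s) = 1 / (1 + (s / kappa)^2)."""
    return 1.0 / (1.0 + (s / kappa) ** 2)

def guided_diffusion_step(u, guide, kappa=1.0, dt=0.2):
    """One explicit step of du/dt = div(g * grad u), with g taken from the guide."""
    # Edge weights between vertical and horizontal neighbours, from guide differences
    g_v = diffusivity(np.abs(np.diff(guide, axis=0)), kappa)  # shape (H-1, W)
    g_h = diffusivity(np.abs(np.diff(guide, axis=1)), kappa)  # shape (H, W-1)
    # Flux along each edge
    f_v = g_v * np.diff(u, axis=0)
    f_h = g_h * np.diff(u, axis=1)
    # Discrete divergence with zero flux through the image border
    div = np.zeros_like(u, dtype=float)
    div[:-1, :] += f_v
    div[1:, :] -= f_v
    div[:, :-1] += f_h
    div[:, 1:] -= f_h
    return u + dt * div

# Guide with a sharp boundary between columns 1 and 2 (hypothetical surface types)
guide = np.zeros((4, 4))
guide[:, 2:] = 100.0
# Noisy field: about 1 on the left of the boundary, about 11 on the right
u = np.array([[0, 2, 10, 12],
              [2, 0, 12, 10],
              [0, 2, 10, 12],
              [2, 0, 12, 10]], dtype=float)

smoothed = u.copy()
for _ in range(20):
    smoothed = guided_diffusion_step(smoothed, guide, kappa=1.0, dt=0.2)
print(np.round(smoothed, 2))
```

In this toy case the field is smoothed within each side of the boundary, while almost no heat crosses the guide edge because $g$ is close to zero there. The scheme is conservative, so the field mean is unchanged.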

The DADA deep network follows a U-Net encoder–decoder structure with a ResNet-50 backbone, skip connections, and multi-scale fusion. Input consists of the three guide channels concatenated with a coarsened LST patch (5× pooling). Output is a 1 km LST estimate. Supervision is provided by MODIS LST L3C (1 km) and IRCDR L3S (1 km), with the input coarsened via NaN-aware average pooling to simulate the AVHRR GAC resolution.
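The NaN-aware average pooling used to simulate the coarse input can be sketched as follows. This is one possible NumPy formulation, assuming missing pixels (for example cloud) are stored as NaN and that a coarse cell is valid when at least one fine pixel in it is valid; the function name and the validity rule are assumptions.

```python
import numpy as np

def nan_avg_pool(lst, factor=5):
    """Average-pool a 2-D field by `factor`, ignoring NaN pixels in each block."""
    h, w = lst.shape
    h_out, w_out = h // factor, w // factor
    # Crop to a multiple of the factor and expose the blocks as separate axes
    blocks = lst[:h_out * factor, :w_out * factor].reshape(h_out, factor, w_out, factor)
    valid = ~np.isnan(blocks)
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    count = valid.sum(axis=(1, 3))
    # Blocks without any valid pixel stay NaN
    out = np.full((h_out, w_out), np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out

# Toy 10 x 10 field with one missing pixel and one fully missing block
fine = np.arange(100, dtype=float).reshape(10, 10)
fine[0, 0] = np.nan
fine[5:, 5:] = np.nan
coarse = nan_avg_pool(fine, factor=5)
print(coarse)
```

Dividing by the per-block count of valid pixels, rather than by the block size, avoids biasing partially cloudy cells toward zero.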

Key training parameters include 240×240-pixel patches, batch size 8, 400 random patches per epoch, Adam optimizer, learning rate stepped every 150 epochs, and approximately 150,000 total iterations (about 3,000 epochs). Data augmentation incorporates random flips and Planckian color jitter on the guide images.
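The stated iteration and epoch counts are consistent with each other, as the short calculation below shows. It assumes that one iteration processes one batch and that every patch in an epoch is used exactly once.

```python
patches_per_epoch = 400
batch_size = 8
total_iterations = 150_000
lr_step_epochs = 150

# One iteration = one batch (assumption)
iterations_per_epoch = patches_per_epoch // batch_size
epochs = total_iterations // iterations_per_epoch
# Number of 150-epoch intervals covered by the full schedule
lr_intervals = epochs // lr_step_epochs

print(iterations_per_epoch, epochs, lr_intervals)
```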

Varying the hyperparameters (sampler_length, learning rate step size, ResNet backbone depth) yields near-identical RMSE and MAE, which indicates that the architecture is robust to these choices.

4. Validation and Uncertainty Assessment

Product accuracy is assessed by three principal strategies:

  1. MODIS Hold-Out Evaluation: Super-resolved LST is compared to native MODIS LST (1 km) over eight geographically independent Pan-Arctic scenes (Alps, Siberia, North America; years 2016–2018/2020). Results show MAE ≈ 1.15°C and RMSE ≈ 2.30°C across over 8 million pixels. Most residual errors cluster near water edges or sharp thermal transitions. Errors >5°C are rare and generally attributable to sensor artifacts. Homogeneous regions such as the Greenland Ice Sheet present slightly elevated errors due to the lack of surface-type variation in the guide.
  2. In-Situ Cross-Validation: Comparisons against 16 ground stations (SURFRAD, KIT, ARM NSA, BSRN, LAW) spanning cropland, shrubland, tundra, and ice yield a day/night median deviation of ≈ 0°C, an RMSE of ~2–3°C, and a robust standard deviation of ~1.5°C.
  3. External Intercomparison: Comparison with LSA SAF EDLST (0.01° MetOp series) for 2020 daily composites over four Pan-Arctic subregions shows differences centered near 0°C, with slight regional biases (<1.2°C). Discrepancies are ascribed to differences in compositing protocol and masking.
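The summary statistics quoted for the station comparison can be computed along the following lines. The source does not state how the robust standard deviation is defined; the sketch assumes the common choice of 1.4826 times the median absolute deviation (MAD). The sample values are made up purely to exercise the function.

```python
import numpy as np

def robust_summary(satellite_lst, station_lst):
    """Median deviation, MAD-based robust std, and RMSE of satellite minus station."""
    diff = np.asarray(satellite_lst, dtype=float) - np.asarray(station_lst, dtype=float)
    diff = diff[~np.isnan(diff)]  # drop pairs with a missing value
    median = np.median(diff)
    # 1.4826 * MAD equals the standard deviation for normally distributed errors
    robust_std = 1.4826 * np.median(np.abs(diff - median))
    rmse = np.sqrt(np.mean(diff ** 2))
    return median, robust_std, rmse

# Illustrative values in Kelvin; the fourth pair is a deliberate outlier
sat = [271.0, 272.5, 270.0, 280.0, np.nan]
stn = [270.0, 272.0, 271.0, 270.0, 269.0]
median, robust_std, rmse = robust_summary(sat, stn)
print(median, robust_std, rmse)
```

A single outlier inflates the RMSE far more than the MAD-based estimate, which is one reason both statistics are usually reported together.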

Algorithm performance relative to the bicubic and source-resolution baselines is summarized in the following table:

| Algorithm | MAE (°C) | RMSE (°C) |
| --- | --- | --- |
| DADA (default, ResNet-50) | 1.150 | 2.297 |
| Bicubic baseline | 1.244 | 2.396 |
| Coarse input ("source") | 1.283 | 2.440 |
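For reference, MAE and RMSE over valid pixels can be computed as in the sketch below, which also expresses the tabulated scores as relative improvements over the baselines. The function names are illustrative; the numeric inputs to `relative_gain` are the values from the table above.

```python
import numpy as np

def mae_rmse(pred, ref):
    """MAE and RMSE over pixels where both fields are finite."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    diff = diff[np.isfinite(diff)]
    return float(np.mean(np.abs(diff))), float(np.sqrt(np.mean(diff ** 2)))

def relative_gain(baseline, model):
    """Percentage reduction of an error score relative to a baseline."""
    return 100.0 * (baseline - model) / baseline

# Toy check of the metric function (third pixel is missing in the prediction)
mae, rmse = mae_rmse([1.0, 2.0, np.nan, 4.0], [0.0, 2.0, 3.0, 6.0])

# Relative improvements implied by the tabulated scores
mae_gain_vs_bicubic = relative_gain(1.244, 1.150)
rmse_gain_vs_bicubic = relative_gain(2.396, 2.297)
mae_gain_vs_source = relative_gain(1.283, 1.150)

print(round(mae_gain_vs_bicubic, 1), round(rmse_gain_vs_bicubic, 1), round(mae_gain_vs_source, 1))
```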

5. Access, Data Formatting, and Tools

The dataset is distributed as NetCDF-4 files with CF-1.6 conventions, referenced to a WGS84 lat–lon grid (–180° to +180°, 50° to 90°). Each file contains the super-resolved LST (0.01° GSD, Kelvin units, scaled by 0.01 and offset 273.15), the native GAC LST (0.05° GSD, same scaling), and auxiliary variables including scanline time, satellite and solar zenith angles, and quality flags (test_mae, r2).
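If the LST variables are stored as packed integers, the stated scale and offset would be applied following the usual CF packing rule, value = packed × scale_factor + add_offset. The sketch below assumes exactly that reading; the fill value of -32768 and the function name are hypothetical, so the file attributes should be checked before relying on it.

```python
import numpy as np

SCALE_FACTOR = 0.01   # from the text
ADD_OFFSET = 273.15   # from the text

def decode_lst(packed, fill_value=-32768):
    """Convert packed integer LST to Kelvin; fill_value is a hypothetical placeholder."""
    packed = np.asarray(packed)
    values = packed.astype(np.float64) * SCALE_FACTOR + ADD_OFFSET
    # Replace the missing-data marker by NaN
    return np.where(packed == fill_value, np.nan, values)

packed = np.array([0, -1000, 1500, -32768], dtype=np.int16)
kelvin = decode_lst(packed)
print(kelvin)
```

Readers such as xarray and netCDF4 normally apply `scale_factor` and `add_offset` attributes automatically when they are present, in which case no manual decoding is needed.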

File organization follows a year/satellite directory structure (e.g., for NOAA-18), with 730 files per year (732 for leap years). Filenames encode timestamp, satellite, overpass type (DAY/NIGHT), and version.
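The file counts follow from two overpass files (DAY and NIGHT) per calendar day. A small check, assuming a complete year for a single satellite (the first year, 1981, starts in August and would hold fewer files):

```python
import calendar

def files_per_year(year, overpasses_per_day=2):
    """Expected number of files for a complete year: one per overpass type per day."""
    days = 366 if calendar.isleap(year) else 365
    return days * overpasses_per_day

print(files_per_year(2023), files_per_year(2020))
```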

Access is provided via:

Recommended tools for interaction include Python xarray, netCDF4, rioxarray, and TorchGeo.

6. Scientific Applications and Advantages

The AVHRR SR LST dataset is optimized for:

  • Permafrost modeling: Fine-scale (1 km) LST resolves subpixel vegetation, topographic heterogeneity, and small water bodies, enabling improved simulation of talik development and thaw-settlement processes.
  • Near-surface air temperature reconstruction: Finer LST fields enhance LST–T2M regression performance at landscape scale.
  • Assessment of Greenland Ice Sheet surface mass balance: High-resolution LST is suitable for forcing ablation and SMB energy-balance models, resolving narrow ablation zones.
  • Glacier retreat, ecosystem thermal stress, and winter warming analysis: Enhanced detection of small lakes (<4 km), landscape mosaics, and rapid thermal gradients reduces smoothing artifacts in derived energy fluxes.

A critical advantage is the capacity to fill the observational gap for high-resolution LST in the pre-MODIS (before 2000) era, providing a uniform long-term resource for Arctic climate monitoring.

7. Prospects for Future Use and Development

The AVHRR SR LST methodological framework is designed to be adaptable to new thermal infrared (TIR) missions, such as SBG and TRISHNA, by retraining on future LST–guide co-datasets. This is intended to support continuity and consistency in the pan-Arctic climate data record across sensor platforms.

Anticipated enhancements include dynamic, time-variable guide layers (e.g., updated land cover, seasonal snow), which could further refine the accuracy and responsiveness of the super-resolution network to surface changes.

For full technical details, code, and visualizations, see Dupuis et al. (21 Nov 2025) and the project's GitHub repository (github.com/soniajdupuis/Enhanced_pan_Arctic_LST).
