
TrajGRU: Dynamic Nowcasting Model

Updated 26 February 2026
  • TrajGRU is a deep learning model that dynamically learns sparse, location-variant trajectories to capture complex motion patterns in spatiotemporal data.
  • The model employs an encoding–forecasting framework with a structure generator that predicts offsets for differentiable bilinear warping of hidden states.
  • When trained with balanced losses and online adaptation, TrajGRU outperforms conventional ConvGRU, achieving a better trade-off between parameter efficiency and nowcasting skill.

The Trajectory Gated Recurrent Unit (TrajGRU) is a deep recurrent neural network architecture designed to address location-variant spatiotemporal process modeling, with its most prominent application to high-resolution precipitation nowcasting using radar echo maps. TrajGRU extends the convolutional Gated Recurrent Unit (ConvGRU) by dynamically learning sparse, location-dependent recurrent connections—"trajectories"—thereby enabling the network to better represent complex motion patterns such as rotation and scaling, which are ubiquitous in meteorological data but not well-captured by location-invariant convolutional recurrence.

1. Problem Formulation and Model Context

Precipitation nowcasting, the short-range prediction of rainfall intensity from recent radar echoes, is formulated as a spatiotemporal sequence prediction problem. A radar volume scan is represented as a sequence $\{\mathcal I_t\}$ of CAPPI echo maps. Given the $J$ most recent frames, $\mathcal I_{t-J+1},\dots,\mathcal I_t$, the task is to predict the subsequent $K$ frames, $\hat{\mathcal I}_{t+1},\dots,\hat{\mathcal I}_{t+K}$. TrajGRU is implemented within an encoding–forecasting framework comprising $n$ stacked recurrent layers:

  • Encoder: Processes the $J$ input frames to produce a hierarchy of hidden states, $\mathcal H_t^1,\dots,\mathcal H_t^n$, at progressively coarser spatial scales via downsampling.
  • Forecaster: Unfolds these hidden states temporally, upsampling back to full resolution to generate the future frames.

This architecture generalizes previous models such as ConvLSTM and ConvGRU, which perform state-to-state updates via location-invariant convolutions, a limitation for capturing natural motion present in atmospheric sequences (Shi et al., 2017).

2. ConvGRU Recurrence and Limitations

In the ConvGRU model, the evolution of the hidden state $\mathcal H_t$ is determined by 2D convolutional operations applied identically at all spatial locations. The gates and update equations are:

$$
\begin{aligned}
\mathcal Z_t &= \sigma\bigl(\mathcal W_{xz}\ast\mathcal X_t + \mathcal W_{hz}\ast\mathcal H_{t-1}\bigr),\\
\mathcal R_t &= \sigma\bigl(\mathcal W_{xr}\ast\mathcal X_t + \mathcal W_{hr}\ast\mathcal H_{t-1}\bigr),\\
\mathcal H'_t &= f\bigl(\mathcal W_{xh}\ast\mathcal X_t + \mathcal R_t\circ(\mathcal W_{hh}\ast\mathcal H_{t-1})\bigr),\\
\mathcal H_t &= (1-\mathcal Z_t)\circ\mathcal H'_t + \mathcal Z_t\circ\mathcal H_{t-1},
\end{aligned}
$$

where $\ast$ denotes 2D convolution, $\circ$ is element-wise multiplication, $\sigma$ is the sigmoid function, and $f$ is typically leaky ReLU. Both input–hidden and hidden–hidden transformations are spatially invariant. This design prevents the model from adapting its connections to local motions and deformations present in phenomena such as precipitation (Shi et al., 2017).
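
To make the gating concrete, the following is a minimal NumPy sketch of one ConvGRU step (an illustration, not the paper's implementation). For brevity it uses $1\times1$ kernels, i.e., per-pixel linear maps, in place of full $K\times K$ convolutions, so only the recurrent gating structure is shown:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def conv1x1(W, t):
    # A 1x1 convolution is a per-pixel linear map over channels.
    # W: (C_out, C_in), t: (C_in, H, W) -> (C_out, H, W)
    return np.einsum('oc,chw->ohw', W, t)

def convgru_step(x, h_prev, p):
    """One ConvGRU update; p holds the six gate weight matrices."""
    z = sigmoid(conv1x1(p['Wxz'], x) + conv1x1(p['Whz'], h_prev))  # update gate
    r = sigmoid(conv1x1(p['Wxr'], x) + conv1x1(p['Whr'], h_prev))  # reset gate
    h_cand = leaky_relu(conv1x1(p['Wxh'], x) + r * conv1x1(p['Whh'], h_prev))
    return (1 - z) * h_cand + z * h_prev

# Toy example: 2 input channels, 3 hidden channels, a 4x4 grid.
rng = np.random.default_rng(0)
c_in, c_h, H, W = 2, 3, 4, 4
p = {k: rng.normal(scale=0.1, size=(c_h, c_in if k.startswith('Wx') else c_h))
     for k in ('Wxz', 'Whz', 'Wxr', 'Whr', 'Wxh', 'Whh')}
x = rng.normal(size=(c_in, H, W))
h = convgru_step(x, np.zeros((c_h, H, W)), p)
print(h.shape)  # (3, 4, 4)
```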

3. TrajGRU Architecture: Location-Variant Recurrence

TrajGRU generalizes ConvGRU by replacing the fixed hidden–hidden convolution $\mathcal W_{hh}\ast\mathcal H_{t-1}$ with a sparse, learned sampling of $\mathcal H_{t-1}$ at dynamically predicted offsets (trajectories) specific to each spatial position and timestep. The structure generator $\gamma$, a lightweight two-layer convolutional network, computes $L$ offset fields at each location:

$$
(\mathcal U_t,\mathcal V_t) = \gamma(\mathcal X_t,\,\mathcal H_{t-1}),
$$

with $\mathcal U_t, \mathcal V_t \in \mathbb{R}^{L \times H \times W}$. For each trajectory $l \in \{1,\dots,L\}$, the previous state is sampled at offset locations via differentiable bilinear warping:

$$
\mathrm{warp}(\mathcal H_{t-1},\,\mathcal U_{t,l},\,\mathcal V_{t,l}) = \mathcal H_{t-1}(x+\mathcal V_{t,l},\,y+\mathcal U_{t,l}).
$$

The updated gates and candidate state are then constructed as follows:

$$
\begin{aligned}
\mathcal Z_t &= \sigma\left(\mathcal W_{xz}\ast\mathcal X_t + \sum_{l=1}^{L}\mathcal W_{hz}^l \ast \mathrm{warp}(\mathcal H_{t-1},\,\mathcal U_{t,l},\,\mathcal V_{t,l})\right),\\
\mathcal R_t &= \sigma\left(\mathcal W_{xr}\ast\mathcal X_t + \sum_{l=1}^{L}\mathcal W_{hr}^l \ast \mathrm{warp}(\mathcal H_{t-1},\,\mathcal U_{t,l},\,\mathcal V_{t,l})\right),\\
\mathcal H'_t &= f\left(\mathcal W_{xh}\ast\mathcal X_t + \mathcal R_t \circ \left(\sum_{l=1}^{L}\mathcal W_{hh}^l \ast \mathrm{warp}(\mathcal H_{t-1},\,\mathcal U_{t,l},\,\mathcal V_{t,l})\right)\right),\\
\mathcal H_t &= (1-\mathcal Z_t)\circ\mathcal H'_t + \mathcal Z_t\circ\mathcal H_{t-1}.
\end{aligned}
$$

Key architectural elements include the use of $1 \times 1$ convolutions for the $\mathcal W^l$ parameters associated with each trajectory, a number of links $L$ typically much smaller than the $K^2$ connections of a full $K \times K$ ConvGRU kernel, and the structure generator $\gamma$ parameterized as a two-layer convolutional network (e.g., 32 channels, $5 \times 5$ kernels). This approach enables TrajGRU to adaptively align its recurrent connectivity with location-variant scene dynamics (Shi et al., 2017).
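
The warping operation itself is straightforward to sketch. Below is an illustrative NumPy implementation of bilinear warping and of the TrajGRU hidden-to-hidden term (the sum over trajectories of $1\times1$-convolved warped states). Function names are ours; a real model would use a framework's differentiable grid-sampling op (e.g., `torch.nn.functional.grid_sample`) so that gradients flow back into $\gamma$:

```python
import numpy as np

def bilinear_warp(h, u, v):
    """Sample feature map h at positions (y+u, x+v) with bilinear interpolation.
    h: (C, H, W); u, v: (H, W) vertical/horizontal offset fields."""
    C, H, W = h.shape
    yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    ys = np.clip(yy + u, 0, H - 1)
    xs = np.clip(xx + v, 0, W - 1)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy, wx = ys - y0, xs - x0
    # Interpolate along x on the top and bottom rows, then along y.
    top = h[:, y0, x0] * (1 - wx) + h[:, y0, x1] * wx
    bot = h[:, y1, x0] * (1 - wx) + h[:, y1, x1] * wx
    return top * (1 - wy) + bot * wy

def traj_hidden_term(h_prev, offsets, W_h):
    """Sum over L trajectories: sum_l W_h[l] (1x1 conv) applied to the warped state.
    offsets: list of L (u, v) pairs; W_h: (L, C_out, C_in)."""
    return sum(np.einsum('oc,chw->ohw', W_h[l], bilinear_warp(h_prev, u, v))
               for l, (u, v) in enumerate(offsets))

# Zero offsets recover the identity warp:
rng = np.random.default_rng(1)
h = rng.normal(size=(3, 5, 5))
zero = np.zeros((5, 5))
print(np.allclose(bilinear_warp(h, zero, zero), h))  # True
```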

4. Training Loss and Objective Function

Effective training of precipitation nowcasting models must account for the inherent class imbalance in rainfall rates: light rainfall is frequent, while heavy rainfall, though rarer, is more critical operationally. Shi et al. (2017) therefore introduce a pixel-wise, value-dependent weighting scheme:

$$
w(x) = \begin{cases}
1 & x < 2, \\
2 & 2 \le x < 5, \\
5 & 5 \le x < 10, \\
10 & 10 \le x < 30, \\
30 & x \ge 30,
\end{cases}
$$

where $x$ is the rain rate in mm/h; pixels masked due to missing data or identified as outliers are assigned $w=0$. The loss over $N$ frames of size $H \times W$ combines two balanced regression objectives:

$$
\mathrm{B\text{-}MSE} = \frac{1}{N}\sum_{n,i,j} w_{n,i,j}\,(x_{n,i,j}-\hat x_{n,i,j})^2,
\qquad
\mathrm{B\text{-}MAE} = \frac{1}{N}\sum_{n,i,j} w_{n,i,j}\,\lvert x_{n,i,j}-\hat x_{n,i,j}\rvert.
$$

The final offline loss is $\mathrm{Loss} = \mathrm{B\text{-}MSE} + \mathrm{B\text{-}MAE}$. For online adaptation (incremental fine-tuning), the same objective is optimized via, e.g., AdaGrad (Shi et al., 2017).
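
The weighting and balanced losses are simple to implement; here is a NumPy sketch (helper names are ours):

```python
import numpy as np

# Rain-rate bin edges (mm/h) and the weight assigned to each bin.
THRESHOLDS = np.array([2.0, 5.0, 10.0, 30.0])
WEIGHTS = np.array([1.0, 2.0, 5.0, 10.0, 30.0])

def rain_weight(x, mask=None):
    """Piecewise weight w(x); masked (missing/outlier) pixels get w = 0."""
    w = WEIGHTS[np.searchsorted(THRESHOLDS, x, side='right')]
    if mask is not None:
        w = np.where(mask, w, 0.0)
    return w

def balanced_loss(x, x_hat, mask=None):
    """B-MSE + B-MAE, averaged over the N frames (leading axis of x)."""
    w = rain_weight(x, mask)
    n = x.shape[0]
    b_mse = np.sum(w * (x - x_hat) ** 2) / n
    b_mae = np.sum(w * np.abs(x - x_hat)) / n
    return b_mse + b_mae

# One 1x2 frame: the heavy-rain pixel (6 mm/h, weight 5) dominates the loss.
x = np.array([[[1.0, 6.0]]])
x_hat = np.array([[[1.0, 5.0]]])
print(balanced_loss(x, x_hat))  # 10.0
```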

5. HKO-7 Radar Benchmark and Preprocessing

The evaluation of TrajGRU and baselines employs the HKO-7 benchmark, constructed from Hong Kong Observatory CAPPI reflectivity (dBZ) radar mosaics at 2 km altitude, covering a 512 km $\times$ 512 km area (480 $\times$ 480 pixels) at 6-minute intervals. The dataset spans 2009–2015 with the following splits:

| Subset | Years | Days | Frames |
|---|---|---|---|
| Training | 2009–2014 | 812 | ~192,168 |
| Validation | 2009–2014 | 50 | ~11,736 |
| Testing | 2015 | 131 | ~31,350 |

Preprocessing involves masking outliers (e.g., ground clutter, sun spikes) using the Mahalanobis distance on per-pixel reflectivity histograms, then discarding reflectivity values outside $[1, 70]$ dBZ. Rain rates $R$ (mm/h) are related to reflectivity via the Marshall–Palmer $Z$–$R$ relationship $Z = a R^b$ with $a = 58.53$, $b = 1.56$, i.e.,

$$
\mathrm{dBZ} = 10\log_{10}(58.53) + 15.6\,\log_{10}(R).
$$

(Shi et al., 2017)
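
The dBZ-to-rain-rate conversion and its inverse follow directly from the $Z$–$R$ coefficients; an illustrative sketch:

```python
import numpy as np

A, B = 58.53, 1.56  # Marshall-Palmer coefficients: Z = A * R**B

def rain_to_dbz(r):
    """Rain rate (mm/h) -> reflectivity: dBZ = 10*log10(A) + 10*B*log10(R)."""
    return 10.0 * np.log10(A) + 10.0 * B * np.log10(r)

def dbz_to_rain(dbz):
    """Invert the Z-R relation: R = (Z / A)**(1/B), with Z = 10**(dBZ/10)."""
    return (10.0 ** (dbz / 10.0) / A) ** (1.0 / B)

# Round trip at 10 mm/h:
print(round(dbz_to_rain(rain_to_dbz(10.0)), 6))  # 10.0
```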

6. Evaluation Protocol, Metrics, and Benchmarks

Evaluation is conducted under two scenarios:

  • Offline: Each test sequence comprises the latest 5 input frames ($J=5$), and the model predicts the next 20 frames ($K=20$) with no adaptation.
  • Online: The system receives consecutive 5-frame segments; after each, fine-tuning on a buffer of recent frames (e.g., 25) is permitted before the next prediction.

The protocol employs both continuous and categorical metrics:

  • B-MSE, B-MAE: Balanced mean squared/absolute error as described above.
  • Categorical skill scores: At rainfall thresholds $\tau \in \{0.5, 2, 5, 10, 30\}$ mm/h, binarization at each pixel yields counts of TP, FP, FN, and TN. Metrics include:
    • Critical Success Index (CSI): $\mathrm{CSI} = \dfrac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}+\mathrm{FP}}$
    • Heidke Skill Score (HSS): $\mathrm{HSS} = \dfrac{\mathrm{TP}\cdot\mathrm{TN} - \mathrm{FP}\cdot\mathrm{FN}}{(\mathrm{TP}+\mathrm{FN})(\mathrm{FN}+\mathrm{TN}) + (\mathrm{TP}+\mathrm{FP})(\mathrm{FP}+\mathrm{TN})}$
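
Both scores follow directly from the contingency counts; a short NumPy sketch (the function name is ours, and it assumes at least one event and one non-event so the denominators are nonzero):

```python
import numpy as np

def skill_scores(truth, pred, tau):
    """Binarize truth/prediction at rain-rate threshold tau (mm/h),
    then compute CSI and HSS from the contingency-table counts."""
    t, p = truth >= tau, pred >= tau
    tp = np.sum(t & p)    # hits
    fp = np.sum(~t & p)   # false alarms
    fn = np.sum(t & ~p)   # misses
    tn = np.sum(~t & ~p)  # correct rejections
    csi = tp / (tp + fn + fp)
    hss = (tp * tn - fp * fn) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))
    return csi, hss

# Toy example at the 2 mm/h threshold: 2 hits, 1 false alarm, 1 miss.
truth = np.array([5.0, 0.0, 5.0, 5.0])
pred = np.array([5.0, 5.0, 0.0, 5.0])
csi, hss = skill_scores(truth, pred, 2.0)
print(csi)  # 0.5
```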

The benchmark suite comprises a persistence (last-frame) baseline, optical-flow methods (ROVER with linear and nonlinear extrapolation variants), 2D and 3D CNNs, ConvGRU (with and without balanced losses), and TrajGRU.

Key empirical findings include:

  • Usage of balanced losses is essential to achieve high skill at infrequent but operationally important heavy-rain thresholds (10/30 mm/h); ConvGRU trained with standard MSE/MAE can underperform optical-flow-based baselines in these cases.
  • All deep models trained with balanced losses surpass optical-flow baselines across metrics.
  • TrajGRU attains the best trade-off between parameter efficiency and skill, with statistically significant gains over ConvGRU.
  • Online adaptation (fine-tuning) consistently enhances both CSI and HSS at all thresholds. (Shi et al., 2017)

7. Significance and Implications

TrajGRU provides an explicit mechanism for adaptively warping hidden states, thereby capturing complex, location-variant dynamics in meteorological data—capabilities inaccessible to previous location-invariant recurrent architectures. Its demonstration on the HKO-7 benchmark establishes both a state-of-the-art approach for radar-based precipitation nowcasting and a standardized evaluation pipeline, including dataset, loss formulation, and metrics, that form a foundation for subsequent research in deep learning-based spatiotemporal prediction. The experimental evidence substantiates the necessity of both location-variant model components and balanced training objectives for optimal skill in high-impact, imbalanced forecasting tasks (Shi et al., 2017).

References

Shi, X., Gao, Z., Lausen, L., Wang, H., Yeung, D.-Y., Wong, W.-K., & Woo, W.-C. (2017). Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model. Advances in Neural Information Processing Systems 30 (NIPS 2017).
