NorthEast Monsoon Climate Index

Updated 22 January 2026

The NorthEast Monsoon Climate Index is a regional climate indicator that quantifies boreal winter monsoon influence using normalized SST anomaly contrasts over defined oceanic regions.
It employs a Deep Q-Network to optimize the selection of SST regions, thereby enhancing the statistical association with seasonal rainfall patterns in Thailand.
Integrating the NE-Index into LSTM forecasting models reduces 12-month RMSE in most station clusters, indicating improved long-range rainfall prediction.

The NorthEast Monsoon Climate Index (NE-Index) is a regional-scale climate indicator developed to quantify the influence of the boreal winter (Northeast) monsoon on seasonal rainfall in Thailand, explicitly targeting improved predictability at local to subregional scales. Constructed from optimally selected sea surface temperature (SST) anomaly contrasts over the South China Sea and adjacent oceans, the NE-Index leverages reinforcement learning via a Deep Q-Network (DQN) to maximize statistical association with regional monsoonal rainfall. Incorporation of the NE-Index as a predictor in sequence-to-sequence Long Short-Term Memory (LSTM) models yields systematic improvements in twelve-month rainfall forecast skill for clusters of rain gauge stations across Thailand, particularly in regions dominated by Northeast monsoon influence (Chobtham, 15 Jan 2026).

1. Formal Definition and Computation

The NE-Index is defined as a normalized difference between mean SST anomalies over two optimally chosen rectangular oceanic regions, denoted A and B.

For grid point $(i,j)$ and time $t$, let $SST_{i,j}(t)$ denote the observed SST and $\overline{SST}_{i,j}$ the multi-decadal climatological mean (here, 1982–2024).
The SST anomaly is $SST'_{i,j}(t) = SST_{i,j}(t) - \overline{SST}_{i,j}$.
For a rectangle $R$ (A or B) containing $|R|$ grid cells, the regional mean anomaly is:

$SST_R(t) = \frac{1}{|R|} \sum_{(i,j)\in R} SST'_{i,j}(t)$

The candidate NE-Index is:

$Z(t) = \mathrm{normalize}[SST_B(t) - SST_A(t)]$

where normalization is by removal of the time mean and division by its standard deviation, so $Z$ has zero mean and unit variance over the analysis period (Chobtham, 15 Jan 2026).
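The definition above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the author's implementation: the array layout `(time, lat, lon)`, the function names, and the boolean masks standing in for rectangles A and B are all assumptions made for the example. The climatology is a single time mean per grid cell, following the formula as written.

```python
import numpy as np

def regional_mean(anom, mask):
    """Mean anomaly over cells where mask is True; anom has shape (time, lat, lon)."""
    return anom[:, mask].mean(axis=1)

def ne_index(sst, mask_a, mask_b):
    """Candidate index Z(t): normalized SST_B(t) - SST_A(t)."""
    clim = sst.mean(axis=0)          # climatological mean per grid cell
    anom = sst - clim                # SST'_{i,j}(t)
    diff = regional_mean(anom, mask_b) - regional_mean(anom, mask_a)
    return (diff - diff.mean()) / diff.std()   # zero mean, unit variance

# Synthetic monthly SST field (hypothetical shape and values).
rng = np.random.default_rng(0)
sst = 28.0 + rng.normal(size=(120, 10, 12))

mask_a = np.zeros((10, 12), dtype=bool)
mask_a[5:9, 6:11] = True             # stand-in for rectangle A
mask_b = np.zeros((10, 12), dtype=bool)
mask_b[1:3, 1:4] = True              # stand-in for rectangle B

z = ne_index(sst, mask_a, mask_b)
```

By construction `z` has zero mean and unit variance over the analysis period, matching the normalization described above.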

2. Reinforcement Learning Optimization via Deep Q-Network

The selection of regions A and B proceeds via a discrete-space optimization, driven by the reward of strong correlation between $Z(t)$ and seasonal rainfall aggregates. The DQN framework:

State $s$ : Encodes the latitude-longitude bounds of A and B.
Action $a$ : In shift-only configuration, eight actions permit shifting A or B by $\pm0.5^\circ$ latitude/longitude. Shift-and-resize mode extends to 16 actions with boundary expansions/contractions, constrained to non-zero dimensions.
Reward $Q(A,B)$: the sum of two seasonal correlations, built from:
- $Y_{NE\,onset}(t)$: average rainfall at southern Thailand stations (Oct–Mar).
- $Y_{NE\,retreat}(t)$: average rainfall at upper Thailand stations (Apr–Sep).
- $R_\text{onset} = \mathrm{corr}(Z, Y_{NE\,onset})$, computed over Oct–Mar, and $R_\text{retreat} = \mathrm{corr}(Z, Y_{NE\,retreat})$, computed over Apr–Sep.

$Q(A,B) = R_\text{onset} + R_\text{retreat}$
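The season-aware reward can be sketched as follows. The sketch assumes monthly series aligned with a `months` array of calendar months (1–12) and restricts each correlation to the months of its season; the function names and the synthetic data are hypothetical and only illustrate the arithmetic of $Q(A,B)$.

```python
import numpy as np

ONSET_MONTHS = (10, 11, 12, 1, 2, 3)   # Oct-Mar
RETREAT_MONTHS = (4, 5, 6, 7, 8, 9)    # Apr-Sep

def seasonal_corr(z, y, months, season):
    """Pearson correlation of z and y restricted to the given calendar months."""
    sel = np.isin(months, season)
    return np.corrcoef(z[sel], y[sel])[0, 1]

def reward(z, y_onset, y_retreat, months):
    """Q(A,B) = R_onset + R_retreat for the index z built from rectangles A and B."""
    r_onset = seasonal_corr(z, y_onset, months, ONSET_MONTHS)
    r_retreat = seasonal_corr(z, y_retreat, months, RETREAT_MONTHS)
    return r_onset + r_retreat

# Synthetic example: ten years of monthly data.
months = np.tile(np.arange(1, 13), 10)
rng = np.random.default_rng(1)
z = rng.normal(size=months.size)
y_onset = 2.0 * z + 5.0                      # perfectly correlated with z
y_retreat = rng.normal(size=months.size)     # unrelated to z
q = reward(z, y_onset, y_retreat, months)
```

Because each term is a correlation, $Q$ is bounded between $-2$ and $2$.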

DQN Details: The Deep Q-Network uses a feedforward architecture (two hidden layers with ReLU activations and one output per action, i.e. output size $|a|$), a discount factor $\gamma=0.99$, and $\epsilon$-greedy exploration decayed from 1.0 to 0.1, over 100,000 training steps. Convergence is monitored via a plateau in the running average reward (Chobtham, 15 Jan 2026).
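To make the action space and exploration schedule concrete, here is a small sketch of the shift-only configuration. Several details are assumptions not stated in the source: the state layout (eight bounds in a flat vector), the ordering of the eight actions, the linear shape of the $\epsilon$ decay, and the reading that the decay spans the full 100,000 steps. The Q-network itself is omitted; `q_values` stands in for its output.

```python
import numpy as np

STEP = 0.5  # degrees per shift, as described above

# Assumed state layout:
# [A_lat_min, A_lat_max, A_lon_min, A_lon_max, B_lat_min, B_lat_max, B_lon_min, B_lon_max]
def shift_actions():
    """Build the 8 shift-only actions as additive deltas on the state vector."""
    actions = []
    for rect in (0, 1):              # 0 = rectangle A, 1 = rectangle B
        for axis in (0, 1):          # 0 = latitude, 1 = longitude
            for sign in (1.0, -1.0):
                delta = np.zeros(8)
                start = 4 * rect + 2 * axis
                delta[start:start + 2] = sign * STEP   # move both bounds together
                actions.append(delta)
    return np.stack(actions)

def epsilon(step, eps_start=1.0, eps_end=0.1, decay_steps=100_000):
    """Linear decay (assumed form) from eps_start to eps_end."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def select_action(q_values, step, rng):
    """Epsilon-greedy choice over the network's Q-value estimates."""
    if rng.random() < epsilon(step):
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

ACTIONS = shift_actions()
state = np.array([10.0, 16.25, 110.0, 118.75, 3.75, 6.25, 103.75, 106.25])
next_state = state + ACTIONS[0]      # shift A northward by 0.5 degrees
```

A full environment would also clip or reject moves that leave the search domain; that check is left out here for brevity.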

3. Spatial Domain and Rectangle Search

The optimization is performed over SST fields gridded at $0.5^\circ \times 0.5^\circ$ , bounded by $2^\circ$ S– $25^\circ$ N and $100^\circ$ E– $130^\circ$ E. Baseline rectangles (from literature) are:

Area A: $10$– $16.25^\circ$ N, $110$– $118.75^\circ$ E
Area B: $3.75$– $6.25^\circ$ N, $103.75$– $106.25^\circ$ E

The DQN finds rectangle adjustments that raise the season-aware objective $Q(A,B)$ from the baseline value $Q\approx0.052$ to $Q\approx0.497$ in the shift-only configuration and to $Q\approx0.412$ with shift-and-resize. Both are roughly eight to ten times the baseline, indicating a much stronger statistical association from data-driven region selection; the larger shift-and-resize action space nonetheless attains a lower best value than shift-only (Chobtham, 15 Jan 2026).

Configuration	Best Q Value	Baseline Q Value
Shift-only	0.497	0.052
Shift-and-resize	0.412	0.052
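The search domain and baseline rectangles of this section can be represented as boolean masks on the $0.5^\circ$ grid. The sketch below assumes grid values sit on the half-degree nodes and that a rectangle includes every node within its bounds (inclusive); the source does not say whether coordinates denote cell centers or edges, so the resulting cell counts are illustrative only.

```python
import numpy as np

# Grid nodes over the search domain: 2S-25N, 100E-130E at 0.5 degree spacing.
lats = np.arange(-2.0, 25.0 + 0.25, 0.5)
lons = np.arange(100.0, 130.0 + 0.25, 0.5)

def rect_mask(lat_min, lat_max, lon_min, lon_max):
    """Boolean (lat, lon) mask of grid nodes inside the rectangle, bounds inclusive."""
    lat_in = (lats >= lat_min) & (lats <= lat_max)
    lon_in = (lons >= lon_min) & (lons <= lon_max)
    return lat_in[:, None] & lon_in[None, :]

# Baseline rectangles as listed above.
mask_a = rect_mask(10.0, 16.25, 110.0, 118.75)
mask_b = rect_mask(3.75, 6.25, 103.75, 106.25)

print(mask_a.shape, int(mask_a.sum()), int(mask_b.sum()))
```

Masks of this form can be passed directly to a regional-mean computation over an SST anomaly array with matching latitude and longitude axes.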

4. Clustering of Rainfall Stations and Correlation Analysis

Rainfall station data from 384 sites undergo hierarchical agglomerative clustering to identify coherent hydroclimate regimes. Features (12-month mean climatology, latitude, and longitude) are normalized and reduced via PCA before Euclidean distance-based clustering. The approach yields 12 clusters: 1–4 (southern Thailand, strongly NE-monsoon-influenced), 5–12 (upper Thailand) (Chobtham, 15 Jan 2026).
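The clustering pipeline can be sketched with NumPy and SciPy. The station data below are synthetic, and several choices are assumptions not given in the source: the number of retained principal components (5), the Ward linkage criterion, and cutting the tree with `maxclust` to obtain at most 12 groups.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_stations = 384

# Synthetic features: 12 monthly climatological means plus latitude and longitude.
monthly_clim = rng.gamma(shape=2.0, scale=60.0, size=(n_stations, 12))
lat = rng.uniform(5.5, 20.5, size=(n_stations, 1))
lon = rng.uniform(97.5, 105.5, size=(n_stations, 1))
features = np.hstack([monthly_clim, lat, lon])

# Normalize each feature to zero mean and unit variance.
x = (features - features.mean(axis=0)) / features.std(axis=0)

# PCA via SVD of the centered data; keep the leading components (count assumed).
n_components = 5
u, s, vt = np.linalg.svd(x, full_matrices=False)
scores = u[:, :n_components] * s[:n_components]

# Agglomerative clustering on Euclidean distances (Ward linkage is an assumption).
tree = linkage(scores, method="ward", metric="euclidean")
labels = fcluster(tree, t=12, criterion="maxclust")   # labels in 1..12
```

With real station data, the cluster labels would then be inspected geographically to separate southern from upper Thailand groups.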

Pearson correlation between NE-Index $Z(t)$ and cluster-averaged rainfall $Y(t)$ quantifies the index’s explanatory power:

$\mathrm{corr}(Z,Y) = \frac{\sum_t [Z(t)-\overline{Z}][Y(t)-\overline{Y}]}{\sqrt{\sum_t [Z(t)-\overline{Z}]^2 \sum_t [Y(t)-\overline{Y}]^2}}$

Correlation magnitudes $|\mathrm{corr}(Z,Y)|$ of up to 0.75 are observed between the NE-Index and rainfall in specific clusters, with squared correlations ($R^2$, where $R$ here denotes a correlation coefficient rather than a rectangle) directly influencing DQN optimization. This suggests physically interpretable teleconnections at the optimized spatial scales (Chobtham, 15 Jan 2026).
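The Pearson formula above translates term by term into NumPy. The sketch below applies it to each column of a hypothetical `cluster_rain` array of shape `(time, clusters)`; the synthetic data and function names are assumptions for illustration.

```python
import numpy as np

def pearson(z, y):
    """Pearson correlation, written to mirror the formula term by term."""
    dz = z - z.mean()
    dy = y - y.mean()
    return float((dz * dy).sum() / np.sqrt((dz ** 2).sum() * (dy ** 2).sum()))

def cluster_correlations(z, cluster_rain):
    """corr(Z, Y_k) for each cluster column k of cluster_rain (time, clusters)."""
    return np.array([pearson(z, cluster_rain[:, k])
                     for k in range(cluster_rain.shape[1])])

# Synthetic example: 12 clusters with varying sensitivity to the index.
rng = np.random.default_rng(2)
z = rng.normal(size=240)
weights = np.linspace(-1.0, 1.0, 12)
cluster_rain = z[:, None] * weights[None, :] + rng.normal(size=(240, 12))

r = cluster_correlations(z, cluster_rain)
r_squared = r ** 2
```

The result agrees with `np.corrcoef(z, y)[0, 1]` up to floating-point error, which is a convenient cross-check.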

5. Downstream Integration in LSTM Rainfall Forecasting

The NE-Index is incorporated alongside traditional large-scale climate indices (including DMI, MEI, PDO, MJO, BSISO, South-West Monsoon Index, and ONI, each with $|\mathrm{corr}| > 0.6$ against the target rainfall) as input to sequence-to-sequence LSTM models. Each model uses a 24-month input window and forecasts the following 12 months of rainfall per cluster. Hyperparameters are swept over a grid: hidden size $\in \{16,32,64\}$, LSTM depth $\in \{1,2,3\}$, and dropout $\in \{0,0.2,0.5\}$, with the learning rate fixed at $0.01$. Training proceeds for up to 200 epochs with early stopping, and models are selected by minimum root mean squared error (RMSE) over held-out years (Chobtham, 15 Jan 2026).
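The windowing and hyperparameter grid described above can be sketched without committing to a deep learning framework. The data are synthetic, and the predictor count of 8 (seven listed indices plus the NE-Index), the series length, and the names `make_windows` and `GRID` are assumptions; whether past rainfall is also fed as an input is not stated in the source.

```python
import itertools
import numpy as np

def make_windows(features, target, n_in=24, n_out=12):
    """Pair each 24-month predictor window with the following 12 months of rainfall."""
    xs, ys = [], []
    for start in range(len(target) - n_in - n_out + 1):
        xs.append(features[start:start + n_in])
        ys.append(target[start + n_in:start + n_in + n_out])
    return np.stack(xs), np.stack(ys)

# Hyperparameter grid as listed above; the learning rate is a single fixed value.
GRID = {
    "hidden_size": [16, 32, 64],
    "num_layers": [1, 2, 3],
    "dropout": [0.0, 0.2, 0.5],
    "learning_rate": [0.01],
}
configs = [dict(zip(GRID, values)) for values in itertools.product(*GRID.values())]

# Synthetic monthly data for one cluster (hypothetical length and predictor count).
rng = np.random.default_rng(3)
n_months, n_predictors = 120, 8
features = rng.normal(size=(n_months, n_predictors))
target = rng.gamma(2.0, 60.0, size=n_months)

x, y = make_windows(features, target)
print(x.shape, y.shape, len(configs))
```

Each of the 27 configurations would then be trained on `x`, `y` with early stopping and compared by RMSE on held-out years.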

6. Forecast Accuracy and Validation Metrics

Adding the NE-Index to the LSTM input set reduces 12-month RMSE in most clusters. Notable results (mean of two test folds):

Cluster	RMSE without NE (mm)	RMSE with NE (mm)
1	99.79	94.54
2	82.61	77.05
4	130.02	121.48
6	57.27	53.94
9	56.11	55.44
12	59.24	52.38

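For the clusters listed, the relative RMSE reduction is roughly 5–7% in southern clusters 1, 2, and 4 and ranges from about 1% (cluster 9) to about 12% (cluster 12) in upper Thailand. The source describes the gains as particularly pronounced in southern clusters (1–4) and positive across most upper Thailand clusters, supporting the added predictive value of the optimized NE-Index beyond standard global indices (Chobtham, 15 Jan 2026).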
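The relative reductions implied by the table can be computed directly from the reported values. The sketch below uses only the six rows shown; the dictionary name and helper function are illustrative.

```python
# RMSE (mm) per cluster: (without NE-Index, with NE-Index), from the table above.
rmse = {
    1: (99.79, 94.54),
    2: (82.61, 77.05),
    4: (130.02, 121.48),
    6: (57.27, 53.94),
    9: (56.11, 55.44),
    12: (59.24, 52.38),
}

def relative_reduction(without_ne, with_ne):
    """Percentage RMSE reduction relative to the model without the NE-Index."""
    return 100.0 * (without_ne - with_ne) / without_ne

reductions = {cluster: relative_reduction(*pair) for cluster, pair in rmse.items()}

for cluster, pct in reductions.items():
    print(f"cluster {cluster}: {pct:.2f}% lower RMSE")
```

Expressing the gains as percentages makes clusters with very different rainfall magnitudes comparable.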

7. Summary and Significance

The development of the NE-Index (Chobtham, 15 Jan 2026) combines reinforcement learning and climate science for regional-scale hydroclimate index discovery. By automatically selecting SST regions tailored to local seasonal rainfall variability, the method captures relationships that fixed a priori regions and global-only indices may miss. The reduction in forecast RMSE in most clusters upon NE-Index inclusion supports its practical benefit for operational long-range rainfall prediction in Thailand, and the approach offers a transferable methodology for monsoon regime characterization elsewhere.

References (1)

Reinforcement Learning to Discover a NorthEast Monsoon Index for Monthly Rainfall Prediction in Thailand (2026)
