A Causality-Aware Spatiotemporal Model for Multi-Region and Multi-Pollutant Air Quality Forecasting (2509.21260v1)

Published 25 Sep 2025 in cs.LG and cs.AI

Abstract: Air pollution, a pressing global problem, threatens public health, environmental sustainability, and climate stability. Achieving accurate and scalable forecasting across spatially distributed monitoring stations is challenging due to intricate multi-pollutant interactions, evolving meteorological conditions, and region-specific spatial heterogeneity. To address this challenge, we propose AirPCM, a novel deep spatiotemporal forecasting model that integrates multi-region, multi-pollutant dynamics with explicit meteorology-pollutant causality modeling. Unlike existing methods limited to single pollutants or localized regions, AirPCM employs a unified architecture to jointly capture cross-station spatial correlations, temporal auto-correlations, and meteorology-pollutant dynamic causality. This empowers fine-grained, interpretable multi-pollutant forecasting across varying geographic and temporal scales, including sudden pollution episodes. Extensive evaluations on multi-scale real-world datasets demonstrate that AirPCM consistently surpasses state-of-the-art baselines in both predictive accuracy and generalization capability. Moreover, the long-term forecasting capability of AirPCM provides actionable insights into future air quality trends and potential high-risk windows, offering timely support for evidence-based environmental governance and carbon mitigation planning.

Summary

The paper introduces AirPCM, a causality-aware model that integrates multi-station spatial and temporal dependencies for robust multi-pollutant forecasting.
It employs a four-stage pipeline incorporating spatial CNNs, graph attention, and causal temporal attention to explicitly model meteorological impacts.
Extensive evaluations on diverse datasets demonstrate improved accuracy and resilience to sudden changes compared to existing forecasting methods.

Causality-Aware Spatiotemporal Modeling for Multi-Region, Multi-Pollutant Air Quality Forecasting

Introduction and Motivation

The paper introduces AirPCM, a deep spatiotemporal model designed to address the limitations of existing air quality forecasting approaches, which are typically constrained to single-pollutant and single-region paradigms. AirPCM is motivated by the need for scalable, interpretable, and robust forecasting across globally distributed monitoring stations, accounting for complex multi-pollutant interactions, evolving meteorological conditions, and spatial heterogeneity. The model is positioned to support both short-term and long-term forecasting, with explicit modeling of meteorology-pollutant causality, thereby enabling actionable insights for environmental governance and carbon mitigation.

Model Architecture and Methodological Innovations

AirPCM is structured as a four-stage pipeline:

Multi-Station Spatial Correlation Modeling (MSCM): Utilizes a geospatial graph constructed from station coordinates, with local spatial dependencies captured via CNNs and global dependencies via GAT and multi-head self-attention. This enables efficient aggregation of spatial information and pollutant propagation across stations.
Patching and Embedding (P&E): Segments historical pollutant time series into overlapping temporal patches, embedding them with positional and temporal encodings to facilitate downstream modeling of multi-scale temporal dependencies.
Meteorology-Pollutant Temporal Causality Modeling (MPTC): Employs multi-head causal attention with lower-triangular masks to enforce temporal directionality, allowing each pollutant patch to selectively attend to relevant historical meteorological features within a causal window. This module explicitly models time-lagged meteorological effects on pollutant dynamics.
Decoding (DECO): Fuses causally attended outputs with original patch embeddings, followed by stacked temporal self-attention blocks and pollutant-specific adapters, yielding multi-horizon forecasts for all target pollutants.
Figure 1: AirPCM architecture for multi-region, multi-pollutant forecasting, integrating spatial, temporal, and causal dependencies.
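
The MSCM stage starts from a geospatial graph built from station coordinates. The summary does not say how edges are defined, so the following is only a minimal sketch that assumes a simple distance threshold: two stations are linked when their great-circle (haversine) distance is below a hypothetical `radius_km`. The function names, the radius, and the coordinates are illustrative assumptions, not taken from the paper.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def build_adjacency(coords, radius_km=50.0):
    """Symmetric 0/1 adjacency matrix: stations within radius_km are linked.

    The threshold rule is an assumption made for illustration only.
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if haversine_km(*coords[i], *coords[j]) <= radius_km:
                adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical (lat, lon) pairs, not taken from any dataset in the paper.
stations = [(39.90, 116.40), (39.95, 116.45), (31.23, 121.47)]
adjacency = build_adjacency(stations, radius_km=50.0)
```

A graph attention layer would then restrict each station's attention to its neighbours in `adjacency`. A learned or distance-weighted adjacency would be an equally plausible choice here.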

The model jointly encodes cross-station spatial correlations, temporal auto-correlations, and meteorology-pollutant causality, providing a unified framework for fine-grained, multi-horizon air quality prediction.
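
The patching step and the lower-triangular causal mask described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a single head, skips the learned query/key/value projections and the positional encodings, and treats raw patches as embeddings. The function names, `patch_len`, `stride`, and the toy series are all hypothetical.

```python
import math

def make_patches(series, patch_len, stride):
    """Split a 1-D series into windows; they overlap when stride < patch_len."""
    return [series[s:s + patch_len]
            for s in range(0, len(series) - patch_len + 1, stride)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_cross_attention(queries, keys, values):
    """Scaled dot-product attention with a lower-triangular mask.

    Query patch i attends only to key/value patches j <= i, so no pollutant
    patch can look at later meteorological patches. Restricting the softmax
    to j <= i is equivalent to adding -inf above the diagonal.
    Returns (outputs, weights); weights[i][j] is 0.0 for every j > i.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        w = softmax(scores) + [0.0] * (len(keys) - i - 1)
        out = [sum(w[j] * values[j][k] for j in range(len(values)))
               for k in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Hypothetical standardized series (10 time steps each).
pollutant = [0.1, 0.3, 0.8, 1.2, 1.0, 0.6, 0.2, -0.1, -0.3, -0.4]
wind = [0.5, 0.3, -0.6, -0.9, -0.4, 0.7, 1.1, 1.3, 1.0, 0.8]

pollutant_patches = make_patches(pollutant, patch_len=4, stride=2)
wind_patches = make_patches(wind, patch_len=4, stride=2)

# Pollutant patches act as queries; meteorological patches as keys and values.
context, attn = causal_cross_attention(pollutant_patches, wind_patches, wind_patches)
```

Each row of `attn` shows how strongly one pollutant patch weights each current or earlier meteorological patch. That is the kind of quantity a lag analysis could inspect, although attention weights alone do not establish causal effects.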

Datasets and Experimental Setup

AirPCM is evaluated on four benchmark datasets spanning urban, national, and global scales:

Beijing Dataset: Hourly measurements from 35 stations in Beijing (2017–2018).
KnowAir Dataset: PM$_{2.5}$ and meteorological data from 184 Chinese cities (2015–2018).
AirPCM-d: Daily records from 156 stations across China (2015–2025).
AirPCM-h: Hourly data from 453 stations in Europe, the US, and China (2024–2025).
Figure 2: Spatial distributions of monitoring stations across urban, national, and global datasets.

The evaluation employs MAE, RMSE, and SMAPE metrics, with baselines including statistical models (HA, VAR), differential equation networks (ODE-RNN, Latent-ODE, ODE-LSTM), spatiotemporal deep learning models (DCRNN, STGCN, ASTGCN, GTS, MTSF-DG, PM$_{2.5}$-GNN, Airformer), and physics-guided models (AirPhyNet, Air-DualODE).
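
For reference, the three metrics can be computed as in the sketch below. MAE and RMSE follow their standard definitions. SMAPE has several variants in the literature, and the summary does not say which one the paper uses; the version here is one common form, returned as a fraction between 0 and 2, and the small `eps` guard is an added assumption to avoid division by zero.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def smape(y_true, y_pred, eps=1e-8):
    """Symmetric MAPE as a fraction in [0, 2] (one common variant, assumed here)."""
    return sum(2.0 * abs(t - p) / (abs(t) + abs(p) + eps)
               for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = {
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "SMAPE": smape(observed, predicted),
}
```

Because RMSE squares the residuals, it penalizes large misses more heavily than MAE does, so it is the more sensitive of the two to sudden pollution spikes.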

Empirical Results

Predictive Performance

AirPCM consistently surpasses all baselines in both regular and sudden change scenarios. On the Beijing dataset, AirPCM achieves the lowest RMSE (61.33) and SMAPE (0.73) for 3-day PM$_{2.5}$ prediction, with significant improvements under sudden changes (MAE reduction of 6.9%, RMSE reduction of 6.6%, and SMAPE reduction of 12.3% over Air-DualODE). Similar gains are observed on the KnowAir dataset, with up to 12.2% improvement in SMAPE.
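
Percentage reductions of this kind are conventionally measured relative to the baseline's error. The summary does not state the formula, so the following is an assumed convention, with $E$ standing for any of MAE, RMSE, or SMAPE:

$$
\text{Reduction}\,(\%) = \frac{E_{\text{baseline}} - E_{\text{AirPCM}}}{E_{\text{baseline}}} \times 100
$$

Under this reading, a 6.9% MAE reduction over Air-DualODE would mean that AirPCM's MAE is about 0.931 times the Air-DualODE MAE on the same test split.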

Figure 3: PM$_{2.5}$ concentration forecasting in Beijing, demonstrating AirPCM's superior alignment with observed values.

Figure 4: AirPCM forecasting performance on AirPCM-h, showing high accuracy and close correspondence with actual pollutant concentrations.

Causality Analysis

AirPCM's MPTC module enables interpretable quantification of meteorology-pollutant causal effects. For example, temperature exhibits a pronounced positive causal effect on O$_3$ concentrations, with city-specific time-delay patterns. Humidity and wind speed generally exert negative effects on particulate matter, with distinct temporal signatures across Beijing and New York.

Figure 5: Visualization of meteorology-pollutant temporal causality, revealing time-lagged and city-specific effect signatures.

Generalization and Robustness

AirPCM demonstrates strong generalization in cross-temporal and cross-regional transfer settings, outperforming baselines when models trained on Beijing or KnowAir are evaluated on AirPCM-h. Under sudden changes, AirPCM achieves lower prediction errors, maintaining sharper tracking of pollution transitions compared to lagging or smoothed responses from other models.

Figure 6: Generalization and robustness evaluation, highlighting AirPCM's adaptability and resilience to sudden air quality changes.

Long-Term Forecasting and Case Study

A case study using AirPCM-d reveals persistent north-south gradients in AQI across China, with northern regions experiencing more unhealthy days due to meteorological and topographical factors. Major cities show improved air quality over time, but secondary pollutants like O$_3$ exhibit upward trends, reflecting the complex interplay of emission controls, precursor ratios, and climate effects.

Figure 7: Long-term AQI and pollutant concentration trends in China, illustrating spatial disparities and evolving pollutant profiles.

Theoretical and Practical Implications

AirPCM advances the state of air quality forecasting by integrating multi-region, multi-pollutant, and meteorological dynamics within a causality-aware deep learning framework. The explicit modeling of spatiotemporal and causal dependencies enables robust, interpretable, and transferable predictions, supporting proactive public health interventions and adaptive environmental policy. The model's ability to generalize across regions and time periods, and to handle abrupt pollution episodes, addresses critical gaps in existing approaches.

Figure 8: Overview of spatiotemporal-causal relationship modeling in AirPCM, encompassing spatial, temporal, and causal dependencies.

The findings underscore the necessity of multi-pollutant, causality-informed forecasting systems, particularly as urbanization and climate variability intensify. The observed divergence between primary and secondary pollutant trends highlights the importance of joint modeling for diagnostic and mitigation strategies.

Conclusion

AirPCM represents a significant methodological advancement in air quality forecasting, offering a unified, causality-aware spatiotemporal model capable of multi-region, multi-pollutant prediction. The empirical results demonstrate strong accuracy, generalization, and robustness, with interpretable causal insights into meteorology-driven pollution dynamics. Future research should focus on further improving the precision of abrupt episode prediction, extending the framework to additional environmental variables, and integrating domain knowledge for enhanced interpretability and policy relevance.