- The paper introduces AirPCM, a causality-aware model that integrates multi-station spatial and temporal dependencies for robust multi-pollutant forecasting.
- It employs a four-stage pipeline incorporating spatial CNNs, graph attention, and causal temporal attention to explicitly model meteorological impacts.
- Extensive evaluations on diverse datasets demonstrate improved accuracy and resilience to sudden changes compared to existing forecasting methods.
Causality-Aware Spatiotemporal Modeling for Multi-Region, Multi-Pollutant Air Quality Forecasting
Introduction and Motivation
The paper introduces AirPCM, a deep spatiotemporal model designed to address the limitations of existing air quality forecasting approaches, which are typically constrained to single-pollutant and single-region paradigms. AirPCM is motivated by the need for scalable, interpretable, and robust forecasting across globally distributed monitoring stations, accounting for complex multi-pollutant interactions, evolving meteorological conditions, and spatial heterogeneity. The model is positioned to support both short-term and long-term forecasting, with explicit modeling of meteorology-pollutant causality, thereby enabling actionable insights for environmental governance and carbon mitigation.
Model Architecture and Methodological Innovations
AirPCM is structured as a four-stage pipeline:
- Multi-Station Spatial Correlation Modeling (MSCM): Utilizes a geospatial graph constructed from station coordinates, with local spatial dependencies captured via CNNs and global dependencies via GAT and multi-head self-attention. This enables efficient aggregation of spatial information and pollutant propagation across stations.
- Patching and Embedding (P&E): Segments historical pollutant time series into overlapping temporal patches, embedding them with positional and temporal encodings to facilitate downstream modeling of multi-scale temporal dependencies.
- Meteorology-Pollutant Temporal Causality Modeling (MPTC): Employs multi-head causal attention with lower-triangular masks to enforce temporal directionality, allowing each pollutant patch to selectively attend to relevant historical meteorological features within a causal window. This module explicitly models time-lagged meteorological effects on pollutant dynamics.
- Decoding (DECO): Fuses causally attended outputs with original patch embeddings, followed by stacked temporal self-attention blocks and pollutant-specific adapters, yielding multi-horizon forecasts for all target pollutants.
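As a rough illustration of the first stage, the sketch below builds a geospatial graph from station coordinates. The summary does not say how edges are defined, so the distance-threshold rule, the `radius_km` parameter, the function names, and the station coordinates are all assumptions made for this example and not the paper's construction.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def build_adjacency(coords: List[Tuple[float, float]], radius_km: float) -> List[List[int]]:
    """Symmetric 0/1 adjacency: stations are linked when within radius_km (assumed rule)."""
    n = len(coords)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if haversine_km(coords[i], coords[j]) <= radius_km:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency

# Hypothetical stations: two close together near Beijing, one near Shanghai.
stations = [(39.90, 116.40), (39.95, 116.45), (31.23, 121.47)]
adjacency = build_adjacency(stations, radius_km=50.0)
```

An adjacency matrix of this kind is what a graph attention layer would typically consume. A k-nearest-neighbour rule or distance-weighted edges would be equally plausible alternatives.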
Figure 1: AirPCM architecture for multi-region, multi-pollutant forecasting, integrating spatial, temporal, and causal dependencies.
The model jointly encodes cross-station spatial correlations, temporal auto-correlations, and meteorology-pollutant causality, providing a unified framework for fine-grained, multi-horizon air quality prediction.
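The patching and causal-attention stages can be sketched together in a few lines. This is a minimal, single-head illustration under stated assumptions: raw patches stand in for learned embeddings, there are no projection matrices or positional encodings, and the series values, `patch_len`, and `stride` are hypothetical. It shows only the lower-triangular masking idea, in which pollutant patch t may attend to meteorological patches at positions up to and including t.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]

def make_patches(series: List[float], patch_len: int, stride: int) -> Matrix:
    """Split a series into fixed-length patches; stride < patch_len gives overlap."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    # A trailing remainder shorter than patch_len is dropped in this sketch.
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]

def causal_cross_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention where query t only sees keys 0..t."""
    if not (len(queries) == len(keys) == len(values)):
        raise ValueError("queries, keys and values must have the same number of steps")
    dim, n = len(queries[0]), len(keys)
    outputs, weights = [], []
    for t, q in enumerate(queries):
        # Lower-triangular mask: scores are computed only for positions s <= t.
        scores = [sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(dim) for s in range(t + 1)]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (n - t - 1)  # future positions get weight 0
        weights.append(row)
        outputs.append([sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))])
    return outputs, weights

# Hypothetical hourly readings, for illustration only.
pm25 = [35.0, 40.0, 52.0, 61.0, 58.0, 47.0, 39.0, 33.0]
wind = [3.1, 2.8, 1.9, 1.2, 1.5, 2.6, 3.3, 3.8]

pollutant_patches = make_patches(pm25, patch_len=4, stride=2)
met_patches = make_patches(wind, patch_len=4, stride=2)
fused, attn = causal_cross_attention(pollutant_patches, met_patches, met_patches)
```

Each row of `attn` sums to 1 and is zero above the diagonal, so no pollutant patch draws on meteorological information from later patches. A multi-head version would repeat this with separate learned projections per head.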
Datasets and Experimental Setup
AirPCM is evaluated on four benchmark datasets spanning urban, national, and global scales. The datasets named in the results below are Beijing, KnowAir, AirPCM-h, and AirPCM-d.
The evaluation employs MAE, RMSE, and SMAPE metrics, with baselines including statistical models (HA, VAR), differential equation networks (ODE-RNN, Latent-ODE, ODE-LSTM), spatiotemporal deep learning models (DCRNN, STGCN, ASTGCN, GTS, MTSF-DG, PM2.5-GNN, Airformer), and physics-guided models (AirPhyNet, Air-DualODE).
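For reference, the three metrics can be computed as in the sketch below. MAE and RMSE follow their standard definitions. SMAPE has several variants and the summary does not state which one the paper uses, so the ratio form here (range 0 to 2, not multiplied by 100) is an assumption. The sample values are hypothetical.

```python
import math
from typing import Sequence

def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE as a ratio in [0, 2] (assumed variant)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        # When both values are zero the term is defined as 0 to avoid division by zero.
        total += 0.0 if denom == 0 else 2.0 * abs(t - p) / denom
    return total / len(y_true)

# Hypothetical observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = {
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "SMAPE": smape(observed, predicted),
}
```

RMSE penalises large misses more heavily than MAE, which is why the two can rank models differently during sudden pollution episodes.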
Empirical Results
AirPCM consistently surpasses all baselines in both regular and sudden change scenarios. On the Beijing dataset, AirPCM achieves the lowest RMSE (61.33) and SMAPE (0.73) for 3-day PM2.5 prediction, with significant improvements under sudden changes (MAE reduction of 6.9%, RMSE reduction of 6.6%, and SMAPE reduction of 12.3% over Air-DualODE). Similar gains are observed on the KnowAir dataset, with up to 12.2% improvement in SMAPE.
Figure 3: PM2.5 concentration forecasting in Beijing, demonstrating AirPCM's superior alignment with observed values.
Figure 4: AirPCM forecasting performance on AirPCM-h, showing high accuracy and close correspondence with actual pollutant concentrations.
Causality Analysis
AirPCM's MPTC module enables interpretable quantification of meteorology-pollutant causal effects. For example, temperature exhibits a pronounced positive causal effect on O3 concentrations, with city-specific time-delay patterns. Humidity and wind speed generally exert negative effects on particulate matter, with distinct temporal signatures across Beijing and New York.
Figure 5: Visualization of meteorology-pollutant temporal causality, revealing time-lagged and city-specific effect signatures.
Generalization and Robustness
AirPCM demonstrates strong generalization in cross-temporal and cross-regional transfer settings, outperforming baselines when models trained on Beijing or KnowAir are evaluated on AirPCM-h. Under sudden changes, AirPCM achieves lower prediction errors, maintaining sharper tracking of pollution transitions compared to lagging or smoothed responses from other models.
Figure 6: Generalization and robustness evaluation, highlighting AirPCM's adaptability and resilience to sudden air quality changes.
Long-Term Forecasting and Case Study
A case study using AirPCM-d reveals persistent north-south gradients in AQI across China, with northern regions experiencing more unhealthy days due to meteorological and topographical factors. Major cities show improved air quality over time, but secondary pollutants like O3 exhibit upward trends, reflecting the complex interplay of emission controls, precursor ratios, and climate effects.
Figure 7: Long-term AQI and pollutant concentration trends in China, illustrating spatial disparities and evolving pollutant profiles.
Theoretical and Practical Implications
AirPCM advances the state of air quality forecasting by integrating multi-region, multi-pollutant, and meteorological dynamics within a causality-aware deep learning framework. The explicit modeling of spatiotemporal and causal dependencies enables robust, interpretable, and transferable predictions, supporting proactive public health interventions and adaptive environmental policy. The model's ability to generalize across regions and time periods, and to handle abrupt pollution episodes, addresses critical gaps in existing approaches.
Figure 8: Overview of spatiotemporal-causal relationship modeling in AirPCM, encompassing spatial, temporal, and causal dependencies.
The findings underscore the necessity of multi-pollutant, causality-informed forecasting systems, particularly as urbanization and climate variability intensify. The observed divergence between primary and secondary pollutant trends highlights the importance of joint modeling for diagnostic and mitigation strategies.
Conclusion
AirPCM represents a significant methodological advancement in air quality forecasting, offering a unified, causality-aware spatiotemporal model capable of multi-region, multi-pollutant prediction. The empirical results demonstrate strong accuracy, generalization, and robustness, with interpretable causal insights into meteorology-driven pollution dynamics. Future research should focus on further improving the precision of abrupt episode prediction, extending the framework to additional environmental variables, and integrating domain knowledge for enhanced interpretability and policy relevance.