Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil (2007.12261v1)

Published 21 Jul 2020 in q-bio.PE and cs.LG

Abstract: The new coronavirus disease (COVID-19) is an emerging disease that has infected millions of people since the first notification. Efficient short-term forecasting models make it possible to estimate the number of future cases, which in turn supports strategic planning in the public health system to avoid deaths. In this paper, autoregressive integrated moving average (ARIMA), cubist (CUBIST), random forest (RF), ridge regression (RIDGE), support vector regression (SVR), and stacking-ensemble learning are evaluated for forecasting COVID-19 cumulative confirmed cases one, three, and six days ahead in ten Brazilian states with a high daily incidence. In the stacking learning approach, the cubist, RF, RIDGE, and SVR models are adopted as base-learners and a Gaussian process (GP) as meta-learner. The models' effectiveness is evaluated based on the improvement index, mean absolute error, and symmetric mean absolute percentage error criteria. In most cases, SVR and stacking-ensemble learning perform better on the adopted criteria than the compared models. In general, the developed models can generate accurate forecasts, achieving errors in the ranges 0.87% - 3.51%, 1.02% - 5.63%, and 0.95% - 6.90% for one, three, and six days ahead, respectively. The ranking of models in all scenarios is SVR, stacking-ensemble learning, ARIMA, CUBIST, RIDGE, and RF. The evaluated models are recommended for forecasting and monitoring the ongoing growth of COVID-19 cases, since they can assist managers in decision-making support systems.

Short-term Forecasting of COVID-19 Cases in Brazil: An Analysis of Machine Learning Approaches

The paper "Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil" evaluates how well several statistical and machine learning models forecast COVID-19 cumulative confirmed cases in ten Brazilian states. These states were selected for their high daily incidence. The models examined are Autoregressive Integrated Moving Average (ARIMA), Cubist (CUBIST), Random Forest (RF), Ridge Regression (RIDGE), Support Vector Regression (SVR), and a Stacking-Ensemble Learning approach.

Models and Methodologies

The paper compares the individual models with a stacking-ensemble framework in which Cubist, RF, RIDGE, and SVR serve as base-learners and a Gaussian Process (GP) serves as the meta-learner. The models are evaluated using the Improvement Index (IP), Mean Absolute Error (MAE), and Symmetric Mean Absolute Percentage Error (sMAPE). Three forecasting horizons are considered: one-day-ahead (ODA), three-days-ahead (TDA), and six-days-ahead (SDA).
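
The three evaluation criteria can be illustrated with a minimal Python sketch. The exact formulas used in the paper are not reproduced in this summary, so the definitions below are assumptions: sMAPE uses the common (|actual| + |predicted|) / 2 denominator, and the improvement index is taken as the percentage reduction in an error metric relative to a compared model. The function names and sample numbers are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed variant: mean of |a - p| / ((|a| + |p|) / 2)).

    Raises ZeroDivisionError if an actual value and its prediction are both zero.
    """
    terms = [abs(a - p) / ((abs(a) + abs(p)) / 2) for a, p in zip(actual, predicted)]
    return 100 * sum(terms) / len(terms)


def improvement_index(metric_compared, metric_best):
    """Assumed definition: percentage by which the best model reduces the error metric."""
    return 100 * (metric_compared - metric_best) / metric_compared


# Made-up cumulative case counts, for illustration only.
actual = [100, 200, 300]
predicted = [110, 190, 300]

print("MAE:", mae(actual, predicted))
print("sMAPE (%):", smape(actual, predicted))
print("IP (%):", improvement_index(4.0, 3.0))
```

Because sMAPE is scale-free, it allows comparison across states with very different case counts, whereas MAE stays in units of cases.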

Key Findings

Model Performance and Ranking:
- In most cases, the SVR and Stacking-Ensemble models outperformed the other models on the adopted criteria. Across all scenarios, the overall ranking is SVR, Stacking-Ensemble, ARIMA, CUBIST, RIDGE, and RF.
- The study reports errors ranging from 0.87% to 3.51% for one-day-ahead forecasts, 1.02% to 5.63% for three-days-ahead forecasts, and 0.95% to 6.90% for six-days-ahead forecasts.
ARIMA Model:
- While ARIMA showed effectiveness in capturing very short-term trends (notably for ODA scenarios), its performance declined for longer forecasting horizons, suggesting a limitation in handling the data's inherent non-linearities for prolonged periods.
Heterogeneous Model Usage:
- The implementation of a stacking-ensemble learning model suggests that the combination of various algorithms can better adapt to and capture the underlying complexities and variabilities within epidemiological time-series data.
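
The stacking idea described above can be sketched with scikit-learn. This is an illustrative sketch under stated assumptions, not the authors' implementation: Cubist is omitted because scikit-learn does not provide it, the lag count, chronological split, kernels, and hyperparameters are arbitrary choices, the helper names (make_supervised, fit_stack, predict_stack) are hypothetical, and the series is a synthetic logistic curve rather than real case data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import Ridge
from sklearn.svm import SVR


def make_supervised(series, n_lags, horizon):
    """Build (X, y): the last n_lags values predict the value `horizon` steps ahead."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)


def fit_stack(X, y, split=0.7):
    """Fit base-learners on the earlier rows and a GP meta-learner on their later predictions."""
    cut = int(len(X) * split)
    base = [
        Ridge(alpha=1.0),
        RandomForestRegressor(n_estimators=50, random_state=0),
        SVR(kernel="rbf", C=10.0, epsilon=0.01),
    ]
    for model in base:
        model.fit(X[:cut], y[:cut])
    # Meta-features: one column of held-out predictions per base-learner.
    Z = np.column_stack([model.predict(X[cut:]) for model in base])
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
    meta = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
    meta.fit(Z, y[cut:])
    return base, meta


def predict_stack(base, meta, X):
    """Combine base-learner predictions through the fitted meta-learner."""
    Z = np.column_stack([model.predict(X) for model in base])
    return meta.predict(Z)


# Synthetic cumulative curve (logistic growth), NOT real COVID-19 data.
t = np.arange(120)
cases = 50000.0 / (1.0 + np.exp(-(t - 60) / 12.0))
scaled = cases / cases.max()  # rescale so SVR and GP work on values in [0, 1]

# horizon=3 corresponds to the three-days-ahead (TDA) setting.
X, y = make_supervised(scaled, n_lags=5, horizon=3)
base, meta = fit_stack(X[:-10], y[:-10])
pred = predict_stack(base, meta, X[-10:]) * cases.max()
print(pred.round(0))
```

The chronological split keeps the meta-learner from being trained on predictions for data the base-learners have already seen. Changing `horizon` to 1 or 6 gives the one-day-ahead and six-days-ahead settings.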

Implications and Future Directions

This paper provides valuable implications for developing strategic public health responses in pandemic situations. The insights from employing diverse machine learning models to predict COVID-19 cases can aid health authorities in resource allocation and decision-making processes necessary to mitigate pandemic impacts.

Practical Implications:
- The successful application of these models can help authorities anticipate and prepare for changes in case numbers, allowing timely interventions and adjustments in public health strategies.
Future Research:
- Proposed future research directions include the incorporation of deep learning techniques combined with ensemble models, further exploration into copula functions for data augmentation, and the use of multi-objective optimization in hyperparameter tuning.
- Expanding the forecasting framework to include additional features may also enhance model performance and offer richer insights into pandemic dynamics.

The paper effectively underscores the importance of employing sophisticated analytics tools in epidemiological forecasting, not just for COVID-19 but potentially for other infectious diseases, emphasizing a more robust and adaptive approach to public health planning and pandemic preparedness.

Authors (4)

Matheus Henrique Dal Molin Ribeiro
Ramon Gomes da Silva
Viviana Cocco Mariani
Leandro dos Santos Coelho

Citations (403)
