DemOpts: Fairness corrections in COVID-19 case prediction models (2405.09483v2)
Abstract: COVID-19 forecasting models have been used to inform decisions about resource allocation and interventions, e.g., hospital beds or stay-at-home orders. State-of-the-art deep learning models often use multimodal data such as mobility or socio-demographic data to enhance COVID-19 case prediction models. Nevertheless, related work has revealed under-reporting bias in COVID-19 cases as well as sampling bias in mobility data for certain minority racial and ethnic groups, which could in turn affect the fairness of the COVID-19 predictions along race labels. In this paper, we show that state-of-the-art deep learning models output mean prediction errors that are significantly different across racial and ethnic groups, which could in turn support unfair policy decisions. We also propose a novel de-biasing method, DemOpts, to increase the fairness of deep learning based forecasting models trained on potentially biased datasets. Our results show that DemOpts can achieve better error parity than other state-of-the-art de-biasing approaches, thus effectively reducing the differences in the mean error distributions across more racial and ethnic groups.
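As a rough illustration of what "error parity" across groups means, the sketch below computes the mean absolute prediction error per group and the largest gap between any two groups. It is not the DemOpts method and not the paper's evaluation procedure, which tests whether error distributions differ significantly. The function names, the group labels "A", "B", "C", and the toy numbers are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def mean_error_by_group(y_true, y_pred, groups):
    """Return {group: mean absolute error} over the samples in that group."""
    errors = defaultdict(list)
    for actual, predicted, group in zip(y_true, y_pred, groups):
        errors[group].append(abs(actual - predicted))
    return {group: mean(errs) for group, errs in errors.items()}


def max_parity_gap(group_errors):
    """Largest absolute difference in mean error between any two groups."""
    if len(group_errors) < 2:
        return 0.0
    return max(abs(a - b) for a, b in combinations(group_errors.values(), 2))


# Hypothetical toy data: case counts for six counties, each tagged with an
# illustrative group label (assumed for this sketch, not data from the paper).
y_true = [100, 120, 80, 60, 200, 150]
y_pred = [90, 125, 70, 90, 190, 155]
groups = ["A", "A", "B", "B", "C", "C"]

group_errors = mean_error_by_group(y_true, y_pred, groups)
gap = max_parity_gap(group_errors)
print(group_errors, gap)
```

A gap near zero would suggest similar mean errors across groups. A large gap only flags a possible disparity, and a statistical test on the per-group error distributions would be needed before calling the difference significant.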
Authors: Naman Awasthi, Saad Abrar, Daniel Smolyak, Vanessa Frias-Martinez