Quantifying disparities in intimate partner violence: a machine learning method to correct for underreporting

Published 8 Oct 2021 in cs.CY and cs.LG | (2110.04133v4)

Abstract: Estimating the prevalence of a medical condition, or the proportion of the population in which it occurs, is a fundamental problem in healthcare and public health. Accurate estimates of the relative prevalence across groups -- capturing, for example, that a condition affects women more frequently than men -- facilitate effective and equitable health policy which prioritizes groups who are disproportionately affected by a condition. However, it is difficult to estimate relative prevalence when a medical condition is underreported. In this work, we provide a method for accurately estimating the relative prevalence of underreported medical conditions, building upon the positive unlabeled learning framework. We show that under the commonly made covariate shift assumption -- i.e., that the probability of having a disease conditional on symptoms remains constant across groups -- we can recover the relative prevalence, even without restrictive assumptions commonly made in positive unlabeled learning and even if it is impossible to recover the absolute prevalence. We conduct experiments on synthetic and real health data which demonstrate our method's ability to recover the relative prevalence more accurately than do baselines, and demonstrate the method's robustness to plausible violations of the covariate shift assumption. We conclude by illustrating the applicability of our method to case studies of intimate partner violence and hate speech.
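The claim that relative prevalence can be recovered even when absolute prevalence cannot may be easier to see with a small numerical sketch. The code below is not the paper's method; it only illustrates one reading of the abstract. It assumes that p(y=1 | x) is the same in both groups (the covariate shift assumption) and that an estimator can learn it only up to an unknown constant factor `c`. The function `p_y_given_x`, the two synthetic groups, and the value of `c` are all hypothetical. Under these assumptions, each group's prevalence is the average of p(y=1 | x) over that group's covariates, so the unknown factor cancels in the ratio.

```python
import math
import random


def p_y_given_x(x):
    """Hypothetical p(y=1 | x), shared by both groups (covariate shift assumption)."""
    return 1.0 / (1.0 + math.exp(-(x - 1.0)))


def mean_risk(xs, scale=1.0):
    """Average of scale * p(y=1 | x) over a group's covariates."""
    return sum(scale * p_y_given_x(x) for x in xs) / len(xs)


# Two synthetic groups whose covariate distributions differ (assumed values).
rng = random.Random(0)
xs_a = [rng.gauss(0.5, 1.0) for _ in range(10_000)]
xs_b = [rng.gauss(-0.5, 1.0) for _ in range(10_000)]

# True relative prevalence: ratio of the groups' average risks.
true_ratio = mean_risk(xs_a) / mean_risk(xs_b)

# Suppose the risk is only known up to an unknown constant c, so the
# absolute prevalence of either group cannot be pinned down.
c = 0.3
scaled_ratio = mean_risk(xs_a, scale=c) / mean_risk(xs_b, scale=c)

# The constant cancels, so the two ratios agree up to floating-point error.
print(true_ratio, scaled_ratio)
```

If the scaling factor differed between groups, the cancellation would no longer hold and the ratio would be off by the ratio of the two factors.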
