A Two-Stage Feature Selection Approach for Robust Evaluation of Treatment Effects in High-Dimensional Observational Data (2111.13800v2)

Published 27 Nov 2021 in cs.LG, cs.AI, and stat.ME

Abstract: A Randomized Control Trial (RCT) is considered the gold standard for evaluating the effect of any intervention or treatment. However, its feasibility is often hindered by ethical, economic, and legal considerations, making observational data a valuable alternative for drawing causal conclusions. Healthcare observational data, however, is high-dimensional, and this calls for careful variable selection to ensure unbiased, reliable, and robust causal inferences. To overcome this challenge, we propose a novel two-stage feature selection technique called Outcome Adaptive Elastic Net (OAENet), explicitly designed for making robust causal inference decisions using matching techniques. OAENet offers two key advantages over existing methods: superior performance on correlated and high-dimensional data, and the ability to select specific sets of variables (confounders and variables associated only with the outcome). This ensures robustness and facilitates an unbiased estimate of the causal effect. Numerical experiments on simulated data demonstrate that OAENet significantly outperforms state-of-the-art methods by producing either a higher-quality estimate or a comparable estimate in significantly less time. To illustrate the applicability of OAENet, we use large-scale US healthcare data to estimate the effect of Opioid Use Disorder (OUD) on suicidal behavior. Compared with competing methods, OAENet yields estimates that align more closely with the existing literature on the relationship between OUD and suicidal behavior. Performance on both simulated and real-world data shows that OAENet notably improves the accuracy of treatment effect estimates and of policy evaluations based on causal inference.
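
The abstract names the technique but does not spell out the two stages, so the following is only a minimal sketch of the general outcome-adaptive idea, not the authors' implementation. It assumes that stage one fits an outcome regression to obtain coefficients, and that stage two fits a propensity model with an elastic-net penalty whose L1 term is weighted by the inverse of those coefficients. The simulated data, the function names (`simulate`, `outcome_coefficients`, `adaptive_enet_propensity`), and the parameters `lam1`, `lam2`, `gamma` are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n=2000, seed=0):
    """Toy data (an assumption, not from the paper).
    col 0: confounder, col 1: outcome-only, col 2: treatment-only, cols 3-5: noise."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, 6))
    logit = X[:, 0] + X[:, 2]
    T = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
    Y = 1.0 * T + 2.0 * X[:, 0] + 2.0 * X[:, 1] + rng.standard_normal(n)
    return X, T, Y

def outcome_coefficients(X, T, Y):
    """Assumed stage 1: least-squares fit of Y on [1, T, X]; return covariate coefficients."""
    design = np.column_stack([np.ones(len(Y)), T, X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef[2:]

def adaptive_enet_propensity(X, T, beta, lam1=0.05, lam2=0.01,
                             gamma=2.0, step=1.0, n_iter=500):
    """Assumed stage 2: logistic propensity model with penalty
    lam1 * sum_j w_j * |a_j| + (lam2 / 2) * sum_j a_j**2, where w_j = 1 / |beta_j|**gamma.
    Solved by proximal gradient descent (soft-thresholding); the intercept is unpenalized."""
    n, p = X.shape
    weights = 1.0 / (np.abs(beta) ** gamma + 1e-12)
    a0, a = 0.0, np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(a0 + X @ a)))
        resid = prob - T
        a0 -= step * resid.mean()
        # Gradient step on the smooth part (logistic loss plus ridge term).
        z = a - step * (X.T @ resid / n + lam2 * a)
        # Soft-thresholding handles the weighted L1 term.
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam1 * weights, 0.0)
    return a

X, T, Y = simulate()
beta = outcome_coefficients(X, T, Y)
a = adaptive_enet_propensity(X, T, beta)
selected = np.flatnonzero(a != 0)
print("outcome coefficients:", np.round(beta, 2))
print("propensity coefficients:", np.round(a, 2))
print("selected covariate indices:", selected)
```

In this toy setup, covariates with near-zero outcome coefficients (the treatment-only variable and the noise columns) receive very large L1 weights and are dropped from the propensity model, while the confounder receives a small weight and is retained. A matching step would then use only the selected covariates.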

