
Positivity-free Policy Learning with Observational Data (2310.06969v1)

Published 10 Oct 2023 in stat.ME, cs.LG, and stat.ML

Abstract: Policy learning utilizing observational data is pivotal across various domains, with the objective of learning the optimal treatment assignment policy while adhering to specific constraints such as fairness, budget, and simplicity. This study introduces a novel positivity-free (stochastic) policy learning framework designed to address the challenges posed by the impracticality of the positivity assumption in real-world scenarios. This framework leverages incremental propensity score policies to adjust propensity score values instead of assigning fixed values to treatments. We characterize these incremental propensity score policies and establish identification conditions, employing semiparametric efficiency theory to propose efficient estimators capable of achieving rapid convergence rates, even when integrated with advanced machine learning algorithms. This paper provides a thorough exploration of the theoretical guarantees associated with policy learning and validates the proposed framework's finite-sample performance through comprehensive numerical experiments, ensuring the identification of causal effects from observational data is both robust and reliable.
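As a rough illustration of what "adjusting propensity score values" can mean, the sketch below uses the odds-multiplying form of an incremental propensity score intervention from Kennedy (2019), where the shifted score is q(x) = δπ(x) / (δπ(x) + 1 − π(x)). This is an assumption for illustration only: the abstract does not state the exact policy class the paper uses, and the function name `incremental_propensity` and the parameter `delta` are hypothetical.

```python
import numpy as np


def incremental_propensity(pi, delta):
    """Multiply the odds of treatment by delta and map back to a probability.

    Illustrative form only (odds-multiplying shift); not necessarily the
    policy class defined in the paper.
    """
    pi = np.asarray(pi, dtype=float)
    return delta * pi / (delta * pi + 1.0 - pi)


# Observed propensity scores, including values at the boundary (0 and 1),
# which is exactly where the positivity assumption fails.
pi = np.array([0.0, 0.2, 0.5, 1.0])
shifted = incremental_propensity(pi, delta=2.0)
print(shifted)
```

Under this form, units with a propensity score of exactly 0 or 1 keep that score after the shift, so the shifted policy never asks for a treatment that is never observed for such units. That is the intuition behind not requiring positivity.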
