Machine Learning Who to Nudge: Causal vs Predictive Targeting in a Field Experiment on Student Financial Aid Renewal (2310.08672v2)
Abstract: In many settings, interventions may be more effective for some individuals than others, so that targeting interventions may be beneficial. We analyze the value of targeting in the context of a large-scale field experiment with over 53,000 college students, where the goal was to use "nudges" to encourage students to renew their financial-aid applications before a non-binding deadline. We begin with baseline approaches to targeting. First, we target based on a causal forest that estimates heterogeneous treatment effects and then assigns students to treatment according to those estimated to have the highest treatment effects. Next, we evaluate two alternative targeting policies, one targeting students with low predicted probability of renewing financial aid in the absence of the treatment, the other targeting those with high probability. The predicted baseline outcome is not the ideal criterion for targeting, nor is it a priori clear whether to prioritize low, high, or intermediate predicted probability. Nonetheless, targeting on low baseline outcomes is common in practice, for example because the relationship between individual characteristics and treatment effects is often difficult or impossible to estimate with historical data. We propose hybrid approaches that incorporate the strengths of both predictive approaches (accurate estimation) and causal approaches (correct criterion); we show that targeting intermediate baseline outcomes is most effective in our specific application, while targeting based on low baseline outcomes is detrimental. In one year of the experiment, nudging all students improved early filing by an average of 6.4 percentage points over a baseline average of 37% filing, and we estimate that targeting half of the students using our preferred policy attains around 75% of this benefit.
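The comparison the abstract describes, ranking students by a score and treating only the top half, can be sketched on synthetic data. This is a minimal illustration, not the authors' method or data: the population, the inverse-U effect function `tau`, the covariate `x`, and the helper `share_of_full_benefit` are all hypothetical, and the true effect stands in for what a causal forest would have to estimate. The last lines work through the abstract's own arithmetic: 75% of a 6.4 percentage point effect is roughly 4.8 percentage points.

```python
import numpy as np


def share_of_full_benefit(scores, tau, fraction=0.5):
    """Treat the top `fraction` of units by `scores`; return the share of the
    treat-everyone benefit (the sum of `tau`) that this policy captures."""
    k = int(round(fraction * len(tau)))
    treated = np.argsort(-scores)[:k]  # indices of the k highest scores
    return tau[treated].sum() / tau.sum()


rng = np.random.default_rng(0)
n = 50_000

# ASSUMPTION: a synthetic population, not the experiment's data.
p0 = rng.beta(2.0, 3.0, size=n)    # baseline renewal probability without a nudge
x = rng.uniform(0.5, 1.5, size=n)  # extra covariate that also scales the effect
tau = 0.25 * p0 * (1.0 - p0) * x   # ASSUMPTION: inverse-U treatment effect in p0

# Each policy is a score; the top half by score gets the nudge.
policies = {
    "causal (oracle)": tau,  # true effect, standing in for an estimated one
    "low baseline": -p0,
    "high baseline": p0,
    "intermediate baseline": -np.abs(p0 - 0.5),
}
shares = {name: share_of_full_benefit(s, tau) for name, s in policies.items()}
for name, share in shares.items():
    print(f"{name:>22}: {share:.2f} of the treat-everyone benefit")

# Arithmetic from the abstract: about 75% of the 6.4 pp full-treatment effect.
approx_gain_pp = round(0.75 * 6.4, 1)
print(f"Targeting half at 75% of 6.4 pp is about {approx_gain_pp} pp")
```

Because `tau` is inverse-U in `p0` by construction, the ordering of the policies here is built in and is not evidence about the experiment. The sketch shows only the mechanics of scoring, ranking, and comparing against the treat-everyone benefit.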