Demistifying Inference after Adaptive Experiments (2405.01281v1)

Published 2 May 2024 in stat.ME, econ.EM, math.ST, stat.ML, and stat.TH

Abstract: Adaptive experiments such as multi-arm bandits adapt the treatment-allocation policy and/or the decision to stop the experiment to the data observed so far. This has the potential to improve outcomes for study participants within the experiment, to improve the chance of identifying best treatments after the experiment, and to avoid wasting data. When such a procedure is viewed as an experiment (rather than just a continually optimizing system), it is still desirable to draw statistical inferences with frequentist guarantees. The concentration inequalities and union bounds that generally underlie adaptive experimentation algorithms can yield overly conservative inferences, but at the same time the asymptotic normality we would usually appeal to in non-adaptive settings can be imperiled by adaptivity. In this article we aim to explain why, how, and when adaptivity is in fact an issue for inference and, when it is, to understand the various ways to fix it: reweighting to stabilize variances and recover asymptotic normality, always-valid inference based on joint normality of an asymptotic limiting sequence, and characterizing and inverting the non-normal distributions induced by adaptivity.
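
To make the abstract's concern concrete, the following is a minimal simulation sketch and is not taken from the paper. It assumes a two-arm epsilon-greedy bandit with standard normal rewards and identical true means of zero; the function name `simulate_bias` and all parameter values are hypothetical choices made for illustration. It compares the naive per-arm sample mean with a plain inverse-propensity-weighted (IPW) estimate that uses the known assignment probabilities. Plain IPW is a simpler stand-in here: it only illustrates how known propensities restore unbiasedness, and it is not the variance-stabilizing reweighting the abstract refers to.

```python
import numpy as np


def simulate_bias(n_reps=20000, horizon=50, eps=0.2, seed=0):
    """Average estimation error for arm 0 across simulated experiments.

    Assumptions (illustrative only): two arms, N(0, 1) rewards, both true
    means equal to 0, one forced pull per arm, then epsilon-greedy.
    Returns (mean of naive sample means, mean of IPW estimates); since the
    true mean is 0, each value is an estimate of that estimator's bias.
    """
    rng = np.random.default_rng(seed)
    rows = np.arange(n_reps)

    # One forced initial pull of each arm in every replication.
    sums = rng.standard_normal((n_reps, 2))
    counts = np.ones((n_reps, 2))

    ipw_total = np.zeros(n_reps)
    n_adaptive = horizon - 2
    for _ in range(n_adaptive):
        # The greedy arm depends on the data observed so far (adaptivity).
        greedy = np.argmax(sums / counts, axis=1)
        # Known probability of assigning arm 0 under epsilon-greedy.
        p_arm0 = np.where(greedy == 0, 1 - eps / 2, eps / 2)
        arm = np.where(rng.random(n_reps) < p_arm0, 0, 1)
        reward = rng.standard_normal(n_reps)

        sums[rows, arm] += reward
        counts[rows, arm] += 1
        # IPW term for arm 0: observed reward divided by its propensity.
        ipw_total += (arm == 0) * reward / p_arm0

    naive = sums[:, 0] / counts[:, 0]   # sample mean of arm 0, all pulls
    ipw = ipw_total / n_adaptive        # IPW estimate, adaptive rounds only
    return naive.mean(), ipw.mean()


if __name__ == "__main__":
    naive_bias, ipw_bias = simulate_bias()
    print(f"naive sample-mean bias: {naive_bias:.4f}")
    print(f"IPW bias:               {ipw_bias:.4f}")
```

Under these assumptions the naive sample mean should come out negatively biased, because an arm that starts with unlucky draws is sampled less often and has fewer chances to recover, while the IPW estimate should average close to zero. Unbiasedness alone does not yield valid normal-theory intervals, which is the gap the reweighting, always-valid, and distribution-inverting approaches named in the abstract are meant to address.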
