Optimal tests following sequential experiments (2305.00403v2)
Abstract: Recent years have seen tremendous advances in the theory and application of sequential experiments. While these experiments are not always designed with hypothesis testing in mind, researchers may still be interested in performing tests after the experiment is completed. The purpose of this paper is to aid in the development of optimal tests for sequential experiments by analyzing their asymptotic properties. Our key finding is that the asymptotic power function of any test can be matched by a test in a limit experiment where a Gaussian process is observed for each treatment, and inference is made for the drifts of these processes. This result has important implications, including a powerful sufficiency result: any candidate test only needs to rely on a fixed set of statistics, regardless of the type of sequential experiment. These statistics are the number of times each treatment has been sampled by the end of the experiment, along with the final value of the score process (for parametric models) or efficient influence function process (for non-parametric models) for each treatment. We then characterize asymptotically optimal tests under various restrictions, such as unbiasedness and α-spending constraints. Finally, we apply our results to three key classes of sequential experiments: costly sampling, group sequential trials, and bandit experiments, and show how optimal inference can be conducted in these scenarios.
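To make the sufficiency result concrete, the following is a minimal sketch of how the two statistics per treatment could be computed from an experiment's log. It is an illustration under stated assumptions, not the paper's implementation: outcomes are assumed Gaussian with known standard deviation `sigma`, the score is taken at a reference mean `mu0`, and both statistics are scaled by the total sample size `n` (counts as shares, score sums by 1/sqrt(n)). The function names and the epsilon-greedy sampling rule are hypothetical and chosen only for illustration.

```python
import math
import random


def sufficient_statistics(arms, outcomes, n_arms, mu0=0.0, sigma=1.0):
    """Return (q, x) for a logged sequential experiment.

    q[k]: share of the n observations assigned to treatment k.
    x[k]: final value of the normalized score sum for treatment k,
          assuming Gaussian outcomes with known sigma (an assumption
          of this sketch): sum over pulls of (y - mu0) / (sigma * sqrt(n)).
    """
    n = len(arms)
    q = [0.0] * n_arms
    x = [0.0] * n_arms
    for arm, y in zip(arms, outcomes):
        q[arm] += 1.0 / n
        x[arm] += (y - mu0) / (sigma * math.sqrt(n))
    return q, x


def run_epsilon_greedy(n, means, sigma=1.0, eps=0.2, seed=0):
    """Simulate a simple adaptive experiment (hypothetical sampling rule)."""
    rng = random.Random(seed)
    counts = [0] * len(means)
    sums = [0.0] * len(means)
    arms, outcomes = [], []
    for _ in range(n):
        # Explore with probability eps, or while some arm is still unsampled.
        if rng.random() < eps or 0 in counts:
            arm = rng.randrange(len(means))
        else:
            arm = max(range(len(means)), key=lambda k: sums[k] / counts[k])
        y = rng.gauss(means[arm], sigma)
        counts[arm] += 1
        sums[arm] += y
        arms.append(arm)
        outcomes.append(y)
    return arms, outcomes


if __name__ == "__main__":
    arms, outcomes = run_epsilon_greedy(n=1000, means=[0.0, 0.1])
    q, x = sufficient_statistics(arms, outcomes, n_arms=2)
    print("sampling shares:", q)
    print("final score values:", x)
```

Whatever adaptive rule generated the log, a candidate test in this sketch would take only `q` and `x` as inputs, which is the sense in which the abstract describes these statistics as sufficient.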