
Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations (2212.14411v5)

Published 29 Dec 2022 in stat.ME, econ.EM, math.ST, stat.ML, and stat.TH

Abstract: Sequential tests and their implied confidence sequences, which are valid at arbitrary stopping times, promise flexible statistical inference and on-the-fly decision making. However, strong guarantees are limited to parametric sequential tests that under-cover in practice or concentration-bound-based sequences that over-cover and have suboptimal rejection times. In this work, we consider classic delayed-start normal-mixture sequential probability ratio tests, and we provide the first asymptotic type-I-error and expected-rejection-time guarantees under general non-parametric data generating processes, where the asymptotics are indexed by the test's burn-in time. The type-I-error results primarily leverage a martingale strong invariance principle and establish that these tests (and their implied confidence sequences) have type-I error rates asymptotically equivalent to the desired (possibly varying) $\alpha$-level. The expected-rejection-time results primarily leverage an identity inspired by Itô's lemma and imply that, in certain asymptotic regimes, the expected rejection time is asymptotically equivalent to the minimum possible among $\alpha$-level tests. We show how to apply our results to sequential inference on parameters defined by estimating equations, such as average treatment effects. Together, our results establish these (ostensibly parametric) tests as general-purpose, non-parametric, and near-optimal. We illustrate this via numerical simulations and a real-data application to A/B testing at Netflix.
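
To make the object of study concrete, the following is a minimal sketch of a textbook normal-mixture sequential probability ratio test with a burn-in period, testing whether a mean is zero. It is not the paper's exact procedure: the function name, the plug-in sample variance, the mixing variance `tau2`, and the conservative rejection threshold $1/\alpha$ are all assumptions made for illustration, and the paper's calibration of the delayed-start boundary may differ.

```python
import math
from typing import Iterable, Optional


def mixture_sprt_rejection_time(
    xs: Iterable[float],
    alpha: float = 0.05,
    tau2: float = 1.0,     # assumed variance of the normal mixing prior on the mean
    burn_in: int = 50,     # assumed delayed start: no rejection before this time
) -> Optional[int]:
    """Return the first n >= burn_in at which the normal-mixture likelihood
    ratio for H0: mean = 0 reaches 1/alpha, or None if it never does."""
    n = 0
    total = 0.0
    total_sq = 0.0
    log_threshold = math.log(1.0 / alpha)
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
        if n < max(burn_in, 2):
            continue  # still in the burn-in period
        mean = total / n
        var = (total_sq - n * mean * mean) / (n - 1)  # plug-in sample variance
        if var <= 0.0:
            continue  # variance estimate degenerate; skip this step
        # Log of the N(0, tau2)-mixture likelihood ratio against mean 0:
        # 0.5*log(var/(var + n*tau2)) + tau2*S_n^2 / (2*var*(var + n*tau2))
        log_lr = 0.5 * math.log(var / (var + n * tau2)) + (
            tau2 * total * total / (2.0 * var * (var + n * tau2))
        )
        if log_lr >= log_threshold:
            return n
    return None


# Data with mean 1 and small spread: rejected as soon as monitoring starts.
shifted = [0.9 if i % 2 == 0 else 1.1 for i in range(200)]
# Data alternating +1/-1 (running mean near 0): never rejected.
centered = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
print(mixture_sprt_rejection_time(shifted))
print(mixture_sprt_rejection_time(centered))
```

In this sketch the burn-in only delays monitoring; the abstract's asymptotic guarantees are indexed by that burn-in time, which is why it is exposed as a parameter here.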
