
Game-theoretic statistics and safe anytime-valid inference (2210.01948v2)

Published 4 Oct 2022 in math.ST, cs.GT, cs.IT, math.IT, stat.ME, and stat.TH

Abstract: Safe anytime-valid inference (SAVI) provides measures of statistical evidence and certainty -- e-processes for testing and confidence sequences for estimation -- that remain valid at all stopping times, accommodating continuous monitoring and analysis of accumulating data and optional stopping or continuation for any reason. These measures crucially rely on test martingales, which are nonnegative martingales starting at one. Since a test martingale is the wealth process of a player in a betting game, SAVI centrally employs game-theoretic intuition, language and mathematics. We summarize the SAVI goals and philosophy, and report recent advances in testing composite hypotheses and estimating functionals in nonparametric settings.


Summary

  • The paper surveys SAVI, which builds on test martingales to give sequential hypothesis tests and estimates that stay valid at arbitrary stopping times.
  • It frames inference in game-theoretic terms, as betting against the null hypothesis, so that tests can adapt to accumulating data in ways that fixed-sample p-value approaches do not allow.
  • The framework suits applications where data are monitored continuously, such as multi-armed bandits and real-time data streams.

Game-Theoretic Statistics and Safe Anytime-Valid Inference

The paper explores the field of statistical inference through the lens of game theory, specifically focusing on Safe Anytime-Valid Inference (SAVI). SAVI is characterized by statistical measures that remain valid under continuous observation and arbitrary stopping times. This is pivotal for truly flexible data analysis, especially in modern, data-rich environments.
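
For concreteness, the two SAVI objects named in the abstract are usually formalized as below. This is a sketch in standard notation that the summary itself does not fix: $\mathcal{P}$ is assumed to denote the null class, $\tau$ an arbitrary stopping time, $\theta$ the target parameter, and $\alpha \in (0,1)$ the error level.

```latex
% e-process (E_t) for the null class \mathcal{P}: nonnegative, with
% expectation at most one at every stopping time
\mathbb{E}_P[E_\tau] \le 1
  \quad \text{for all stopping times } \tau \text{ and all } P \in \mathcal{P}.

% (1-\alpha) confidence sequence (C_t)_{t \ge 1} for \theta:
% coverage holds simultaneously over all times
\mathbb{P}\bigl(\forall\, t \ge 1 : \theta \in C_t\bigr) \ge 1 - \alpha.
```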

Core Ideas

The central construct behind SAVI is the test martingale, a concept from probability theory: a nonnegative martingale that starts at one. It can be read as the wealth process of a gambler who bets against the null hypothesis in a game against nature, so that large accumulated wealth counts as evidence against the null. The paper's emphasis is on using this construct for statistical inference in place of traditional tools such as p-values, which have known limitations under sequential monitoring.
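
The wealth-process reading can be made concrete with a toy example. The sketch below is an illustration rather than a construction taken from the paper: it assumes the null hypothesis is a fair coin, and the function name `betting_wealth` and the fixed `bet_fraction` are hypothetical choices. The gambler stakes a fixed fraction of current wealth on heads at even odds, so under the null the expected multiplier per round is exactly one and the wealth is a nonnegative martingale starting at one.

```python
from typing import Iterable, List


def betting_wealth(flips: Iterable[int], bet_fraction: float = 0.5) -> List[float]:
    """Wealth path of a gambler betting on heads against H0: P(heads) = 0.5.

    flips: sequence of 0 (tails) or 1 (heads).
    bet_fraction: share of current wealth staked on heads each round;
        it must lie in [-1, 1] so that wealth can never become negative.
    """
    if not -1.0 <= bet_fraction <= 1.0:
        raise ValueError("bet_fraction must lie in [-1, 1]")
    wealth = 1.0            # a test martingale starts at one
    path = [wealth]
    for x in flips:
        payoff = 1.0 if x == 1 else -1.0      # even-odds payoff fixed by H0
        wealth *= 1.0 + bet_fraction * payoff
        path.append(wealth)
    return path


# Mostly heads: the gambler's wealth grows, which counts against the fair-coin null.
path = betting_wealth([1, 1, 0, 1, 1, 1])
```

A fixed bet fraction is the simplest possible strategy; the bet may also depend on the flips seen so far without breaking the martingale property, as long as it is chosen before the next flip is revealed.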

Theoretical Foundations

SAVI measures evidence with test martingales, which can be likened to a gambler buying a ticket that pays off when the null hypothesis is false. Because the resulting guarantees hold at any stopping time, the analyst may inspect the data and make decisions at arbitrary points during collection. This avoids the inflated error rates that repeated looks at the data cause in fixed-sample procedures, and so supports more reliable conclusions.
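
The standard result behind this anytime validity is Ville's inequality. It is stated here as a sketch, with $(M_t)$ assumed to be a test martingale under the null $H_0$ and $\alpha \in (0,1)$ the target error level; neither symbol appears in the summary itself.

```latex
\mathbb{P}_{H_0}\Bigl(\exists\, t \ge 0 : M_t \ge 1/\alpha\Bigr) \le \alpha .
```

Rejecting the null the first time the wealth reaches $1/\alpha$ therefore has type-I error at most $\alpha$, no matter how often or for how long the data are monitored.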

Game-theoretically, SAVI recasts inference as a sequence of bets in which the null hypothesis sets the odds. The bettor may adapt each bet to the data seen so far, so the resulting family of sequential tests stays valid while remaining flexible.

Methodological Advances

The paper brings these ideas together in a framework built on e-processes, the sequential counterparts of e-values. By relying on test martingales, SAVI methods control the type-I error even under optional stopping. This matters in practical settings such as machine learning and real-time data streams, where the fixed-sample-size assumption of traditional methods does not hold.
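
The type-I error claim can be checked exactly in a toy setting. The sketch below is an illustration under stated assumptions, not the paper's procedure: the null is a fair coin, the gambler uses a fixed bet fraction, and the helper names `crosses_threshold` and `exact_false_alarm_rate` are hypothetical. The wealth is inspected after every flip and the test stops at the first time it reaches 1/alpha; enumerating all equally likely sequences of a short horizon gives the exact probability of a false rejection.

```python
from itertools import product
from typing import Sequence


def crosses_threshold(flips: Sequence[int], bet_fraction: float, alpha: float) -> bool:
    """True if the betting wealth ever reaches 1/alpha when checked after every flip."""
    wealth = 1.0
    for x in flips:
        wealth *= 1.0 + bet_fraction * (1.0 if x == 1 else -1.0)
        if wealth >= 1.0 / alpha:
            return True   # reject H0 and stop at the first crossing
    return False


def exact_false_alarm_rate(horizon: int, bet_fraction: float = 0.5,
                           alpha: float = 0.05) -> float:
    """Exact rejection probability under H0 (fair coin) over `horizon` flips.

    Under H0 every 0/1 sequence of length `horizon` is equally likely, so the
    fraction of sequences that trigger a rejection is the false-alarm probability.
    """
    rejections = sum(
        crosses_threshold(seq, bet_fraction, alpha)
        for seq in product((0, 1), repeat=horizon)
    )
    return rejections / 2 ** horizon


# Continuous monitoring over 12 flips: the rate stays at or below alpha.
rate = exact_false_alarm_rate(horizon=12)
```

Brute-force enumeration is only feasible for short horizons; it is used here because it makes the check deterministic rather than dependent on simulation noise.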

Moreover, the authors emphasize Reverse Information Projection (RIPr) and Universal Inference (UI) as tools for handling composite hypotheses, that is, null hypotheses that contain many candidate data-generating distributions rather than a single one.
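
As a rough illustration of how UI handles a composite null, its sequential form is commonly written as the ratio below. The notation is an assumption of this sketch and does not come from the summary: $\mathcal{P}$ is the composite null class, $X_1, X_2, \dots$ are the observations, and $\hat p_{i-1}$ is any density estimate computed from $X_1, \dots, X_{i-1}$ only.

```latex
E_t \;=\; \frac{\prod_{i=1}^{t} \hat p_{i-1}(X_i)}
               {\sup_{p \in \mathcal{P}} \prod_{i=1}^{t} p(X_i)} .
```

The numerator is a sequential plug-in likelihood and the denominator is the maximized likelihood under the null. For any particular $p \in \mathcal{P}$ the ratio with $p$ alone in the denominator is a test martingale, and $E_t$ is no larger than that ratio, which is why it behaves as an e-process for the whole class.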

Practical and Theoretical Implications

With its game-theoretic underpinning, SAVI maps naturally onto settings such as multi-armed bandits, AI, and meta-analysis, where evidence accumulates and analyses must adapt continuously. Its rigor and adaptability make it a potentially transformative toolkit for fields where data arrive as a continuous stream rather than at a few fixed points in time.

Future Directions

SAVI broadens the landscape of statistical testing and opens several avenues for future work. These include developing domain-specific e-processes and extending the underlying game-theoretic concepts to adaptive, AI-based systems. Further study of the interplay between e-processes and classical tools such as p-values could lead to a more unified statistical theory, with benefits for a range of practical applications.

In conclusion, the paper presents a robust and flexible framework in which the validity of statistical tests no longer depends on a sample size fixed in advance. This makes inference more reliable in settings where data keep accumulating and analyses are revisited as they arrive.
