A Robbins--Monro Sequence That Can Exploit Prior Information For Faster Convergence (2401.03206v1)
Abstract: We propose a new method to improve the convergence speed of the Robbins--Monro algorithm by introducing prior information about the target point into the Robbins--Monro iteration. We incorporate the prior information without the need for a (potentially wrong) regression model, which would also entail additional constraints. We show that this prior-information Robbins--Monro sequence is convergent for a wide range of prior distributions, even wrong ones, including Gaussians, weighted sums of Gaussians (e.g., as in a kernel density estimate), and arbitrary bounded distribution functions greater than zero. We furthermore analyse the sequence numerically to understand its performance and the influence of its parameters. The results demonstrate that the prior-information Robbins--Monro sequence converges faster than the standard one, especially during the first steps, which are particularly important for applications where the number of function measurements is limited, and when the noise in observing the underlying function is large. Finally, we propose a rule for selecting the parameters of the sequence.
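For orientation, the standard Robbins--Monro iteration that the abstract takes as its baseline can be sketched as below. This is only the classical recursion x_{n+1} = x_n - a_n (y_n - target) with gains a_n = a/n, not the prior-information variant proposed in the paper, whose update rule is not stated in the abstract. The test function `noisy_line`, the gain constant `a`, the starting point, and the step count are illustrative assumptions.

```python
import random


def robbins_monro(noisy_f, x0, n_steps, a=1.0, target=0.0):
    """Classical Robbins-Monro search for x with E[noisy_f(x)] = target.

    Assumes the underlying regression function is increasing near the root,
    so a positive residual pushes the iterate downwards.
    """
    x = x0
    for n in range(1, n_steps + 1):
        y = noisy_f(x)              # one noisy measurement at the current point
        a_n = a / n                 # gains: sum a_n diverges, sum a_n^2 converges
        x = x - a_n * (y - target)  # step against the observed residual
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def noisy_line(x):
    # Hypothetical test function: f(x) = 2 * (x - 3) observed with unit Gaussian noise
    return 2.0 * (x - 3.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_line, x0=0.0, n_steps=5000)
print(f"estimated root: {estimate:.3f} (true root: 3.0)")
```

With the 1/n gains used here, the early steps are large and noise-sensitive, which is the regime the abstract singles out as most relevant when only a few measurements are available.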