
A Robbins–Monro Sequence That Can Exploit Prior Information For Faster Convergence (2401.03206v1)

Published 6 Jan 2024 in cs.LG, cs.NA, math.NA, math.OC, math.PR, math.ST, stat.ME, stat.ML, and stat.TH

Abstract: We propose a new method to improve the convergence speed of the Robbins–Monro algorithm by introducing prior information about the target point into the Robbins–Monro iteration. We incorporate the prior information without the need for a (potentially wrong) regression model, which would also entail additional constraints. We show that this prior-information Robbins–Monro sequence is convergent for a wide range of prior distributions, even wrong ones, such as Gaussian, weighted sum of Gaussians, e.g., in a kernel density estimate, as well as bounded arbitrary distribution functions greater than zero. We furthermore analyse the sequence numerically to understand its performance and the influence of parameters. The results demonstrate that the prior-information Robbins–Monro sequence converges faster than the standard one, especially during the first steps, which are particularly important for applications where the number of function measurements is limited, and when the noise of observing the underlying function is large. We finally propose a rule to select the parameters of the sequence.
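
For orientation, the standard Robbins–Monro iteration that the abstract uses as its baseline can be sketched as follows. This is a minimal sketch of the classical recursion only, not the paper's prior-information variant, whose update rule the abstract does not state. The step sizes a_n = c/n, the linear test function, and the names `robbins_monro` and `make_noisy_line` are illustrative assumptions.

```python
import random


def robbins_monro(noisy_f, x0, target=0.0, c=1.0, steps=200):
    """Classical Robbins-Monro recursion: x_{n+1} = x_n - (c / n) * (y_n - target).

    noisy_f(x) returns one noisy measurement of the underlying function at x.
    The gain sequence a_n = c / n is a common textbook choice (assumption here).
    """
    x = x0
    for n in range(1, steps + 1):
        y = noisy_f(x)                      # single noisy observation at the current point
        x = x - (c / n) * (y - target)      # step against the observed deviation from the target
    return x


def make_noisy_line(root, slope, sigma, rng):
    """Hypothetical test function: increasing line through `root` plus Gaussian noise."""
    def f(x):
        return slope * (x - root) + rng.gauss(0.0, sigma)
    return f


rng = random.Random(0)
noisy = make_noisy_line(root=2.0, slope=1.0, sigma=0.5, rng=rng)
estimate = robbins_monro(noisy, x0=0.0, steps=2000)   # estimate of the root at 2.0

# Noise-free sanity check: the recursion lands on the root and stays there.
exact = robbins_monro(make_noisy_line(2.0, 1.0, 0.0, random.Random(1)), x0=0.0, steps=50)
```

With few steps and large `sigma`, the early iterates of this baseline depend heavily on the starting point `x0`, which is the regime in which the abstract reports the largest gain from prior information.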

