
Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval (1910.12837v1)

Published 28 Oct 2019 in stat.ML, cs.IT, cs.LG, cs.NA, math.IT, math.NA, and math.OC

Abstract: In recent literature, a general two-step procedure has been formulated for solving the problem of phase retrieval. First, a spectral technique is used to obtain a constant-error initial estimate, after which the estimate is refined to arbitrary precision by first-order optimization of a non-convex loss function. Numerical experiments, however, seem to suggest that simply running the iterative schemes from a random initialization may also lead to convergence, albeit at the cost of slightly higher sample complexity. In this paper, we prove that, in fact, constant step size online stochastic gradient descent (SGD) converges from arbitrary initializations for the non-smooth, non-convex amplitude squared loss objective. In this setting, online SGD is also equivalent to the randomized Kaczmarz algorithm from numerical analysis. Our analysis can easily be generalized to other single index models. It also makes use of new ideas from stochastic process theory, including the notion of a summary state space, which we believe will be of use for the broader field of non-convex optimization.
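
The procedure described in the abstract can be illustrated with a small numerical sketch. The snippet below is not the authors' code: it assumes the real-valued Gaussian measurement model y_i = |<a_i, x*>| and applies the standard online SGD step for the amplitude loss, which, with step size 1/||a_i||^2, coincides with the randomized Kaczmarz update mentioned in the abstract. The function name phase_retrieval_online_sgd and all dimensions and iteration counts are illustrative choices, not values from the paper.

```python
import numpy as np

def phase_retrieval_online_sgd(A, y, n_iters=20000, rng=None):
    """Online SGD / randomized Kaczmarz sketch for real phase retrieval.

    Measurements are y_i = |<a_i, x*>|, with the sensing vectors a_i as
    rows of A.  Each iteration uses one randomly drawn measurement; with
    step size 1/||a_i||^2 the SGD step on the amplitude loss is the
    randomized Kaczmarz step (an assumption of this sketch, following the
    equivalence stated in the abstract).
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    x = rng.standard_normal(n)          # arbitrary (random) initialization
    for _ in range(n_iters):
        i = rng.integers(m)             # sample one measurement uniformly
        a = A[i]
        r = a @ x
        # Kaczmarz step: project onto the hyperplane consistent with the
        # current guess sign(a^T x) for the missing sign of the measurement.
        x += (np.sign(r) * y[i] - r) * a / (a @ a)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 100, 800                     # ambient dimension, number of samples
    x_star = rng.standard_normal(n)
    A = rng.standard_normal((m, n))     # Gaussian sensing vectors
    y = np.abs(A @ x_star)              # phaseless (amplitude) measurements
    x_hat = phase_retrieval_online_sgd(A, y, rng=rng)
    # Recovery is only possible up to a global sign flip.
    err = min(np.linalg.norm(x_hat - x_star), np.linalg.norm(x_hat + x_star))
    print(f"relative error: {err / np.linalg.norm(x_star):.3e}")
```

In this sketch the iterate starts from a random point rather than a spectral initialization, matching the regime the paper analyzes; whether a given run converges quickly will depend on the oversampling ratio m/n and the number of iterations chosen here for illustration.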
