
Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging (2311.02557v2)

Published 5 Nov 2023 in math.OC, cs.LG, and quant-ph

Abstract: Consider the problem of minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices. This problem includes tasks such as solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating positive semi-definite matrix permanents with the currently tightest approximation ratio. Although the optimization problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply due to the absence of Lipschitz continuity and smoothness in the loss function. In this work, we propose a stochastic first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\tilde{O}(d^2/\varepsilon^2)$ time, matching the state of the art, where $d$ denotes the dimension. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time. This improves on the time complexities of existing stochastic first-order methods by a factor of $d^{\omega-2}$ and those of batch methods by a factor of $d^2$, where $\omega$ denotes the matrix multiplication exponent. Numerical experiments demonstrate that, empirically, our algorithm outperforms existing methods with explicit complexity guarantees.
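To make the setup concrete, the sketch below runs stochastic dual averaging with a logarithmic barrier on the simplex case, $\min_{x \in \Delta^{d-1}} \mathbb{E}[-\log\langle a, x\rangle]$. It is a minimal illustration under stated assumptions, not the paper's exact method: the `sample_a` oracle, the $\sqrt{t}$ barrier weight, and the bisection solver for the per-iteration subproblem are all choices made here for the sake of a runnable example; the paper's actual $B$-sample schedule should be taken from the source.

```python
import numpy as np

def sda_log_barrier(sample_a, d, T=2000, B=100, eta=1.0):
    """Sketch of B-sample stochastic dual averaging with a log barrier for
    min_{x in simplex} E[-log <a, x>].  `sample_a(B)` must return a (B, d)
    array of nonnegative vectors with positive inner product against x."""
    x = np.full(d, 1.0 / d)       # start at the analytic center of the simplex
    G = np.zeros(d)               # running sum of stochastic gradients
    x_avg = np.zeros(d)           # averaged iterate (the returned estimate)
    for t in range(1, T + 1):
        A = sample_a(B)                       # minibatch of B loss vectors
        g = -(A / (A @ x)[:, None]).mean(0)   # grad of -log<a, x>, averaged
        G += g
        c = np.sqrt(t) / eta                  # barrier weight (illustrative)
        # Subproblem: x = argmin over the simplex of <G, y> - c * sum_i log y_i.
        # KKT gives x_i = c / (G_i + lam); bisect on lam so sum_i x_i = 1.
        lo = -G.min() + 1e-12                 # keeps every G_i + lam positive
        hi = lo + c * d                       # at hi, sum_i x_i is already <= 1
        for _ in range(100):
            lam = 0.5 * (lo + hi)
            if (c / (G + lam)).sum() > 1.0:
                lo = lam                      # total mass too large: raise lam
            else:
                hi = lam
        x = c / (G + 0.5 * (lo + hi))
        x /= x.sum()                          # absorb the bisection residual
        x_avg += (x - x_avg) / t              # running average of iterates
    return x_avg

# Hypothetical usage: a toy Poisson-type instance with random positive rows.
rng = np.random.default_rng(0)
x_hat = sda_log_barrier(lambda B: rng.random((B, 64)) + 0.1, d=64)
```

The subproblem solve exploits that minimizing $\langle G, y\rangle - c\sum_i \log y_i$ over the simplex reduces to a one-dimensional root-find in the multiplier $\lambda$, which is what makes each iteration cheap in the simplex case.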

