Regret Minimization in Scalar, Static, Non-linear Optimization Problems (2403.15344v2)

Published 22 Mar 2024 in math.OC, cs.SY, and eess.SY

Abstract: We study the problem of determining an effective exploration strategy in static and non-linear optimization problems that depend on an unknown scalar parameter, which is to be learned from noisy data collected online. An optimal trade-off between exploration and exploitation is crucial for effective optimization under uncertainty. To achieve this, we consider a cumulative regret minimization approach over a finite horizon, where each time instant in the horizon is characterized by a stochastic exploration signal whose variance is to be designed. We aim to extend the well-established concepts of regret minimization from linear to non-linear systems, with a focus on the resulting conceptual differences and challenges. Under an idealized assumption on an appropriately defined information function associated with the excitation, we show that an optimal exploration strategy is either to use no exploration at all (called lazy exploration) or to add an exploration excitation only at the first time instant of the horizon (called immediate exploration). A quadratic numerical example is presented to demonstrate the effectiveness of the proposed strategy.
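To make the two strategies named in the abstract concrete, the following is a minimal Monte Carlo sketch on a hypothetical toy problem. It is not the paper's numerical example. Everything in it is an assumption made for illustration: the quadratic cost (u - theta)^2, the measurement model y = theta * u + noise, the least-squares estimator, and all names and parameter values (`cumulative_regret`, `mean_regret`, `theta_hat0`, `noise_std`, the horizon of 20). The sketch only shows how a per-instant exploration standard deviation enters the input and how cumulative regret can be tallied for a lazy and an immediate schedule.

```python
import numpy as np


def cumulative_regret(sigma, theta=2.0, theta_hat0=1.0, noise_std=1.0, seed=0):
    """Simulate one run of the toy problem (all modelling choices are assumptions).

    sigma[t] is the standard deviation of the exploration signal at time t.
    """
    rng = np.random.default_rng(seed)
    theta_hat = theta_hat0      # initial guess of the unknown scalar parameter
    num, den = 0.0, 0.0         # running sums for the least-squares estimate
    regret = 0.0
    for s in sigma:
        # Certainty-equivalent input plus a zero-mean exploration term.
        u = theta_hat + s * rng.standard_normal()
        # Cost (u - theta)^2 is minimized at u = theta with optimal value 0,
        # so the instantaneous regret equals the cost itself.
        regret += (u - theta) ** 2
        # Noisy measurement whose informativeness grows with |u| (assumed model).
        y = theta * u + noise_std * rng.standard_normal()
        num += u * y
        den += u * u
        if den > 0:
            theta_hat = num / den   # least-squares update of the estimate
    return regret


def mean_regret(sigma, runs=500, **kwargs):
    """Average the cumulative regret over independent runs (seeds 0..runs-1)."""
    return float(np.mean([cumulative_regret(sigma, seed=k, **kwargs)
                          for k in range(runs)]))


T = 20                          # hypothetical horizon length
lazy = np.zeros(T)              # lazy exploration: no excitation at any instant
immediate = np.zeros(T)
immediate[0] = 1.0              # immediate exploration: excite only the first instant

print("lazy:     ", mean_regret(lazy))
print("immediate:", mean_regret(immediate))
```

Which schedule yields the lower average regret in this sketch depends on the assumed parameter values, such as the initial estimate and the noise level, so the printed numbers should not be read as a reproduction of the paper's results.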

