
Control-Oriented Identification for the Linear Quadratic Regulator: Technical Report (2403.05455v2)

Published 8 Mar 2024 in eess.SY and cs.SY

Abstract: Data-driven control benefits from rich datasets, but constructing such datasets becomes challenging when data collection is limited. We consider an offline experiment design approach in which we design a control input to collect the data that will most improve the performance of a feedback controller. We show how such a control-oriented approach can be used in a setting with linear dynamics and a quadratic objective and, through the design of a gradient estimator, solve the problem via stochastic gradient descent. We show numerically that our formulation outperforms A- and L-optimal experiment design approaches as well as a robust dual control approach.

