Control-Oriented Identification for the Linear Quadratic Regulator: Technical Report (2403.05455v2)
Abstract: Data-driven control benefits from rich datasets, but constructing such datasets is challenging when data collection is limited. We consider an offline experiment design approach in which we design a control input to collect the data that will most improve the performance of a feedback controller. We show how such a control-oriented approach can be used in a setting with linear dynamics and a quadratic objective and, by designing a gradient estimator, solve the problem via stochastic gradient descent. We show numerically that our formulation outperforms A- and L-optimal experiment design approaches as well as a robust dual control approach.
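To make the control-oriented objective concrete, the following is a minimal sketch (not the authors' implementation) of the quantity such a design targets: the closed-loop cost incurred on the true system when the feedback gain is computed from an estimated model. The system matrices, the perturbation used as the "estimate", and the helper names `dlqr_gain` and `closed_loop_cost` are all illustrative assumptions. The gradient estimator and the stochastic gradient descent step described in the abstract are not reproduced here.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion; return K for u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def closed_loop_cost(A, B, Q, R, K, iters=5000):
    """Return trace(P_K) with P_K = Q + K'RK + (A-BK)' P_K (A-BK).

    This is the expected infinite-horizon cost for an initial state with
    identity covariance. Returns inf if K does not stabilize (A, B).
    """
    Acl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf
    M = Q + K.T @ R @ K
    P = M.copy()
    for _ in range(iters):
        P = M + Acl.T @ P @ Acl
    return float(np.trace(P))

# Hypothetical "true" system and a slightly wrong estimate of it (assumed values).
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])
A_hat = A_true + np.array([[0.0, 0.0], [0.02, -0.03]])
B_hat = 1.1 * B_true
Q, R = np.eye(2), np.eye(1)

K_opt = dlqr_gain(A_true, B_true, Q, R)   # gain from the true model
K_hat = dlqr_gain(A_hat, B_hat, Q, R)     # gain from the estimated model

cost_opt = closed_loop_cost(A_true, B_true, Q, R, K_opt)
cost_hat = closed_loop_cost(A_true, B_true, Q, R, K_hat)
print("optimal cost:", cost_opt)
print("cost with estimated model:", cost_hat)
print("performance gap:", cost_hat - cost_opt)
```

The performance gap is nonnegative because `K_opt` minimizes the cost on the true system. A control-oriented experiment would aim to choose inputs whose resulting data shrink this gap, rather than shrinking parameter error uniformly.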
- H. Chernoff, “Locally Optimal Designs for Estimating Parameters,” The Annals of Mathematical Statistics, vol. 24, no. 4, pp. 586–602, Dec. 1953.
- H. A. Simon, “Dynamic Programming Under Uncertainty with a Quadratic Criterion Function,” Econometrica, vol. 24, no. 1, pp. 74–81, 1956.
- G. Elfving, “Optimum Allocation in Linear Regression Theory,” The Annals of Mathematical Statistics, vol. 23, no. 2, pp. 255–262, Jun. 1952.
- K. Lindqvist and H. Hjalmarsson, “Identification for control: adaptive input design using convex optimization,” in Proceedings of the 40th IEEE Conference on Decision and Control, vol. 5. Orlando, FL, USA: IEEE, 2001, pp. 4326–4331.
- M. Gevers, “Identification for Control: From the Early Achievements to the Revival of Experiment Design,” in Proceedings of the 44th IEEE Conference on Decision and Control, Dec. 2005, pp. 12–12.
- S. Anderson, K. Byl, and J. P. Hespanha, “Experiment design with Gaussian process regression with applications to chance-constrained control,” in 2023 62nd IEEE Conference on Decision and Control (CDC), 2023, pp. 3931–3938.
- B. D. Lee, I. Ziemann, A. Tsiamis, H. Sandberg, and N. Matni, “The fundamental limitations of learning linear-quadratic regulators,” in 2023 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023, pp. 4053–4060.
- S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu, “On the Sample Complexity of the Linear Quadratic Regulator,” Foundations of Computational Mathematics, vol. 20, no. 4, pp. 633–679, Aug. 2020.
- M. Simchowitz and D. Foster, “Naive Exploration is Optimal for Online LQR,” in Proceedings of the 37th International Conference on Machine Learning. PMLR, Nov. 2020, pp. 8937–8948.
- J. Umenberger, M. Ferizbegovic, T. B. Schön, and H. Hjalmarsson, “Robust exploration in linear quadratic reinforcement learning,” in Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc., 2019.
- M. Ferizbegovic, J. Umenberger, H. Hjalmarsson, and T. B. Schön, “Learning robust LQ-controllers using application oriented exploration,” IEEE Control Systems Letters, vol. 4, no. 1, pp. 19–24, 2020.
- J. Venkatasubramanian, J. Köhler, J. Berberich, and F. Allgöwer, “Robust dual control based on gain scheduling,” in 2020 59th IEEE Conference on Decision and Control (CDC), 2020, pp. 2270–2277.
- G. Rallo, S. Formentin, C. R. Rojas, and S. M. Savaresi, “Robust experiment design for virtual reference feedback tuning,” in 2018 IEEE Conference on Decision and Control (CDC), 2018, pp. 2271–2276.
- S. Mohamed, M. Rosca, M. Figurnov, and A. Mnih, “Monte Carlo Gradient Estimation in Machine Learning,” Sep. 2020.
- J. Peters and S. Schaal, “Reinforcement learning of motor skills with policy gradients,” Neural Networks, vol. 21, no. 4, pp. 682–697, 2008.
- K. Fan, Z. Wang, J. Beck, J. Kwok, and K. A. Heller, “Fast second order stochastic backpropagation for variational inference,” in Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc., 2015.
- W. Levine and M. Athans, “On the optimal error regulation of a string of moving vehicles,” IEEE Transactions on Automatic Control, vol. 11, no. 3, pp. 355–361, Jul. 1966.
- S. Anderson and J. P. Hespanha, “Control-oriented identification for the linear quadratic regulator: Technical report,” Santa Barbara, Mar. 2024. [Online]. Available: https://arxiv.org/abs/2403.05455
- A. J. Wagenmaker, M. Simchowitz, and K. Jamieson, “Task-Optimal Exploration in Linear Dynamical Systems,” in Proceedings of the 38th International Conference on Machine Learning. PMLR, Jul. 2021, pp. 10641–10652.
- J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, “JAX: composable transformations of Python+NumPy programs,” 2018.