Predictive Linear Online Tracking for Unknown Targets (2402.10036v3)
Abstract: In this paper, we study the problem of online tracking in linear control systems, where the objective is to follow a moving target. Unlike classical tracking control, the target is unknown, non-stationary, and its state is revealed sequentially, which fits the framework of online non-stochastic control. We consider the case of quadratic costs and propose a new algorithm, called predictive linear online tracking (PLOT). The algorithm uses recursive least squares with exponential forgetting to learn a time-varying dynamic model of the target. The learned model is then used in the optimal policy under the framework of receding horizon control. We show that the dynamic regret of PLOT scales as $\mathcal{O}(\sqrt{TV_T})$, where $V_T$ is the total variation of the target dynamics and $T$ is the time horizon. Unlike prior work, our theoretical results hold for non-stationary targets. We implement PLOT on a real quadrotor and provide open-source software, thereby showcasing one of the first successful applications of online control methods on real hardware.
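The learning step named in the abstract, recursive least squares with exponential forgetting, can be illustrated with a minimal sketch. This is not the authors' released software. It assumes, for illustration only, that the target state follows a first-order linear model $r_{t+1} \approx S_t r_t$; the class name `ForgettingRLS`, the forgetting factor `forgetting`, and the regularizer `reg` are hypothetical choices. The receding horizon control step that would consume the predictions is omitted.

```python
import numpy as np


class ForgettingRLS:
    """Recursive least squares with exponential forgetting (illustrative sketch).

    Assumed model (not taken from the paper's code): r_{t+1} ~ S r_t.
    """

    def __init__(self, dim, forgetting=0.95, reg=1.0):
        self.gamma = forgetting          # forgetting factor in (0, 1]
        self.S = np.zeros((dim, dim))    # current estimate of the target dynamics
        self.P = np.eye(dim) / reg       # inverse of the regularized information matrix

    def update(self, r_prev, r_next):
        """Incorporate one observed transition r_prev -> r_next."""
        Pr = self.P @ r_prev
        gain = Pr / (self.gamma + r_prev @ Pr)
        err = r_next - self.S @ r_prev   # one-step prediction error
        self.S = self.S + np.outer(err, gain)
        self.P = (self.P - np.outer(gain, Pr)) / self.gamma
        return self.S

    def predict(self, r, horizon):
        """Roll the learned model forward to predict the next `horizon` states."""
        preds = []
        for _ in range(horizon):
            r = self.S @ r
            preds.append(r)
        return np.array(preds)


# Usage: a target moving on a circle, i.e. r_{t+1} = R r_t for a rotation R.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
model = ForgettingRLS(dim=2, forgetting=0.95)
r = np.array([1.0, 0.0])
for _ in range(200):
    r_next = R @ r
    model.update(r, r_next)   # the target state is revealed sequentially
    r = r_next
future = model.predict(r, horizon=3)   # predictions a receding horizon controller could use
```

A forgetting factor below 1 down-weights old transitions geometrically, which lets the estimate follow dynamics that drift over time at the cost of a noisier estimate when the dynamics are stationary.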