Online Control of Linear Systems under Unbounded Noise (2402.10252v2)

Published 15 Feb 2024 in eess.SY, cs.LG, cs.SY, math.OC, and stat.ML

Abstract: This paper investigates the problem of controlling a linear system under possibly unbounded stochastic noise with unknown convex cost functions, known as an online control problem. In contrast to the existing work, which assumes the boundedness of noise, we show that an $ \tilde{O}(\sqrt{T}) $ high-probability regret can be achieved under unbounded noise, where $ T $ denotes the time horizon. Notably, the noise is only required to have a finite fourth moment. Moreover, when the costs are strongly convex and the noise is sub-Gaussian, we establish an $ O({\rm poly} (\log T)) $ regret bound.

Summary

  • The paper establishes that for convex cost functions, online control algorithms can achieve a sublinear $ \tilde{O}(\sqrt{T}) $ high-probability regret even under unbounded noise, provided the noise has a finite fourth moment.
  • It demonstrates that with strongly convex costs and sub-Gaussian noise, a polylogarithmic $ O({\rm poly}(\log T)) $ regret bound is obtained, with degenerate noise covariance handled via a novel transformation technique.
  • This advancement broadens online control applicability to real-world dynamic systems with unpredictable noise while reducing algorithm parameter complexity.

Online Control of Linear Systems under Unbounded and Degenerate Noise Conditions

Introduction

Online control studies how to choose actions for a dynamic system in real time as circumstances and objectives evolve. The field connects to machine learning through regret minimization, which is the foundation of many online learning algorithms. Prior work on online control has typically assumed bounded noise and a non-degenerate noise covariance. These assumptions significantly limit applicability to real-world scenarios, where noise can be both unbounded and degenerate. The paper addresses this gap by studying online control without these restrictions.

Contributions

The paper studies online control of linear systems subject to potentially unbounded and degenerate stochastic noise. It builds on the regret minimization framework, focusing on systems whose future costs are unknown. The primary contributions include:

  1. Establishing that for general convex costs, sublinear regret bounds are attainable even with unbounded noise, in contrast to previous literature that chiefly considered bounded noise. Specifically, the paper proves an $ \tilde{O}(\sqrt{T}) $ regret bound for convex costs, improving upon earlier results that relied on more restrictive assumptions.
  2. When the cost functions are strongly convex, the paper derives a regret bound of $ O({\rm poly}(\log T)) $, extending the theory to situations where the noise covariance is degenerate. This result is novel, as existing studies presupposed non-degenerate noise covariance to obtain logarithmic regret bounds.
  3. A transformation technique based on the noise's covariance structure both extends the regret bounds to these broader conditions and reduces the algorithm's parameter complexity. This reduction matters in practice, particularly for large-scale systems where computational efficiency is paramount.
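
For concreteness, the quantities in the list above can be written out as follows. This is a sketch of the standard online control formulation, which this summary does not spell out. The symbols $ A $, $ B $, $ x_t $, $ u_t $, $ w_t $, $ c_t $ and the comparator class $ \Pi $ are assumptions introduced here for illustration, and the paper's exact definitions may differ.

```latex
% Assumed linear dynamics driven by stochastic noise w_t (possibly unbounded)
x_{t+1} = A x_t + B u_t + w_t, \qquad t = 0, 1, \dots, T-1

% Regret: incurred cost minus the cost of the best fixed policy in hindsight
% (\Pi is a hypothetical comparator class, e.g. stabilizing linear controllers)
\mathrm{Regret}_T = \sum_{t=0}^{T-1} c_t(x_t, u_t)
                  - \min_{\pi \in \Pi} \sum_{t=0}^{T-1} c_t(x_t^{\pi}, u_t^{\pi})

% Rates reported in the abstract
\mathrm{Regret}_T = \tilde{O}(\sqrt{T})
    \quad \text{(convex costs, noise with finite fourth moment)}
\mathrm{Regret}_T = O(\mathrm{poly}(\log T))
    \quad \text{(strongly convex costs, sub-Gaussian noise)}
```

Both rates are sublinear in $ T $, so the average per-step regret $ \mathrm{Regret}_T / T $ vanishes as the horizon grows.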

Theoretical Implications and Practical Relevance

The results show that online control strategies can remain robust to stochastic disturbances that violate conventional boundedness assumptions. From a theoretical standpoint, the analysis motivates further study of online control under more realistic noise models and a deeper understanding of system behavior in stochastic environments.

Practically, the findings are relevant to a wide range of applications, including automated vehicle control, energy management systems, and network traffic control, where disturbances can be highly unpredictable and heavy-tailed. By establishing regret bounds under these challenging conditions, the paper supports the development of more resilient and efficient online control algorithms.
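
To make the noise setting concrete, the following minimal sketch simulates a linear system driven by heavy-tailed noise. It is not the paper's algorithm: the matrices `A`, `B`, the fixed feedback gain `K`, the quadratic cost, and the choice of Student-t noise are all hypothetical, chosen only for illustration. A Student-t distribution with `df` degrees of freedom is unbounded and has a finite fourth moment exactly when `df > 4`, which matches the moment condition quoted in the abstract.

```python
import numpy as np


def simulate(A, B, K, T, df=5.0, seed=0):
    """Roll out x_{t+1} = A x_t + B u_t + w_t under fixed feedback u_t = -K x_t.

    The noise w_t has i.i.d. Student-t entries with `df` degrees of freedom:
    unbounded support, finite fourth moment only when df > 4.
    Returns the per-step costs and the per-step noise norms.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[0])
    costs = np.empty(T)
    noise_norms = np.empty(T)
    for t in range(T):
        u = -K @ x
        # Hypothetical quadratic cost |x|^2 + |u|^2 (not taken from the paper).
        costs[t] = x @ x + u @ u
        w = rng.standard_t(df, size=x.shape[0])
        noise_norms[t] = np.linalg.norm(w)
        x = A @ x + B @ u + w
    return costs, noise_norms


# Illustrative double-integrator-like system; all values are assumptions.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[1.0, 2.0]])  # chosen so that A - B K has spectral radius 0.9

costs, noise_norms = simulate(A, B, K, T=5000)
print("largest noise norm seen:", noise_norms.max())
print("average per-step cost:  ", costs.mean())
```

Because the noise has unbounded support, no fixed bound holds for every disturbance, which is why analyses that assume bounded noise do not cover this setting.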

Looking Ahead

Future research directions could include extending the demonstrated regret bounds to online control problems without full knowledge of system dynamics or in partially observed settings. Such expansions would be invaluable for developing more versatile and robust algorithms, capable of operating under uncertainty and incomplete information – conditions that frequently arise in complex real-world systems.

Conclusion

Our investigation into online control under unbounded and degenerate noise not only challenges existing paradigms but also opens up new avenues for designing algorithms that are both theoretically sound and practically applicable. This work underscores the potential of leveraging advanced analytical techniques to enhance the performance and applicability of online control strategies in managing dynamic systems amidst uncertainty.