An Online Learning Approach to Model Predictive Control (1902.08967v3)

Published 24 Feb 2019 in cs.RO and cs.LG

Abstract: Model predictive control (MPC) is a powerful technique for solving dynamic control tasks. In this paper, we show that there exists a close connection between MPC and online learning, an abstract theoretical framework for analyzing online decision making in the optimization literature. This new perspective provides a foundation for leveraging powerful online learning algorithms to design MPC algorithms. Specifically, we propose a new algorithm based on dynamic mirror descent (DMD), an online learning algorithm that is designed for non-stationary setups. Our algorithm, Dynamic Mirror Descent Model Predictive Control (DMD-MPC), represents a general family of MPC algorithms that includes many existing techniques as special instances. DMD-MPC also provides a fresh perspective on previous heuristics used in MPC and suggests a principled way to design new MPC algorithms. In the experimental section of this paper, we demonstrate the flexibility of DMD-MPC, presenting a set of new MPC algorithms on a simple simulated cartpole and a simulated and real-world aggressive driving task. Videos of the real-world experiments can be found at https://youtu.be/vZST3v0_S9w and https://youtu.be/MhuqiHo2t98.

Citations (68)

Summary

  • The paper introduces a systematic framework that synthesizes MPC algorithms using an online learning approach based on Dynamic Mirror Descent.
  • The methodology integrates per-round loss minimization with Bregman divergence regularization to improve real-time control efficiency.
  • Experimental results demonstrate DMD-MPC's adaptability and effectiveness in both simulated tasks like the cartpole and real-world aggressive driving.

An Online Learning Approach to Model Predictive Control: A Detailed Examination

The paper "An Online Learning Approach to Model Predictive Control" contributes to the field of dynamic control systems by integrating concepts from online learning with Model Predictive Control (MPC). MPC is already a robust methodology used for managing control tasks in dynamic environments. The authors illustrate a significant conceptual linkage between MPC and online learning, suggesting that insights and methods from online learning can be exploited to design and improve MPC algorithms. This paper introduces a new algorithm, Dynamic Mirror Descent Model Predictive Control (DMD-MPC), which is fundamentally based on the dynamic mirror descent (DMD) online learning algorithm. This paper's principal novelty lies in providing a systematic framework for synthesizing MPC algorithms from an online learning perspective.

Framework and Methodology

The integration of MPC with online learning is achieved by formulating the MPC problem through the lens of online learning, in particular its notions of rounds and per-round losses. Each round corresponds to one control step: the controller commits to a planned control sequence, executes its first control, observes the new system state, and is then presented with a new per-round loss to minimize. Online learning thus provides a natural framework for iteratively optimizing the controller over a series of adaptive rounds; a minimal sketch of such a per-round loss follows.
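To make this concrete, here is a minimal sketch of a per-round loss: the predicted cost of rolling out a candidate control sequence from the current state under a dynamics model. The function names, shapes, and deterministic rollout are illustrative assumptions; the losses considered in the paper are typically expectations of such rollout costs under a control distribution.

```python
def per_round_loss(theta, x_t, dynamics, cost, horizon):
    """Per-round loss at round t: predicted total cost of executing
    the control sequence theta (one control per horizon step) from
    the current state x_t under the model `dynamics`.

    `dynamics` and `cost` are illustrative stand-ins for the system
    model and stage cost, not the paper's exact interfaces.
    """
    x, total = x_t, 0.0
    for h in range(horizon):
        total += cost(x, theta[h])   # accumulate the stage cost
        x = dynamics(x, theta[h])    # propagate the model one step
    return total
```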

Dynamic Mirror Descent (DMD) operates within this framework by using a Bregman divergence as a regularization term during optimization. This regularization is contextually meaningful: it keeps each update close to the previous decision while adapting to newly revealed information, and a shift model advances the previous plan so that it aligns with the next round. The DMD-MPC algorithm expands on this update, offering a flexible structure that subsumes several existing MPC techniques, such as model predictive path integral (MPPI) control, as special instances and provides a pathway for creating novel methods.
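The core update can be sketched as follows, assuming a quadratic Bregman divergence (under which the mirror-descent step reduces to an ordinary gradient step) and a shift model that advances the plan one step and repeats the last control. Both choices, and the function names, are simplifying assumptions for illustration rather than the authors' exact implementation.

```python
import numpy as np

def dmd_mpc_round(theta, grad_loss, step_size):
    """One round of DMD-MPC, sketched with psi = 0.5 * ||.||^2 so the
    Bregman divergence is squared Euclidean distance and the
    mirror-descent step is a plain gradient step.

    theta:     (H, d) planned control sequence over the horizon
    grad_loss: returns the gradient of the current per-round loss
    """
    # Mirror descent: descend the current per-round loss while the
    # Bregman regularization keeps the update near the previous plan.
    theta = theta - step_size * grad_loss(theta)
    # Shift model: the first control gets executed, so advance the
    # plan one step and pad the tail by repeating the last control.
    shifted = np.empty_like(theta)
    shifted[:-1] = theta[1:]
    shifted[-1] = theta[-1]
    return shifted
```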

Experimental Results and Critical Insights

The paper features experiments on a simulated cartpole task and on a simulated and real-world aggressive driving task executed on the AutoRally platform. These experiments demonstrate the flexibility and effectiveness of the DMD-MPC framework. Of particular interest is the adaptability of the DMD-MPC algorithm to various control settings and constraints, including both continuous and discrete control scenarios.

Key observations from the experiments reveal that the choice of step size and the structure of the per-round losses significantly influence performance. The aggressive driving experiment highlights the utility of DMD-MPC in real-time applications, showcasing its potential for high-speed autonomous vehicle control. To make the role of the per-round loss concrete, the sketch below shows the sampled update that results from one particular choice.
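As one concrete instance, the sketch below assumes a Gaussian control distribution whose mean is optimized under an exponential-utility per-round loss; with this pairing the update becomes a cost-weighted average of sampled control sequences, the form familiar from MPPI. The sampling scheme, shapes, and temperature parameter `lam` are illustrative assumptions.

```python
import numpy as np

def exponential_utility_update(mean, rollout_cost, num_samples=64,
                               sigma=0.5, lam=1.0, rng=np.random):
    """Sampled DMD-MPC-style update of a Gaussian control
    distribution under an exponential-utility per-round loss,
    yielding an MPPI-like cost-weighted average.

    mean: (H, d) current mean control sequence (illustrative shape).
    rollout_cost: maps a control sequence to its predicted total cost.
    """
    noise = sigma * rng.standard_normal((num_samples,) + mean.shape)
    samples = mean + noise                      # perturbed plans
    costs = np.array([rollout_cost(s) for s in samples])
    # Exponentiate baseline-subtracted costs into importance weights;
    # lam acts as the utility's temperature.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # New mean is the weighted average of the sampled sequences.
    return np.einsum('n,nhd->hd', weights, samples)
```

In practice the first control of the updated mean is executed, the plan is shifted, and the next round begins, as in the generic DMD-MPC loop sketched earlier.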

Implications and Future Directions

The implications of this research are manifold. Practically, the DMD-MPC framework provides a robust mechanism for adjusting to real-world dynamics in challenging control tasks. Theoretically, the paper opens avenues for employing advanced online learning algorithms to further enhance adaptability and precision within MPC. With guaranteed properties such as low dynamic regret, practitioners and theorists alike have a toolset that can systematically improve online decision processes across diverse application areas.

The work leaves open several key areas for future exploration, such as integrating more sophisticated online learning models, enhancing computational efficiency, and optimizing control distributions. Additionally, further research can explore dynamic regret minimization under varying degrees of system non-stationarity and simulation fidelity, feeding back into the broader goal of developing more intelligent and autonomous control systems.

In conclusion, by bridging two powerful methodologies, MPC and online learning, the paper enriches both fields, laying the groundwork for enhanced dynamic decision-making systems. As AI and autonomous systems continue to evolve, the integration strategies proposed here may become increasingly vital. The DMD-MPC framework, with its adaptability and rigorous foundation, is poised to influence future developments and applications in control systems.
