Learning Control Barrier Functions from Expert Demonstrations (2004.03315v3)

Published 7 Apr 2020 in eess.SY, cs.LG, cs.SY, and math.OC

Abstract: Inspired by the success of imitation and inverse reinforcement learning in replicating expert behavior through optimal control, we propose a learning based approach to safe controller synthesis based on control barrier functions (CBFs). We consider the setting of a known nonlinear control affine dynamical system and assume that we have access to safe trajectories generated by an expert - a practical example of such a setting would be a kinematic model of a self-driving vehicle with safe trajectories (e.g., trajectories that avoid collisions with obstacles in the environment) generated by a human driver. We then propose and analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz smoothness assumptions on the underlying dynamical system. A strength of our approach is that it is agnostic to the parameterization used to represent the CBF, assuming only that the Lipschitz constant of such functions can be efficiently bounded. Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process. We end with extensive numerical evaluations of our results on both planar and realistic examples, using both random feature and deep neural network parameterizations of the CBF. To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.

Citations (174)

Summary

  • The paper presents an optimization-based method to learn control barrier functions from expert safe trajectories for nonlinear systems.
  • The methodology leverages Lipschitz continuity constraints and expert data to ensure scalability and strong safety guarantees.
  • Experimental validations on planar systems and aircraft collision avoidance highlight the practical benefits and theoretical contributions of the approach.

Learning Control Barrier Functions from Expert Demonstrations

The paper addresses the challenge of synthesizing safe controllers for nonlinear dynamical systems via the learning of Control Barrier Functions (CBFs) from expert demonstrations. The authors focus on a specific problem framework: given a known control affine nonlinear system and a collection of safe trajectories from an expert—such as a kinematic model of a self-driving vehicle avoiding obstacles—they propose an optimization-based methodology to learn a CBF that provides provable safety guarantees.

Problem Context and Objectives

The central question tackled by the paper is the synthesis of CBFs through data-driven means, primarily leveraging expert demonstrations. CBFs are instrumental in certifying the forward invariance of safe sets in dynamical systems, thus ensuring safety. Traditional approaches, such as analytical or sum-of-squares based methods, while offering robust guarantees, are often limited in terms of scalability and applicability to broader classes of nonlinear systems. This research aims to fill that gap by introducing a learning framework that is agnostic to specific parameterizations of the CBFs, provided that the Lipschitz constants of the functions can be bounded efficiently.
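For reference, the standard CBF condition that the learned function must certify (paraphrased from the CBF literature, not quoted from this summary) is: for the control-affine system $\dot{x} = f(x) + g(x)u$ with safe set $\mathcal{C} = \{x : h(x) \ge 0\}$,

```latex
\sup_{u \in \mathcal{U}} \big[ L_f h(x) + L_g h(x)\,u \big] \;\ge\; -\alpha\big(h(x)\big) \qquad \forall x \in \mathcal{C},
```

for some extended class-$\mathcal{K}$ function $\alpha$, where $L_f h = \nabla h^\top f$ and $L_g h = \nabla h^\top g$ are the Lie derivatives of $h$ along the dynamics. Any controller that keeps this inequality satisfied renders $\mathcal{C}$ forward invariant, which is exactly the safety certificate the learned $h$ is meant to provide.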

Methodology

The authors present a structured approach to learning the CBF:

  1. Data Collection: The authors define criteria for constructing datasets from expert demonstrations together with synthetically generated unsafe samples. The expert data ensures that the learned CBF aligns with demonstrated safe behavior.
  2. Optimization Problem: Learning the CBF is cast as an optimization problem. The authors derive constraints that are sufficient to ensure the validity of the learned CBF on a sampled subset of the state space; these constraints encode both the safety requirement and a bound on the Lipschitz constant.
  3. Feasibility and Scalability: The problem's feasibility is contingent on the ability to sample the state space densely and bound the Lipschitz constant appropriately. The optimization can be efficiently solved using numerical techniques for classes like Reproducing Kernel Hilbert Spaces (RKHS) and Deep Neural Networks (DNNs).
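The steps above can be sketched in code. The following is a minimal illustration, not the paper's implementation: it uses a toy single-integrator system, a random Fourier feature parameterization h(x) = wᵀφ(x) (whose Lipschitz constant is easy to bound, which is what the paper's verification step relies on), and hinge-loss relaxations of the sample-wise constraints. The margins `gamma_s`, `gamma_u`, the class-K rate `alpha`, and the subgradient training loop are all hypothetical choices made for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy known control-affine dynamics xdot = f(x) + g(x) u (single integrator;
# the paper's examples are richer -- this is an illustrative stand-in).
def f(x):
    return np.zeros_like(x)

def g(x):
    return np.eye(2)

# Random Fourier features: h(x) = w @ phi(x).
D = 100
W = rng.normal(size=(D, 2))
b = rng.uniform(0, 2 * np.pi, size=D)

def phi(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

def grad_phi(x):
    # Jacobian d phi / dx, shape (D, 2)
    return -np.sqrt(2.0 / D) * np.sin(W @ x + b)[:, None] * W

def learn_cbf(safe_pts, unsafe_pts, u_expert, gamma_s=0.1, gamma_u=0.1,
              alpha=1.0, lr=0.1, iters=2000):
    """Hinge-loss relaxation of the sample-wise constraints (illustrative):
       h(x) >= gamma_s on safe samples, h(x) <= -gamma_u on unsafe samples,
       and hdot(x, u) + alpha * h(x) >= 0 along expert actions."""
    w = np.zeros(D)
    for _ in range(iters):
        grad = np.zeros(D)
        for x in safe_pts:
            if w @ phi(x) < gamma_s:            # safe-set constraint violated
                grad -= phi(x)
        for x in unsafe_pts:
            if w @ phi(x) > -gamma_u:           # unsafe-set constraint violated
                grad += phi(x)
        for x, u in zip(safe_pts, u_expert):
            J = grad_phi(x)
            xdot = f(x) + g(x) @ u
            if w @ (J @ xdot) + alpha * (w @ phi(x)) < 0:  # derivative constraint
                grad -= J @ xdot + alpha * phi(x)
        w -= lr * grad / max(len(safe_pts), 1)
    return w
```

In a real instance the subgradient loop would be replaced by a convex solver (for RKHS-type parameterizations, the constraints are linear in `w`), and the margins would be chosen against the Lipschitz bound so that sample-wise feasibility implies validity on the whole region.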

Theoretical Contributions

The paper makes theoretical contributions concerning the sufficient conditions under which the learned CBF maintains the system's safety. By introducing conditions on the Lipschitz continuity of the CBF and on the derivative constraints, the authors provide a framework that guarantees not only that a valid CBF exists but also that it can be learned efficiently from data.
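The core covering argument behind such guarantees can be sketched as follows (a paraphrase of the standard Lipschitz argument; the symbols here are chosen for illustration): if the learned $h$ satisfies each constraint with margin $\gamma$ on samples that form an $\varepsilon$-net of the region of interest, and $h$ is $L$-Lipschitz, then for any state $x$ with nearest sample $x_i$,

```latex
h(x) \;\ge\; h(x_i) - L\,\lVert x - x_i \rVert \;\ge\; \gamma - L\varepsilon,
```

so choosing $\gamma > L\varepsilon$ extends sample-wise validity to the entire region. The analogous bound applies to the derivative constraint, which is why the ability to efficiently bound the Lipschitz constant of the parameterization is the paper's key structural assumption.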

Experimental Validation

Empirical results are provided through two primary experiments:

  1. Planar System: The authors demonstrate the learning and validation process on a simple two-dimensional system. They highlight the ability of the learned CBF to replicate and ensure safe behavior through trajectory tracking and safety maintenance.
  2. Aircraft Collision Avoidance: A more complex scenario involving collision avoidance for two aircraft is explored, comparing the learned CBF against an analytical solution. This experiment underscores the approach's ability to scale to complex, higher-dimensional systems.
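At deployment time, a learned CBF is typically used as a safety filter on a nominal controller via a small quadratic program (the standard CBF-QP construction from the literature; the helper below and its closed-form single-constraint solution are illustrative, not the paper's code):

```python
import numpy as np

def cbf_qp_filter(u_des, h, grad_h, f, g, x, alpha=1.0):
    """Minimally modify u_des so that hdot + alpha * h(x) >= 0.
    With a single affine constraint in u, the QP
        min ||u - u_des||^2  s.t.  a @ u + c >= 0
    reduces to a closed-form projection onto the constraint boundary."""
    a = g(x).T @ grad_h(x)                 # coefficient of u in hdot
    c = grad_h(x) @ f(x) + alpha * h(x)    # drift term plus class-K term
    slack = a @ u_des + c
    if slack >= 0:
        return u_des                       # nominal input is already safe
    return u_des - (slack / (a @ a)) * a   # project onto a @ u + c = 0
```

For example, with h(x) = 1 - ||x||² and single-integrator dynamics, a nominal input pushing outward near the boundary of the safe set gets projected back so the barrier constraint holds with equality, while inward-pointing inputs pass through unchanged.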

Implications and Future Directions

Practically, this approach enables the synthesis of CBFs for complex systems where human demonstrations or pre-collected data are available, thus facilitating safer control for applications such as autonomous driving and robotics. Theoretically, it opens avenues for combining learning techniques, such as reinforcement learning, with barrier functions to handle more uncertain dynamical environments. Future work could reduce the method's sample complexity and extend it to partially unknown or stochastic dynamics, potentially applying statistical learning theory to refine the safety guarantees across a broader range of unseen scenarios.

In summary, the paper advances safe control by blending the traditional guarantees of CBFs with data-driven learning, offering a scalable path toward safe autonomous system design.