Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis

h2 class="title is-3">Abstract

Learning-based control algorithms have led to major advances in robotics at the cost of decreased safety guarantees. Recently, neural networks have also been used to characterize safety through the use of barrier functions for complex nonlinear systems. Learned barrier functions approximately encode and enforce a desired safety constraint through a value function, but do not provide any formal guarantees. In this paper, we propose a local dynamic programming (DP) based approach to "patch" an almost-safe learned barrier at potentially unsafe points in the state space. This algorithm, HJ-Patch, obtains a novel barrier that provides formal safety guarantees, yet retains the global structure of the learned barrier. Our local DP based reachability algorithm, HJ-Patch, updates the barrier function "minimally" at points that both (a) neighbor the barrier safety boundary and (b) do not satisfy the safety condition. We view this as a key step to bridging the gap between learning-based barrier functions and Hamilton-Jacobi reachability analysis, providing a framework for further integration of these approaches. We demonstrate that for well-trained barriers we reduce the computational load by 2 orders of magnitude with respect to standard DP-based reachability, and demonstrate scalability to a 6-dimensional system, which is at the limit of standard DP-based reachability.

Method

We assume we are given an initial set of potentially unsafe states and an initial value function.

Over time, we update the value function and associated active set of potentially unsafe states until the active set is empty. This process is visualized in the top half of the figure.

The bottom half shows a zoomed-in view of how we update the value function on the active set of states. Next, the active set for the next iteration consists of the states that are neighbors of the states that decreased in value and are near the boundary. Newly active states are in dark blue, and removed states are in light blue.

Visualization of the algorithm

For the 4-dimensional vertical quadrotor, we learn an expert policy and its associated almost-barrier. However, on the (0, 0) velocity and angular velocity slice the almost-barrier requires extensive patching using HJ-Patch as we see below. At each iteration, our algorithm selects a subset of the boundary states to update and performs standard reachability on these states. The safety guarantees hold independently of the amount of patching that has to be performed.

Likewise, for the 6-dimensional planar quadrotor, we learn an expert policy and its associated almost-barrier. At moderately high positive velocities and angular velocities, the x-y plane slice requires a moderate amount of patching to retain safety. Due to our 1-1 comparison with global reachability, we show the results for low grid density, where we observe oscillations in the active set. At convergence, the patched barrier function has a higher level of safety than the global value function, likely because the low grid density of both methods causes some numerical instability.

Qualitative Results

Upon convergence, the value function obtained with HJ-Patch largely overlaps with the global reachability solution, while only requiring a fraction of the computational load. We provide a direct comparison on the left figure, where the HJ-Patch is portrayed in blue and the global reachability solution is in green. Beyond obtaining a slightly larger safe set (for most slices), our method does have one clear drawback, in not providing an exponential return to safety. This is observed on the right figure, where although HJ-Patch and global reachability have approximately the same share of violations, the unsafe trajectories with the patched CBF do not attempt to return to the safe set. However we can take the signed distance function to the patched 0-level set as the new barrier function and choose an appropriate discount factor to guarantee safety.

Quantitative Results

To quantify the safety of the different barrier functions, we perform random simulations of the system with a nominal controller. Additionally, to quantify the computational speed-up of our algorithm, HJ-Patch, we quantify the total number of value updates performed over the entire grid (and compare it to global HJ reachability).

The results demonstrate the need for patching barriers, and we demonstrate that we can reduce the computational load by up to 2 orders of magnitude with respect to standard DP-based reachability, with approximately the same safety guarantees.

4-dimensional vertical quadrotor

6-dimensional planar quadrotor

Citation

If you use our method or code in your researhc, please consider citing the paper as follows:

@inproceedings{TonkensToofanianEtAl2024,
  author    = {Tonkens, S. and Toofanian, A. and Qin, Z. and Gao, S. and Herbert, S.},
  title     = {Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis},
  booktitle = {IEEE Conference on Decision and Control} 
  year      = {2024},
}