QUICK REVIEW

[Paper Review] Neural Certificates for Safe Control Policies

Wanxin Jin, Zhaoran Wang | arXiv (Cornell University) | Jun 15, 2020
Fault Detection and Control Systems | 18 references | 43 citations
TL;DR

The paper proposes jointly learning a policy with neural barrier and Lyapunov-like certificates to guarantee safety and goal-reaching for dynamical systems, demonstrated on pendulums, cart-poles, vehicle path tracking, and UAVs.

ABSTRACT

This paper develops an approach to learn a policy of a dynamical system that is guaranteed to be both provably safe and goal-reaching. Here, the safety means that a policy must not drive the state of the system to any unsafe region, while the goal-reaching requires the trajectory of the controlled system asymptotically converges to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees the safety and a developed Lyapunov-like function to fulfill the goal-reaching requirement, both of which are represented by neural networks. We show the effectiveness of the method to learn both safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.

Motivation & Objective

  • Motivate the need for safety and goal-reaching in policy learning for dynamical systems.
  • Define safety and goal-reaching precisely and distinguish them from stability and optimality.
  • Develop neural network certificates (barrier and Lyapunov-like) to certify safety and convergence.
  • Jointly learn a policy and certificates, and validate on multiple nonlinear systems.
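
The two requirements named in the objectives above can be written compactly. The symbols here ($\mathcal{X}_0$ for the set of initial states, $\mathcal{X}_u$ for the unsafe region, $\mathcal{X}_g$ for the goal region, and $x(t)$ for the trajectory of the controlled system) are notation assumed for illustration and may differ from the paper's:

```latex
\begin{align*}
\text{Safety:} \quad & x(0) \in \mathcal{X}_0 \;\Longrightarrow\; x(t) \notin \mathcal{X}_u \quad \forall\, t \ge 0, \\
\text{Goal-reaching:} \quad & \lim_{t \to \infty} \operatorname{dist}\bigl(x(t), \mathcal{X}_g\bigr) = 0 .
\end{align*}
```

When $\mathcal{X}_g$ shrinks to a single equilibrium point, the second condition reduces to asymptotic convergence to that point, which is one way to read the abstract's remark that goal-reaching generalizes stability.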

Proposed method

  • Represent the barrier function B(x) as a neural network with differentiable output.
  • Represent the Lyapunov-like function V(x) as a neural network that is nonnegative by construction (via a quadratic form).
  • Define barrier and Lyapunov-like certificate losses that encode the three barrier conditions and the Lyapunov-like conditions.
  • Jointly optimize the neural policy and certificate networks to minimize the total certificate loss.
  • Include a verification step to check the learned certificates against discretized state samples.
  • Apply the method to nonlinear systems including pendulum, cart-pole, vehicle path tracking, and UAVs.
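
The certificate losses listed above can be sketched as hinge penalties on sampled states. This is a minimal illustration, not the paper's implementation: it assumes one common textbook form of the three barrier conditions (B ≤ 0 on initial states, B > 0 on unsafe states, B decreasing along trajectories), assumes the quadratic form is realized as a squared norm of network features, and uses hypothetical names (`hinge`, `barrier_loss`, `lyapunov_like_value`, `lyapunov_like_loss`, `margin`) throughout. Network outputs are passed in as plain arrays so the sketch stays framework-agnostic.

```python
import numpy as np

def hinge(z):
    """Penalty that is zero when z <= 0 and grows linearly otherwise."""
    return np.maximum(z, 0.0)

def barrier_loss(B_init, B_unsafe, B_dot, margin=1e-2):
    """Hinge loss for three barrier conditions (assumed textbook form).

    B_init   : B(x) at samples from the initial set, wanted <= 0
    B_unsafe : B(x) at samples from the unsafe set,  wanted > 0
    B_dot    : dB/dt along closed-loop trajectories,  wanted < 0
    """
    return (hinge(B_init + margin).mean()       # initial states inside {B <= 0}
            + hinge(margin - B_unsafe).mean()   # unsafe states outside it
            + hinge(B_dot + margin).mean())     # B must not increase

def lyapunov_like_value(features):
    """Nonnegative V(x): squared norm of a feature vector (assumed form)."""
    return np.sum(features ** 2, axis=-1)

def lyapunov_like_loss(V_dot, margin=1e-2):
    """Penalize samples outside the goal region where V fails to decrease."""
    return hinge(V_dot + margin).mean()

def total_certificate_loss(B_init, B_unsafe, B_dot, V_dot):
    """Sum of both certificate losses, the quantity a joint optimizer would minimize."""
    return barrier_loss(B_init, B_unsafe, B_dot) + lyapunov_like_loss(V_dot)
```

In a joint-training setup, `B_dot` and `V_dot` would depend on the policy through the closed-loop dynamics, so minimizing the total loss updates the policy and both certificates together. A loss of zero only indicates that the conditions hold on the sampled states.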

Experimental results

Research questions

  • RQ1: Can a policy be made safe with respect to an unsafe set while also being goal-reaching to a target set?
  • RQ2: Can barrier and Lyapunov-like certificates be learned jointly with a neural policy to guarantee safety and convergence?
  • RQ3: How can learned certificates be verified to ensure they meet the theoretical conditions?
  • RQ4: How does the approach perform across diverse nonlinear systems in practice?

Key findings

  • Joint learning of barrier and Lyapunov-like certificates with a neural policy yields safe and goal-reaching behavior.
  • Using the Lyapunov-like certificate alone achieves goal-reaching but may violate safety; adding the barrier certificate restores safety.
  • The method is demonstrated on pendulum, cart-pole, vehicle path tracking, and UAV control tasks, where the learned policies remain safe.
  • Verification steps accompany learning to validate the certificate properties on discretized state sets.
  • In the tested scenarios, the learned certificates satisfy the conditions that underpin the safety and convergence guarantees.
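
Verification on a discretized state set, as mentioned in the findings above, can be illustrated with a grid check. The sketch below is an assumption-laden toy rather than the paper's procedure: it uses hand-written stand-ins (`dynamics`, `V_grad`, `in_goal`) in place of learned networks, and the helper names `make_grid` and `check_decrease` are hypothetical.

```python
import itertools
import numpy as np

def make_grid(lows, highs, points_per_dim):
    """Uniform grid over an axis-aligned box, one row per state."""
    axes = [np.linspace(lo, hi, points_per_dim) for lo, hi in zip(lows, highs)]
    return np.array(list(itertools.product(*axes)))

def check_decrease(V_grad, dynamics, states, in_goal, tol=0.0):
    """Return grid states outside the goal where dV/dt = grad V . f(x) >= -tol."""
    v_dot = np.sum(V_grad(states) * dynamics(states), axis=1)
    bad = (~in_goal(states)) & (v_dot >= -tol)
    return states[bad]

# Toy stand-ins (assumptions): stable linear closed loop and V(x) = ||x||^2.
dynamics = lambda x: -x                                # closed-loop x_dot = -x
V_grad = lambda x: 2.0 * x                             # gradient of ||x||^2
in_goal = lambda x: np.linalg.norm(x, axis=1) <= 0.15  # goal: small ball at origin

grid = make_grid([-1.0, -1.0], [1.0, 1.0], 21)
violations = check_decrease(V_grad, dynamics, grid, in_goal)
print(f"{len(violations)} violating states out of {len(grid)}")
```

A check of this kind only covers the sampled points. Extending the conclusion to the states between grid points needs an additional argument, for example a bound on how fast the checked quantity can change between neighboring samples.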


This review was created by AI and reviewed by human editors.