QUICK REVIEW

[Paper Review] The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems

Spencer M. Richards, Felix Berkenkamp|arXiv (Cornell University)|Aug 2, 2018

Fault Detection and Control Systems | Cited by 67

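One-Sentence Summary

The authors propose a neural-network-based Lyapunov function that adapts to the largest safe region of a nonlinear closed-loop system, providing provable safety during learning without relying on a fixed model structure.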

ABSTRACT

Learning algorithms have shown considerable prowess in simulation by allowing robots to adapt to uncertain environments and improve their performance. However, such algorithms are rarely used in practice on safety-critical systems, since the learned policy typically does not yield any safety guarantees. That is, the required exploration may cause physical harm to the robot or its environment. In this paper, we present a method to learn accurate safety certificates for nonlinear, closed-loop dynamical systems. Specifically, we construct a neural network Lyapunov function and a training algorithm that adapts it to the shape of the largest safe region in the state space. The algorithm relies only on knowledge of inputs and outputs of the dynamics, rather than on any specific model structure. We demonstrate our method by learning the safe region of attraction for a simulated inverted pendulum. Furthermore, we discuss how our method can be used in safe learning algorithms together with statistical models of dynamical systems.

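Motivation and Objectives

Motivate safety in learning-based robotics and identify the largest safe region, the region of attraction (ROA), for a given policy.
Develop a neural network Lyapunov candidate function that yields a safety certificate by construction.
Train the Lyapunov network to shape its level sets to match the true ROA, without assuming a specific dynamics model.
Demonstrate the method on a nonlinear system (an inverted pendulum) and discuss its integration with safe learning frameworks.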

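Proposed Method

Construct the Lyapunov candidate v_theta(x) = phi_theta(x)^T phi_theta(x), where phi_theta is a feed-forward neural network with structural guarantees.
Ensure that v_theta is positive definite and Lipschitz continuous by constraining the weight matrix of every layer to have a trivial nullspace and by using activation functions with a trivial nullspace (they are zero only at zero).
Train v_theta by casting safe-set estimation as a classification problem: a state x is labeled y = +1 if it lies in the true ROA S_pi and y = -1 otherwise, and the network classifies x as safe when v_theta(x) < c_S.
Impose the Lyapunov decrease condition Delta v_theta(x) < 0 on states inside the safe region, where Delta v_theta(x) = v_theta(f_pi(x)) - v_theta(x) is the change in v_theta over one step of the closed-loop dynamics f_pi; during training, violations of this condition are penalized through a Lagrangian relaxation term.
Starting from a known safe set, Algorithm 1 iteratively enlarges the certified ROA: it expands the current safe level set by a gap, labels the states in that gap by forward simulation, and updates theta.
Relate the method to sum-of-squares (SOS) Lyapunov functions, and discuss how safety can be verified on a discretization of the state space using Lipschitz bounds.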
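
The construction and training loss described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the block form of the weight matrix (a positive definite top block stacked on a free block), the layer sizes, the value of eps, the hinge-style classification term, and the helper names make_layer, phi, v, delta_v, and loss are all choices made here for illustration. The exact loss used in the paper may differ.

```python
import math
import random


def matvec(W, x):
    """Plain-Python matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_layer(d_in, d_out, rng, eps=0.1):
    """Weight matrix W = [G1^T G1 + eps*I ; G2] with d_out >= d_in rows.

    The top block is positive definite, so W x = 0 only for x = 0
    (trivial nullspace). No bias is used, so the layer maps 0 to 0.
    """
    assert d_out >= d_in, "a shrinking layer would have a nontrivial nullspace"
    G1 = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_in)]
    top = [[sum(G1[k][i] * G1[k][j] for k in range(d_in)) + (eps if i == j else 0.0)
            for j in range(d_in)] for i in range(d_in)]
    G2 = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out - d_in)]
    return top + G2


def phi(layers, x):
    """Feed-forward pass; tanh(z) = 0 only at z = 0, and tanh is 1-Lipschitz."""
    h = list(x)
    for W in layers:
        h = [math.tanh(z) for z in matvec(W, h)]
    return h


def v(layers, x):
    """Lyapunov candidate v(x) = phi(x)^T phi(x)."""
    return sum(z * z for z in phi(layers, x))


def delta_v(layers, f, x):
    """One-step change of v along the closed-loop map f (hypothetical callable)."""
    return v(layers, f(x)) - v(layers, x)


def loss(layers, f, x, y, c_s, lam=10.0):
    """Classification term plus a penalty on decrease violations (safe labels only)."""
    classification = max(0.0, -y * (c_s - v(layers, x)))
    violation = max(0.0, delta_v(layers, f, x)) if y == +1 else 0.0
    return classification + lam * violation


# Assumed sizes for illustration: 2-D state, two layers of width 4.
rng = random.Random(0)
layers = [make_layer(2, 4, rng), make_layer(4, 4, rng)]
```

With these definitions, v(layers, [0.0, 0.0]) is exactly zero and v is strictly positive at any nonzero state, whatever the random parameters are. That is the sense in which positive definiteness holds by construction rather than being learned.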

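Experimental Results

Research Questions

RQ1: How can a neural network be constructed so that it is a provable Lyapunov function for nonlinear, uncertain closed-loop dynamics?
RQ2: Can the learned Lyapunov function adapt the shape of its level sets to approximate the true region of attraction without relying on a fixed polynomial/SOS structure?
RQ3: How can classification be used to train a Lyapunov-based safety certificate that certifies the largest safe region?
RQ4: How can safety certificates enable safe exploration and learning for nonlinear dynamical systems?
RQ5: How feasible is the method, and how well does it perform, on a nonlinear benchmark (the inverted pendulum), and how does it compare with existing methods?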

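Key Findings

The method yields a neural Lyapunov candidate that is positive definite and Lipschitz continuous, so it can serve as a provable safety certificate.
Classification-style training shapes the Lyapunov level sets to match the true ROA while maintaining the decrease condition.
Algorithm 1 grows the safe region iteratively by expanding the level set and checking safety through forward simulation.
The learned Lyapunov function certifies at least part of the true ROA and guarantees that no unsafe state is misclassified as safe.
On the inverted pendulum, the method learns the safe region of attraction of a nonlinear system; the paper also discusses integration with safe learning frameworks.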
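
The iterative growth of the safe level set can be illustrated on a toy problem. The sketch below is a deliberately simplified stand-in for Algorithm 1, under loudly stated assumptions: the dynamics are a hypothetical scalar map whose true ROA is |x| < 1, the Lyapunov candidate is a fixed quadratic rather than a trained network (so there is no update of theta), and the names step, returns_to_level_set, and grow_level, as well as the growth factor alpha and the sample counts, are invented here. It shows only the loop of expanding the level set by a gap, labeling gap states by forward simulation, and checking the decrease condition on samples.

```python
import math


def step(x, dt=0.1):
    """Toy closed-loop map (assumption): x+ = x + dt*(-x + x^3); true ROA is |x| < 1."""
    return x + dt * (-x + x ** 3)


def v(x):
    """Fixed quadratic Lyapunov candidate (stand-in for a trained network)."""
    return x * x


def returns_to_level_set(x, c, horizon=200):
    """Forward-simulate and report whether the trajectory enters {v <= c}."""
    for _ in range(horizon):
        if v(x) <= c:
            return True
        if abs(x) > 10.0:  # treat as diverged; also avoids float overflow
            return False
        x = step(x)
    return False


def grow_level(c0, alpha=1.2, n_samples=200, max_iters=100):
    """Enlarge the level c while every sampled gap state is safe and decreasing."""
    c = c0
    for _ in range(max_iters):
        c_new = alpha * c
        r_in, r_out = math.sqrt(c), math.sqrt(c_new)
        # Sample the gap {c < v(x) <= c_new} on both sides of the origin.
        radii = [r_in + (r_out - r_in) * (i + 1) / n_samples for i in range(n_samples)]
        gap = [s * r for r in radii for s in (-1.0, 1.0)]
        safe = all(returns_to_level_set(x, c) for x in gap)
        decreasing = all(v(step(x)) < v(x) for x in gap)
        if not (safe and decreasing):
            break
        c = c_new
    return c


c_final = grow_level(0.01)
```

In this toy setting the loop stops at a level just below 1, so the certified set stays inside the true ROA |x| < 1. Checking finitely many samples is not a proof by itself; the review notes that the paper pairs such sampling with Lipschitz bounds to verify safety over a discretization.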

This review was generated by AI and reviewed by a human editor.