QUICK REVIEW

[Paper Review] On the stability analysis of optimal state feedbacks as represented by deep neural models.

Dario Izzo, Dharmesh Tailor | arXiv (Cornell University) | Dec 6, 2018

Topic: Model Reduction and Neural Networks | Cited by: 5

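ONE-SENTENCE SUMMARY

This paper proposes a new method, based on differential algebra and automatic differentiation, for formally verifying the stability of control systems built on deep neural networks, in particular Guidance and Control Networks (G&CNETs). The method aims to give theoretical guarantees of robustness to perturbations and time delays, addressing a gap left by Monte Carlo validation in safety-critical applications such as aerospace and automotive systems.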

ABSTRACT

Research has shown how the optimal feedback control of several non linear systems of interest in aerospace applications can be represented by deep neural architectures and trained using techniques including imitation learning, reinforcement learning and evolutionary algorithms. Such deep architectures are here also referred to as Guidance and Control Networks, or G&CNETs. It is difficult to provide theoretical proofs on the control stability of such neural control architectures in general, and G&CNETs in particular, to perturbations, time delays or model uncertainties or to compute stability margins and trace them back to the network training process or to its architecture. In most cases the analysis of the trained network is performed via Monte Carlo experiments and practitioners renounce to any formal guarantee. This lack of validation naturally leads to scepticism especially in cases where safety and validation are of paramount importance such as is the case, for example, in the automotive or space industry. In an attempt to narrow the gap between deep learning research and control theory, we propose a new methodology based on differential algebra and automated differentiation to obtain formal guarantees on the behaviour of neural based control systems.

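MOTIVATION AND OBJECTIVES

Address the lack of formal stability guarantees for control systems based on deep neural networks, especially in safety-critical fields such as aerospace and automotive engineering.
Overcome the limitations of empirical validation methods such as Monte Carlo simulation, which cannot provide theoretical assurances of robustness.
Narrow the gap between deep learning research and control theory by enabling formal analysis of neural control architectures.
Make it possible to trace stability margins of G&CNETs back to the network architecture and the training process.
Provide a systematic framework for assessing the robustness of learned optimal feedback controllers to perturbations, time delays, and model uncertainties.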

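PROPOSED METHOD

Apply differential algebra techniques to model the dynamics of the deep-neural-network controller as symbolic expressions.
Use automatic differentiation to compute exactly the gradients and Jacobians of the neural control policy with respect to state and time.
Formulate stability conditions through a Lyapunov-like analysis based on the symbolic derivatives of the G&CNET architecture.
Combine the symbolic model with stability criteria from control theory to derive formal bounds on robustness margins.
Trace stability properties back to the network's architectural hyperparameters and the training data distribution through sensitivity analysis.
Validate the framework on representative nonlinear aerospace control problems to demonstrate its feasibility and accuracy.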
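
The linearization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the plant (a toy double integrator), the 2-2-1 tanh network, and its weights `W1` and `W2` are all hypothetical and chosen only so that the origin is an equilibrium. Differential algebra generally works with truncated Taylor polynomials of arbitrary order; the sketch keeps only first order (dual numbers), which is enough to obtain the closed-loop Jacobian exactly and inspect its eigenvalues.

```python
import cmath
import math


class Dual:
    """Number a + b*eps with eps**2 == 0 (first-order truncated Taylor polynomial)."""

    def __init__(self, val, eps=0.0):
        self.val = val
        self.eps = eps

    @staticmethod
    def lift(x):
        # Promote plain numbers to Dual with zero derivative part.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule on the derivative part.
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)

    __rmul__ = __mul__


def tanh(x):
    x = Dual.lift(x)
    t = math.tanh(x.val)
    return Dual(t, (1.0 - t * t) * x.eps)  # d/dx tanh(x) = 1 - tanh(x)**2


# Hypothetical weights of a tiny 2-2-1 tanh network (no biases, so u(0) = 0).
W1 = [[1.0, 0.5], [0.2, 1.0]]
W2 = [-1.0, -1.5]


def controller(x):
    hidden = [tanh(row[0] * x[0] + row[1] * x[1]) for row in W1]
    return W2[0] * hidden[0] + W2[1] * hidden[1]


def closed_loop(x):
    """Toy double integrator under neural feedback: x1' = x2, x2' = u(x)."""
    return [Dual.lift(x[1]), controller(x)]


def jacobian(f, x0):
    """Jacobian of f at x0, one forward-mode pass per input variable."""
    n = len(x0)
    cols = []
    for j in range(n):
        seeded = [Dual(x0[i], 1.0 if i == j else 0.0) for i in range(n)]
        cols.append([out.eps for out in f(seeded)])
    # cols[j][i] holds d f_i / d x_j; transpose into row-major form.
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


def eigenvalues_2x2(A):
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0


A = jacobian(closed_loop, [0.0, 0.0])
eigs = eigenvalues_2x2(A)
# Local asymptotic stability of the equilibrium: all real parts negative.
is_stable = all(ev.real < 0.0 for ev in eigs)
print(A, eigs, is_stable)
```

For these particular weights the Jacobian at the origin is approximately [[0, 1], [-1.3, -2.0]], whose eigenvalues both have real part -1, so the toy closed loop is locally asymptotically stable. Retraining the network changes `W1` and `W2`, and with them the eigenvalues, which is the sense in which stability properties can be traced back to the trained weights.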

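EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can formal stability guarantees be derived for optimal feedback controllers based on deep neural networks without relying on Monte Carlo simulation?
RQ2: How can differential algebra and automatic differentiation be used to analyze the robustness of G&CNETs to perturbations and time delays?
RQ3: To what extent can stability margins be traced back to the network architecture and training process of G&CNETs?
RQ4: Is it possible to establish theoretical robustness bounds for deep-learning-based control systems in nonlinear aerospace applications?
RQ5: How does the method compare with purely empirical validation techniques in terms of reliability and insight?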

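KEY FINDINGS

The proposed method enables formal verification of the stability of control systems based on deep neural networks, removing the reliance on empirical Monte Carlo testing.
Differential algebra and automatic differentiation allow exact computation of system sensitivities, which is essential for deriving stability conditions.
Stability margins can be related analytically to architectural and training parameters, enabling targeted refinement of the network design.
The framework offers a systematic way to assess robustness to time delays and model uncertainties in nonlinear systems.
The method applies to practical aerospace control problems and offers a feasible route toward certifying learned control policies.
The method proves feasible in representative nonlinear control scenarios, suggesting that it could be integrated into development workflows for safety-critical systems.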
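
To make the notion of a time-delay margin concrete, here is a standard textbook calculation for a scalar linearized loop. It is an illustration only and not a result taken from the paper; the gain k and delay τ are hypothetical symbols introduced for this example.

```latex
% Scalar closed loop whose feedback acts with delay \tau \ge 0 and gain k > 0:
\dot{x}(t) = -k\, x(t - \tau)

% Substituting x(t) = e^{\lambda t} gives the characteristic equation:
\lambda + k\, e^{-\lambda \tau} = 0

% On the stability boundary \lambda = i\omega with \omega > 0:
i\omega + k\left(\cos \omega\tau - i \sin \omega\tau\right) = 0
\;\Longrightarrow\;
\cos \omega\tau = 0, \qquad \omega = k \sin \omega\tau

% The smallest solution is \omega\tau = \pi/2 with \omega = k, so the critical delay is
\tau^{*} = \frac{\pi}{2k}
```

The loop is asymptotically stable for 0 ≤ τ < τ*. In this scalar case a larger feedback gain k shrinks the tolerable delay, which shows how a delay margin can depend on quantities fixed by the controller itself.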

This review was generated by AI and reviewed by human editors.