QUICK REVIEW

[Paper Review] Synthesis of safe controller via supervised learning for truck lateral control.

Yuxiao Chen, Ayonga Hereid|arXiv (Cornell University)|Dec 15, 2017

Vehicle Dynamics and Control Systems | References: 24 | Citations: 4

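One-Sentence Summary

This paper proposes a hybrid controller for the lateral control of articulated trucks that combines supervised learning with control barrier functions (CBFs) to achieve high performance while guaranteeing safety. A policy is trained by supervised learning on a dataset of trajectories optimized under CBF constraints, and a CBF-based supervisor then enforces safety; in the lane-keeping case study, the supervisor never or only rarely intervenes.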

ABSTRACT

Correct-by-construction techniques such as control barrier function (CBF) have been developed to guarantee safety for control systems as supervisory controller. However, when the supervisor intervenes, the performance is typically compromised. On the other hand, machine learning is used to synthesize controllers that inherit good properties from the training data, but safety is typically not guaranteed due to the difficulty of analysis. In this paper, supervised learning is combined with CBF to synthesize controllers that enjoy good performance with safety guarantee. First, a training set is generated by trajectory optimization that incorporates the CBF constraint for multiple initial conditions. Then a policy is trained via supervised learning that maps the feature representing the initial condition to a parameterized desired trajectory. Finally, the learning based controller is used as the student controller, and a CBF based supervisory controller on top of that guarantees safety. A case study of lane keeping for articulated trucks shows that the student controller trained by the supervised learning inherits the good performance of the training set and the CBF supervisor never or rarely intervenes.

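Research Motivation and Objectives

Address the trade-off between performance and safety in autonomous vehicle control, particularly for complex systems such as articulated trucks.
Overcome two limitations: purely learning-based controllers lack safety guarantees, and purely CBF-based supervisors degrade performance whenever they intervene.
Develop a framework that exploits the high performance of a learned controller while a CBF-based supervisory layer ensures safety.
Validate the approach on a realistic control scenario: lane keeping for articulated trucks.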

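Proposed Method

Generate a training dataset by solving trajectory optimization problems that include control barrier function (CBF) constraints, for multiple initial conditions.
Train a policy by supervised learning on the optimized trajectories; it maps features of the initial condition to a parameterized desired trajectory.
Deploy the trained policy as the student controller, which tracks the learned desired trajectory.
Implement a CBF-based supervisory controller on top of the student controller to enforce the safety constraints in real time.
Have the supervisor intervene only when the student controller's behavior would violate the safety condition.
Rely on the CBF framework for a mathematical safety guarantee while leaving the student controller free to act whenever it is safe.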
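
The supervisor idea in the steps above can be illustrated with a minimal sketch. It is not the paper's truck model or its implementation: it assumes a toy single-integrator lateral model y_dot = u, a hypothetical barrier h(y) = y_max^2 - y^2, and a hypothetical class-K gain alpha. Under these assumptions the CBF condition dh/dt + alpha * h >= 0 becomes a single linear constraint on u, so the minimally invasive correction (the input closest to the student's command that satisfies the constraint) has a closed form. The names cbf_filter and simulate are illustrative only.

```python
def cbf_filter(u_student, y, y_max=1.0, alpha=2.0):
    """Return (u_safe, intervened) for the assumed toy dynamics y_dot = u.

    Assumed barrier: h(y) = y_max**2 - y**2, safe set is h >= 0.
    CBF condition: dh/dt + alpha * h >= 0, i.e. a * u <= b with
    a = 2 * y and b = alpha * h.
    """
    h = y_max**2 - y**2
    a = 2.0 * y
    b = alpha * h
    if a * u_student <= b:
        # The student command already satisfies the CBF condition.
        return u_student, False
    # The constraint is violated, so a * u_student > b. This implies a != 0,
    # because a == 0 means y == 0, where b = alpha * y_max**2 > 0.
    # The closest admissible scalar input lies on the constraint boundary.
    return b / a, True


def simulate(student, y0, steps=200, dt=0.01):
    """Forward-Euler closed loop; returns final offset and intervention count."""
    y, interventions = y0, 0
    for _ in range(steps):
        u, intervened = cbf_filter(student(y), y)
        interventions += intervened  # True counts as 1
        y += dt * u
    return y, interventions
```

For example, `simulate(lambda y: -1.0 * y, 0.5)` runs a student that steers toward the lane center, and `simulate(lambda y: 1.0, 0.0, steps=500)` runs one that drifts steadily toward the boundary. Comparing the returned intervention counts gives a rough analogue of the "never or rarely intervenes" measure discussed in this review. Because the loop is a discrete-time approximation, the continuous-time guarantee holds only for sufficiently small dt and bounded student commands.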

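Experimental Results

Research Questions

RQ1: Can a controller trained by supervised learning inherit high performance from a dataset of CBF-constrained optimal trajectories?
RQ2: How much is the intervention frequency of the CBF-based supervisor reduced when the student controller runs in closed loop?
RQ3: Does combining supervised learning with CBF improve performance over a conventional CBF-only approach while preserving the safety guarantee?
RQ4: How does the proposed method perform on a realistic lateral control task such as lane keeping for articulated trucks?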

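Key Findings

The student controller trained by supervised learning reproduces the performance characteristics of the optimized trajectories used for training.
In the lane-keeping case study, the CBF-based supervisory controller intervenes rarely or not at all, so supervision causes little performance degradation.
Combining learning with CBF keeps the system safe even when the controller operates near the boundary of the feasible trajectories.
The results indicate that, with a CBF supervisor added, a learning-based controller can achieve high performance without sacrificing safety.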

This review was generated by AI and reviewed by human editors.