QUICK REVIEW

[Paper Digest] SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles

Chejian Xu, Wenhao Ding|arXiv (Cornell University)|Jun 20, 2022

Autonomous Vehicle Technology and Safety | Cited by 22

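One-Sentence Summary

SafeBench is a unified platform that evaluates autonomous driving agents across 8 safety-critical scenarios, 4 scenario generation algorithms, 10 route variants per scenario, and 4 deep reinforcement learning-based autonomous driving algorithms, using 10 evaluation metrics.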

ABSTRACT

As shown by recent studies, machine intelligence-enabled systems are vulnerable to test cases resulting from either adversarial manipulation or natural distribution shifts. This has raised great concerns about deploying machine learning algorithms for real-world applications, especially in safety-critical domains such as autonomous driving (AD). On the other hand, traditional AD testing on naturalistic scenarios requires hundreds of millions of driving miles due to the high dimensionality and rareness of the safety-critical scenarios in the real world. As a result, several approaches for autonomous driving evaluation have been explored, which are usually, however, based on different simulation platforms, types of safety-critical scenarios, scenario generation algorithms, and driving route variations. Thus, despite a large amount of effort in autonomous driving testing, it is still challenging to compare and understand the effectiveness and efficiency of different testing scenario generation algorithms and testing mechanisms under similar conditions. In this paper, we aim to provide the first unified platform SafeBench to integrate different types of safety-critical testing scenarios, scenario generation algorithms, and other variations such as driving routes and environments. Meanwhile, we implement 4 deep reinforcement learning-based AD algorithms with 4 types of input (e.g., bird's-eye view, camera) to perform fair comparisons on SafeBench. We find our generated testing scenarios are indeed more challenging and observe the trade-off between the performance of AD agents under benign and safety-critical testing scenarios. We believe our unified platform SafeBench for large-scale and effective autonomous driving testing will motivate the development of new testing scenario generation and safe AD algorithms. SafeBench is available at https://safebench.github.io.

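Motivation and Objectives

Enable robust, scalable safety evaluation of autonomous driving (AD) systems that goes beyond traditional real-world mileage testing.
Provide a unified platform for comparing different scenario generation methods under identical conditions.
Evaluate AD algorithms fairly across diverse safety-critical scenarios, routes, and sensor inputs.
Quantify the trade-offs among safety, functionality, and etiquette in AD performance.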

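Proposed Method

Integrates 8 NHTSA-specified safety-critical driving scenarios into the CARLA simulator through a modular 4-node platform (the Ego, Agent, Scenario, and Evaluation nodes).
Generates 2,352 safety-critical scenarios by applying 4 scenario generation algorithms to each scenario, with 10 driving routes per scenario.
Implements 4 deep reinforcement learning-based AD algorithms with 4 input types to evaluate performance under different perception capabilities.
Evaluates AD agents with 10 metrics spanning three levels (safety, functionality, and etiquette) and computes a weighted overall score (OS).
Supports both adversary-based and knowledge-based scenario generation to study robustness and safety.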
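
The weighted overall score mentioned above can be illustrated with a minimal sketch. It assumes that each metric is normalized to [0, 1] by a maximum value, that lower-is-better metrics are flipped before weighting, and that the result is divided by the total weight. The metric names, weights, and maximum values below are hypothetical placeholders, not the values used by SafeBench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One evaluation metric (all fields here are illustrative assumptions)."""
    name: str
    weight: float            # relative importance in the overall score
    max_value: float         # value used to normalize the raw metric to [0, 1]
    higher_is_better: bool   # False for metrics such as collision rate


def overall_score(values: dict[str, float], metrics: list[Metric]) -> float:
    """Weighted average of normalized metrics, in [0, 1] (higher is better)."""
    total_weight = sum(m.weight for m in metrics)
    score = 0.0
    for m in metrics:
        # Clamp the normalized value so outliers cannot leave [0, 1].
        normalized = min(max(values[m.name] / m.max_value, 0.0), 1.0)
        if not m.higher_is_better:
            normalized = 1.0 - normalized
        score += m.weight * normalized
    return score / total_weight


# Hypothetical metrics, one per level (safety, functionality, etiquette).
metrics = [
    Metric("collision_rate", weight=0.5, max_value=1.0, higher_is_better=False),
    Metric("route_completion", weight=0.3, max_value=1.0, higher_is_better=True),
    Metric("lane_invasions", weight=0.2, max_value=10.0, higher_is_better=False),
]
values = {"collision_rate": 0.2, "route_completion": 0.9, "lane_invasions": 2.0}

score = overall_score(values, metrics)
print(round(score, 2))  # 0.83
```

Flipping the lower-is-better metrics keeps every term pointing in the same direction, so a single number can rank agents even when the individual metrics disagree.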

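Experimental Results

Research Questions

RQ1: Under unified testing conditions, how do different scenario generation algorithms affect the safety and performance of AD agents?
RQ2: Across multiple input types, what trade-off exists between AD performance in benign scenarios and in safety-critical scenarios?
RQ3: Which safety-critical scenarios and generation methods transfer best across different AD algorithms?
RQ4: How can the different evaluation metrics (safety, functionality, etiquette) be combined to reflect overall AD safety and performance?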

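Key Findings

The generated scenarios are more challenging and reveal a trade-off between AD performance under benign and safety-critical conditions.
Some safety-critical scenarios transfer well across AD algorithms, while others are more algorithm-specific.
The scenario generation algorithms differ in effectiveness; adversarially generated scenarios can produce higher collision and risk rates.
In safety-critical testing, PPO tends to achieve the best OS, while other agents excel on individual metrics; a strong safety-functionality trade-off is observed.
CS (Carla Scenario Generator) shows high transferability across AD agents, while Adversarial Trajectory Optimization can yield high collision rates after scenario selection.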
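
One way to read the transferability findings is as a matrix of collision rates: scenarios are generated against one agent and then replayed against another. The sketch below assumes a simple episode log of (source agent, target agent, collided) tuples. The log format, the agent labels, and the sample data are hypothetical and are not taken from SafeBench.

```python
from collections import defaultdict


def collision_rate_matrix(episodes):
    """Map (source_agent, target_agent) to the fraction of episodes that collided.

    `episodes` is an iterable of (source_agent, target_agent, collided) tuples,
    where `source_agent` is the agent the scenario was generated against and
    `target_agent` is the agent actually tested on it.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [collisions, total episodes]
    for source, target, collided in episodes:
        entry = counts[(source, target)]
        entry[0] += int(collided)
        entry[1] += 1
    return {pair: collisions / total for pair, (collisions, total) in counts.items()}


# Hypothetical log: scenarios generated against "agent_a" replayed on two agents.
episodes = [
    ("agent_a", "agent_a", True),
    ("agent_a", "agent_b", True),
    ("agent_a", "agent_b", False),
    ("agent_b", "agent_a", False),
]

rates = collision_rate_matrix(episodes)
print(rates[("agent_a", "agent_b")])  # 0.5
```

Under this reading, a generation method transfers well when the off-diagonal entries (source differs from target) stay close to the diagonal ones.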

This digest was generated by AI and reviewed by a human editor.