QUICK REVIEW

[Paper Review] DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars

Yuchi Tian, Kexin Pei|arXiv (Cornell University)|Aug 28, 2017

Adversarial Robustness in Machine Learning | References: 53 | Cited by: 220

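One-Sentence Summary

DeepTest automatically synthesizes realistic test inputs by applying image transformations chosen to maximize neuron coverage, and uses metamorphic relations to detect erroneous DNN behavior in autonomous driving models.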

ABSTRACT

Recent advances in Deep Neural Networks (DNNs) have led to the development of DNN-driven autonomous cars that, using sensors like camera, LiDAR, etc., can drive without any human intervention. Most major manufacturers including Tesla, GM, Ford, BMW, and Waymo/Google are working on building and testing different types of autonomous vehicles. The lawmakers of several US states including California, Texas, and New York have passed new legislation to fast-track the process of testing and deployment of autonomous vehicles on their roads. However, despite their spectacular progress, DNNs, just like traditional software, often demonstrate incorrect or unexpected corner case behaviors that can lead to potentially fatal collisions. Several such real-world accidents involving autonomous cars have already happened, including one which resulted in a fatality. Most existing testing techniques for DNN-driven vehicles are heavily dependent on the manual collection of test data under different driving conditions, which becomes prohibitively expensive as the number of test conditions increases. In this paper, we design, implement and evaluate DeepTest, a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes. First, our tool is designed to automatically generate test cases leveraging real-world changes in driving conditions like rain, fog, lighting conditions, etc. DeepTest systematically explores different parts of the DNN logic by generating test inputs that maximize the number of activated neurons. DeepTest found thousands of erroneous behaviors under different realistic driving conditions (e.g., blurring, rain, fog, etc.), many of which lead to potentially fatal crashes in three top performing DNNs in the Udacity self-driving car challenge.

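Research Motivation and Goals

Motivate safety-critical testing of DNN-driven autonomous cars and highlight the limits of manual test-data collection.
Introduce neuron coverage as a guiding signal for exploring the input-output space of a DNN.
Develop a method for synthesizing realistic, transformation-based test inputs that increase neuron coverage.
Propose metamorphic relations as an automated test oracle for detecting erroneous corner-case behaviors.
Evaluate the approach on Udacity self-driving models and publicly release the detected cases.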

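Proposed Method

Define neuron coverage as the ratio of activated neurons to total neurons, and use it to partition the DNN input space.
Generate realistic synthetic test inputs by applying image transformations to seed images (brightness/contrast, blur, rain, fog, translation, rotation, scaling, shear).
Propose a neuron-coverage-guided greedy search that combines multiple transformations to maximize coverage.
Apply metamorphic relations between the outputs for transformed inputs to detect erroneous behaviors automatically, without a detailed manual specification.
Implement DeepTest on a Keras/TensorFlow stack and evaluate it on three top-performing Udacity models (Chauffeur, Rambo, Epoch).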
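
The coverage metric and the coverage-guided search listed above can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the activation threshold of 0.2, the tiny random network `toy_activations` standing in for a real driving model, the two hand-written transformations, and the simplified greedy loop, which is not necessarily the exact algorithm from the paper.

```python
import numpy as np

def activated(layer_outputs, threshold=0.2):
    """Flat boolean mask of neurons whose output exceeds the (assumed) threshold."""
    return np.concatenate([np.ravel(a) > threshold for a in layer_outputs])

def neuron_coverage(mask):
    """Ratio of activated neurons to total neurons."""
    return mask.sum() / mask.size

def adjust_brightness(img, delta):
    """Add a constant to every pixel, keeping values in [0, 1]."""
    return np.clip(img + delta, 0.0, 1.0)

def translate_x(img, shift):
    """Shift a 2-D image horizontally by `shift` pixels, padding with zeros."""
    out = np.zeros_like(img)
    if shift > 0:
        out[:, shift:] = img[:, :-shift]
    elif shift < 0:
        out[:, :shift] = img[:, -shift:]
    else:
        out[:] = img
    return out

# Hypothetical stand-in for a real DNN: two fixed random ReLU layers on 8x8 images.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(64, 16))
W2 = rng.normal(size=(16, 8))

def toy_activations(img):
    h1 = np.maximum(img.reshape(-1) @ W1, 0.0)
    h2 = np.maximum(h1 @ W2, 0.0)
    return [h1, h2]

TRANSFORMS = [
    ("brighter", lambda im: adjust_brightness(im, 0.1)),
    ("darker", lambda im: adjust_brightness(im, -0.1)),
    ("shift_right", lambda im: translate_x(im, 1)),
    ("shift_left", lambda im: translate_x(im, -1)),
]

def greedy_search(seed, transforms, activations_fn, threshold=0.2, max_steps=10):
    """Repeatedly apply the transformation that activates the most new neurons."""
    covered = activated(activations_fn(seed), threshold)
    applied, current = [], seed
    for _ in range(max_steps):
        best = None
        for name, fn in transforms:
            candidate = fn(current)
            mask = activated(activations_fn(candidate), threshold)
            gain = np.count_nonzero(mask & ~covered)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, name, candidate, mask)
        if best is None:  # no transformation adds coverage: stop
            break
        _, name, current, mask = best
        covered |= mask
        applied.append(name)
    return applied, covered

seed_image = rng.uniform(0.2, 0.8, size=(8, 8))
applied, covered = greedy_search(seed_image, TRANSFORMS, toy_activations)
print("transformations kept:", applied)
print("cumulative coverage:", neuron_coverage(covered))
```

With a real Keras model, `toy_activations` would be replaced by a function that returns the intermediate layer outputs for an input frame; the rest of the loop would stay the same.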

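Experimental Results

Research Questions

RQ1: Is neuron coverage correlated with the outputs of a self-driving car (steering angle and direction)?
RQ2: Do different realistic image transformations activate different neurons in the DNN?
RQ3: Can combining multiple transformations further increase coverage and reveal more corner cases?
RQ4: Can metamorphic relations serve as an effective test oracle for detecting erroneous behaviors on transformed inputs?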

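Key Findings

Neuron coverage varies across input-output pairs and shows a statistically significant correlation with steering angle and direction across models.
Different image transformations activate different neurons, and transformations generally increase neuron coverage across models.
Combining transformations increases coverage further, which supports neuron-coverage-guided search for synthesizing test inputs.
DeepTest found thousands of erroneous behaviors in the three top-performing Udacity DNN models, including potentially fatal scenarios under realistic conditions such as rain, fog, and blur.
Synthetic test images can be used for retraining to improve DNN robustness. The authors provide public access to the detected erroneous behaviors and plan to release the test images and the DeepTest source code.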
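
The metamorphic oracle behind these error counts can be sketched as follows. This is a minimal illustration under stated assumptions: the relation is taken to be "the steering angle for a transformed image should stay within a fixed tolerance of the angle for the original image", and the tolerance value, both `*_predict` functions, and `brighten` are hypothetical. This summary does not give the exact threshold the paper uses.

```python
import numpy as np

def metamorphic_violations(predict, images, transforms, tolerance):
    """List (image_index, transform_name, original_angle, transformed_angle)
    for every case where a transformation moves the prediction by more than
    `tolerance`. No ground-truth label is needed."""
    violations = []
    for i, img in enumerate(images):
        original = predict(img)
        for name, fn in transforms:
            transformed = predict(fn(img))
            if abs(transformed - original) > tolerance:
                violations.append((i, name, original, transformed))
    return violations

def brighten(img, delta=0.3):
    """Uniform brightness increase, clipped to [0, 1]."""
    return np.clip(img + delta, 0.0, 1.0)

# Hypothetical toy "models" that map an image to a steering angle.
def brittle_predict(img):
    # Depends on overall brightness, so a lighting change alters the angle.
    return float((img.mean() - 0.5) * 2.0)

def robust_predict(img):
    # Depends only on the left/right contrast, so uniform brightening cancels out.
    half = img.shape[1] // 2
    return float(img[:, half:].mean() - img[:, :half].mean())

frame = np.tile(np.linspace(0.2, 0.6, 4), (4, 1))  # synthetic 4x4 test frame
checks = [("brighten", brighten)]
print(metamorphic_violations(brittle_predict, [frame], checks, tolerance=0.1))
print(metamorphic_violations(robust_predict, [frame], checks, tolerance=0.1))
```

The oracle compares the model against itself, which is why it scales to synthetic images that have no manually labeled steering angle.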

This review was generated by AI and checked by a human editor.