QUICK REVIEW

[Paper Review] The Second International Verification of Neural Networks Competition (VNN-COMP 2021): Summary and Results

Stanley Bak, Changliu Liu | arXiv (Cornell University) | Aug 31, 2021

Adversarial Robustness in Machine Learning | References: 51 | Cited by: 43

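One-Sentence Summary

In brief: a comprehensive report on VNN-COMP 2021 that covers the rules, participating tools, benchmarks, results, and lessons learned from a fair, standardized neural network verification competition.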

ABSTRACT

This report summarizes the second International Verification of Neural Networks Competition (VNN-COMP 2021), held as a part of the 4th Workshop on Formal Methods for ML-Enabled Autonomous Systems that was collocated with the 33rd International Conference on Computer-Aided Verification (CAV). Twelve teams participated in this competition. The goal of the competition is to provide an objective comparison of the state-of-the-art methods in neural network verification, in terms of scalability and speed. Along this line, we used standard formats (ONNX for neural networks and VNNLIB for specifications), standard hardware (all tools are run by the organizers on AWS), and tool parameters provided by the tool authors. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this competition.

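Motivation and Objectives

Establish a fair, standardized platform for comparing neural network verification tools.
Evaluate the scalability and speed of state-of-the-art verification methods using the ONNX and VNNLIB formats on AWS hardware.
Provide benchmarks that span diverse architectures and applications, to drive progress in neural network verification.
Summarize lessons learned to guide future iterations of VNN-COMP.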

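Proposed Method

Use standardized inputs (ONNX networks, VNNLIB specifications) and standardized hardware (AWS CPU/GPU instances) for all tools.
Define per-instance and per-benchmark runtime limits, together with an overhead correction procedure, to ensure fair timing.
Implement a scoring scheme based on per-instance correctness, the type of correct result, and a time-based bonus.
Collect and analyze the results of the 12 participating teams/tools on a fixed benchmark set.
Provide a reproducible pipeline and publicly available benchmarks and scripts through GitHub.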
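
The timing and scoring steps above can be illustrated with a minimal sketch. This is not the organizers' scoring script: the `Run` record, the function names, and every numeric default (points per correct or incorrect result, bonus sizes, tie tolerance, and the minimum counted runtime) are assumptions chosen for illustration and are not stated in this summary. The ground truth for each instance is also assumed to be known in advance. The full report should be consulted for the actual values and tie-breaking rules.

```python
from dataclasses import dataclass


@dataclass
class Run:
    tool: str
    result: str       # "holds", "violated", "timeout", or "error"
    wall_time: float  # measured seconds, including tool startup


def corrected_time(run, overhead, floor=1.0):
    """Subtract the tool's measured startup overhead from its wall time.

    Times below `floor` are raised to `floor` (an assumed convention) so that
    tiny differences between very fast runs do not decide the ranking.
    """
    return max(floor, run.wall_time - overhead)


def score_instance(runs, truth, overheads,
                   correct_pts=10, wrong_pts=-100,
                   bonuses=(2, 1), tie_tol=0.2):
    """Score all tools on one instance (all point values are placeholders)."""
    scores = {}
    solved = []  # (corrected time, tool) for tools with a correct answer
    for r in runs:
        if r.result not in ("holds", "violated"):
            scores[r.tool] = 0           # timeout or error: no points
        elif r.result == truth:
            scores[r.tool] = correct_pts
            solved.append((corrected_time(r, overheads[r.tool]), r.tool))
        else:
            scores[r.tool] = wrong_pts   # wrong answer is penalized heavily

    # Time bonus: rank correct tools by corrected time. Tools within
    # `tie_tol` seconds of a group's fastest tool share that group's bonus.
    solved.sort()
    i, rank = 0, 0
    while i < len(solved) and rank < len(bonuses):
        leader_time = solved[i][0]
        j = i
        while j < len(solved) and solved[j][0] - leader_time <= tie_tol:
            scores[solved[j][1]] += bonuses[rank]
            j += 1
        i, rank = j, rank + 1
    return scores


runs = [
    Run("A", "holds", 3.0),
    Run("B", "holds", 3.1),
    Run("C", "holds", 9.0),
    Run("D", "violated", 1.0),
    Run("E", "timeout", 60.0),
]
overheads = {"A": 0.5, "B": 0.5, "C": 0.5, "D": 0.0, "E": 0.0}
scores = score_instance(runs, truth="holds", overheads=overheads)
print(scores)
```

In this sketch, tools A and B finish within the tie tolerance of each other and share the top bonus, C receives the second bonus, D is penalized for a wrong answer, and E scores nothing for timing out. Subtracting a per-tool overhead before ranking is what keeps tools with slow startup (for example, those that load a GPU runtime) from being penalized on easy instances.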

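Experimental Results

Research Questions

RQ1: Under standardized conditions, how do current NN verification tools compare in scalability and speed?
RQ2: Across a diverse benchmark set, what are the strengths and limitations of each tool?
RQ3: What lessons can be drawn to improve future VNN-COMP iterations and NN verification research?
RQ4: How do standard formats and controlled hardware affect the fairness of comparisons between verification methods?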

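Key Findings

Twelve tools participated and were evaluated on a shared AWS infrastructure.
The benchmarks cover a diverse range of networks and tasks, including ACASXu, CIFAR-10 variants, MNIST, and others (see the benchmark list).
The competition established a fair, reproducible process with predefined rules, timeouts, and an overhead correction that accounts for tool startup time.
Overall and per-benchmark results are reported publicly, which supports objective comparison and future replication.
The report documents lessons learned and potential improvements for subsequent VNN-COMP iterations.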
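
To make the role of the standard specification format more concrete, the following is a minimal sketch of how a local robustness property for an image classifier could be written in a VNNLIB-style, SMT-LIB-like syntax. The helper name `robustness_spec`, the input values, and the perturbation radius are hypothetical. The sketch assumes the common convention that inputs are named `X_i`, outputs are named `Y_j`, and the assertions describe the region in which a counterexample would lie. It is not taken from any benchmark in the competition.

```python
def robustness_spec(x, eps, true_label, num_classes, lo=0.0, hi=1.0):
    """Build a VNNLIB-style specification string (illustrative only).

    The property encodes: "some input within eps of x (clamped to [lo, hi])
    makes another class score at least as high as the true class".
    A verifier proving this unsatisfiable has shown local robustness.
    """
    lines = []
    for i in range(len(x)):
        lines.append(f"(declare-const X_{i} Real)")
    for j in range(num_classes):
        lines.append(f"(declare-const Y_{j} Real)")

    # Input constraints: an L-infinity box around x, clamped to the valid range.
    for i, xi in enumerate(x):
        lb = max(lo, xi - eps)
        ub = min(hi, xi + eps)
        lines.append(f"(assert (>= X_{i} {lb}))")
        lines.append(f"(assert (<= X_{i} {ub}))")

    # Output constraint: a disjunction over every competing class.
    others = [j for j in range(num_classes) if j != true_label]
    disjuncts = " ".join(f"(and (>= Y_{j} Y_{true_label}))" for j in others)
    lines.append(f"(assert (or {disjuncts}))")
    return "\n".join(lines)


spec = robustness_spec([0.5, 0.9], eps=0.2, true_label=0, num_classes=3)
print(spec)
```

Because every tool reads the same network file and the same specification text, differences in results can be attributed to the verification method rather than to how each team encoded the problem.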


This review was generated by AI and checked by a human editor.