QUICK REVIEW

[Paper Review] Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond

Kaidi Xu, Zhouxing Shi|arXiv (Cornell University)|Feb 28, 2020

Adversarial Robustness in Machine Learning | References: 53 | Cited by: 92

ONE-SENTENCE SUMMARY

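This paper presents an automated LiRPA framework that generalizes perturbation analysis to arbitrary neural network graphs, enabling certified robustness that is scalable, differentiable, and compatible with loss fusion on large architectures and datasets.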

ABSTRACT

Linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and certified defense. The majority of LiRPA-based methods focus on simple feed-forward networks and need particular manual derivations and implementations when extended to other architectures. In this paper, we develop an automatic framework to enable perturbation analysis on any neural network structures, by generalizing existing LiRPA algorithms such as CROWN to operate on general computational graphs. The flexibility, differentiability and ease of use of our framework allow us to obtain state-of-the-art results on LiRPA based certified defense on fairly complicated networks like DenseNet, ResNeXt and Transformer that are not supported by prior works. Our framework also enables loss fusion, a technique that significantly reduces the computational complexity of LiRPA for certified defense. For the first time, we demonstrate LiRPA based certified defense on Tiny ImageNet and Downscaled ImageNet where previous approaches cannot scale to due to the relatively large number of classes. Our work also yields an open-source library for the community to apply LiRPA to areas beyond certified defense without much LiRPA expertise, e.g., we create a neural network with a provably flat optimization landscape by applying LiRPA to network parameters. Our opensource library is available at https://github.com/KaidiXu/auto_LiRPA.
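
To make the phrase "provable linear bounds" concrete, the relation below sketches what a LiRPA method computes and how such bounds turn into numbers for an Lp-ball. The symbols (\underline{A}, \overline{A}, \underline{a}, x_0, S, q) are notation assumed here for illustration and are not taken from this review; the closed form follows from Hölder's inequality.

```latex
% Linear lower and upper bounds that hold on the whole perturbation set S
\underline{A}\,x + \underline{b} \;\le\; f(x) \;\le\; \overline{A}\,x + \overline{b}
\qquad \forall x \in S, \quad S = \{\, x : \|x - x_0\|_p \le \epsilon \,\}.

% Concretization for one output (\underline{a}^\top, \overline{a}^\top are rows of
% \underline{A}, \overline{A}); q is the dual exponent, 1/p + 1/q = 1
\min_{x \in S} \; \underline{a}^\top x + \underline{b}
  = \underline{a}^\top x_0 + \underline{b} - \epsilon \, \|\underline{a}\|_q ,
\qquad
\max_{x \in S} \; \overline{a}^\top x + \overline{b}
  = \overline{a}^\top x_0 + \overline{b} + \epsilon \, \|\overline{a}\|_q .
```

For the common case p = ∞ the dual norm is the L1 norm, so the bound widens by ε times the sum of absolute coefficients.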

MOTIVATION AND OBJECTIVES

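Develop an automated perturbation analysis framework (LiRPA) that works on general computational graphs, not only feed-forward networks.
Obtain differentiable, scalable bounds suitable for certified robust training of complex architectures.
Introduce loss fusion to reduce the computational cost of LiRPA so that training is feasible on large datasets with many classes.
Demonstrate applicability beyond robustness, including perturbations other than Lp-balls and analysis in parameter space.
Provide an open-source library for applying LiRPA without architecture-specific derivations.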

PROPOSED METHOD

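Generalize LiRPA to general computational graphs through forward and backward bound propagation on a directed acyclic graph (DAG).
Define the forward LiRPA oracle G_i, which computes a node's bounds from the bounds of its input nodes.
Define the backward LiRPA oracle F_i, which propagates output bounds to a node's predecessors and derives the A_i matrices.
Concretize the linear bounds under general perturbation specifications, including Lp-balls and synonym-based word substitution, with dynamic programming.
Introduce loss fusion to compute tight bounds directly on the loss/logit output, reducing the dependence on the number of classes.
Provide an open-source library, auto_LiRPA, for applying LiRPA to diverse models and perturbation settings.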
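
The backward propagation and concretization steps listed above can be illustrated on the smallest interesting case. The following is a minimal CROWN-style sketch in pure Python for a one-hidden-layer ReLU network under an L∞ perturbation. It is not the auto_LiRPA implementation and does not use its API; the function names (`relu_relaxation`, `crown_bounds`, `forward`) and the heuristic for the lower ReLU slope are assumptions made for this example.

```python
def forward(W1, b1, w2, b2, x):
    """Evaluate f(x) = w2 . relu(W1 x + b1) + b2 for a single input x."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bias)
              for row, bias in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def relu_relaxation(l, u):
    """Linear relaxation of relu(z) on [l, u].

    Returns (lower_slope, upper_slope, upper_intercept) such that
    lower_slope * z <= relu(z) <= upper_slope * z + upper_intercept.
    """
    if l >= 0:   # neuron always active: relu(z) = z
        return 1.0, 1.0, 0.0
    if u <= 0:   # neuron always inactive: relu(z) = 0
        return 0.0, 0.0, 0.0
    upper_slope = u / (u - l)                 # chord through (l, 0) and (u, u)
    lower_slope = 1.0 if u >= -l else 0.0     # heuristic choice (assumption)
    return lower_slope, upper_slope, -upper_slope * l


def crown_bounds(W1, b1, w2, b2, x0, eps):
    """Sound bounds on f(x) over the box ||x - x0||_inf <= eps."""
    n_in = len(x0)
    # Pre-activation intervals: exact, because W1 x + b1 is affine in x.
    pre = []
    for row, bias in zip(W1, b1):
        center = sum(w * xi for w, xi in zip(row, x0)) + bias
        radius = eps * sum(abs(w) for w in row)
        pre.append((center - radius, center + radius))

    results = []
    for sign in (1.0, -1.0):   # +1: lower-bound f, -1: lower-bound -f
        a = [0.0] * n_in       # linear coefficients with respect to x
        b = sign * b2          # accumulated constant term
        for row, bias, (l, u), w in zip(W1, b1, pre, w2):
            coef = sign * w
            lo_s, up_s, up_b = relu_relaxation(l, u)
            # A lower bound needs the lower relaxation for non-negative
            # coefficients and the upper relaxation for negative ones.
            slope, intercept = (lo_s, 0.0) if coef >= 0 else (up_s, up_b)
            b += coef * (slope * bias + intercept)
            for i in range(n_in):
                a[i] += coef * slope * row[i]
        # Concretize over the L-infinity ball: subtract eps * ||a||_1.
        center = sum(ai * xi for ai, xi in zip(a, x0)) + b
        radius = eps * sum(abs(ai) for ai in a)
        results.append(sign * (center - radius))
    return results[0], results[1]


if __name__ == "__main__":
    W1 = [[1.0, -1.0], [0.5, 2.0], [-1.5, 0.3]]
    b1 = [0.1, -0.2, 0.0]
    w2 = [1.0, -2.0, 0.7]
    lo, hi = crown_bounds(W1, b1, w2, 0.3, [0.2, -0.1], 0.5)
    print("certified range:", lo, hi)
```

Every operation here is differentiable almost everywhere in the weights, which is what lets bounds of this kind serve as a training objective. A framework on general graphs has to apply the same sign-dependent substitution at every node of the DAG rather than in one hard-coded layer.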

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

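RQ1: Can LiRPA bounds be derived automatically for arbitrary neural network architectures, without manual derivation?
RQ2: How can forward and backward LiRPA propagation be combined to obtain provable bounds for a single output node on a general graph?
RQ3: Does loss fusion reduce the computational burden of LiRPA enough to scale to large datasets with many classes?
RQ4: Can LiRPA-based certified defense be trained effectively on complex architectures (DenseNet, ResNeXt, Transformers) and large datasets (Tiny ImageNet, Downscaled ImageNet)?
RQ5: Can perturbation analysis go beyond conventional Lp-ball inputs to other perturbation types (for example, synonym substitution) and even to parameter-space analysis?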

KEY FINDINGS

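The framework achieves state-of-the-art certified defense results on CIFAR-10 at ε=8/255, with a reported verified error of 66.62%.
It enables LiRPA-based certified defense training on DenseNet, ResNeXt, and Transformer architectures, which were previously unsupported because they required manual derivations.
Loss fusion reduces the cost of LiRPA training to only 3-4 times slower than natural training on CIFAR-10 and Tiny ImageNet, making large label sets (200/1000 classes) tractable.
It delivers the first LiRPA-based certified defense on Tiny ImageNet and Downscaled ImageNet, demonstrating scalability to datasets with hundreds to thousands of classes.
It demonstrates broader applications of LiRPA by performing perturbation analysis on model parameters and by providing a framework for non-Lp perturbations such as synonym-based perturbations in NLP.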
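
The loss fusion finding is easier to appreciate next to the unfused recipe it replaces. The sketch below assumes the common baseline, which is not described in this review: a certified upper bound on the cross-entropy loss is assembled from a separate lower bound on each margin f_y - f_i, so the bounding work grows with the number of classes. Loss fusion, as summarized above, bounds the loss directly instead. The function names are hypothetical and this is not the library's code.

```python
import math


def cross_entropy(logits, y):
    """Standard cross-entropy loss of one example with true label y."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum - logits[y]


def ce_upper_bound_from_margins(margin_lower_bounds):
    """Upper bound on cross-entropy from lower bounds on f_y - f_i, i != y.

    CE = log(1 + sum_{i != y} exp(-(f_y - f_i))) decreases in every margin,
    so plugging in margin lower bounds can only increase the value.
    One margin bound is needed per wrong class, hence the class-count cost.
    """
    exponents = [-m for m in margin_lower_bounds]
    top = max(exponents + [0.0])   # shift for numerical stability
    total = math.exp(-top) + sum(math.exp(e - top) for e in exponents)
    return top + math.log(total)


if __name__ == "__main__":
    logits, y = [2.0, 0.5, -1.0], 0
    exact_margins = [logits[y] - v for i, v in enumerate(logits) if i != y]
    print(cross_entropy(logits, y))
    print(ce_upper_bound_from_margins(exact_margins))           # same value
    print(ce_upper_bound_from_margins([1.0, 2.0]))               # looser bound
```

With 200 or 1000 classes the list of margin bounds has 199 or 999 entries per example, which gives a rough sense of the overhead that bounding the fused loss node avoids.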

This review was generated by AI and reviewed by human editors.