QUICK REVIEW

[Paper Digest] Fast Certified Robust Training via Better Initialization and Shorter Warmup.

Zhouxing Shi, Yihan Wang|arXiv (Cornell University)|Mar 31, 2021

Adversarial Robustness in Machine Learning | Cited by 4

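One-Sentence Summary

The paper proposes a faster certified robust training method that combines a new IBP-specific weight initialization with principled regularizers during the warmup stage, substantially shortening training while lowering verified error. It reaches SOTA verified errors of 65.03% on CIFAR-10 ($\epsilon=8/255$) and 82.13% on TinyImageNet ($\epsilon=1/255$) with only 160 and 80 total training epochs, respectively, outperforming earlier methods that need hundreds or thousands of epochs.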

ABSTRACT

Recently, bound propagation based certified adversarial defense have been proposed for training neural networks with certifiable robustness guarantees. Despite state-of-the-art (SOTA) methods including interval bound propagation (IBP) and CROWN-IBP have per-batch training complexity similar to standard neural network training, to reach SOTA performance they usually need a long warmup schedule with hundreds or thousands epochs and are thus still quite costly for training. In this paper, we discover that the weight initialization adopted by prior works, such as Xavier or orthogonal initialization, which was originally designed for standard network training, results in very loose certified bounds at initialization thus a longer warmup schedule must be used. We also find that IBP based training leads to a significant imbalance in ReLU activation states, which can hamper model performance. Based on our findings, we derive a new IBP initialization as well as principled regularizers during the warmup stage to stabilize certified bounds during initialization and warmup stage, which can significantly reduce the warmup schedule and improve the balance of ReLU activation states. Additionally, we find that batch normalization (BN) is a crucial architectural element to build best-performing networks for certified training, because it helps stabilize bound variance and balance ReLU activation states. With our proposed initialization, regularizers and architectural changes combined, we are able to obtain 65.03% verified error on CIFAR-10 ($\epsilon=\frac{8}{255}$) and 82.13% verified error on TinyImageNet ($\epsilon=\frac{1}{255}$) using very short training schedules (160 and 80 total epochs, respectively), outperforming literature SOTA trained with a few hundreds or thousands epochs.

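Motivation and Objectives

Shorten the long warmup stage that existing certified training methods require, which keeps them costly even though their per-batch complexity is similar to standard training.
Address the very loose certified bounds at initialization caused by standard weight initializations such as Xavier or orthogonal initialization.
Mitigate the imbalance in ReLU activation states that arises in IBP-based training and hampers model performance.
Identify and use batch normalization (BN) as a key architectural element for stabilizing bound variance and improving the balance of ReLU activation states.
Develop a complete training recipe that reaches state-of-the-art certified robustness with a very short training schedule.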

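Proposed Method

Derive a new IBP-specific weight initialization from an analysis of how certified bounds propagate through the network, so that bounds are tight from the start of training.
Add principled regularizers during the warmup stage to stabilize the certified bounds and reduce their variance in early training iterations.
Use batch normalization as a key architectural element to balance ReLU activation states and stabilize the variance of certified bounds across layers.
Shorten the warmup stage and tighten the training schedule, which the improved initialization and regularization make possible, so that training converges in fewer epochs.
Keep interval bound propagation (IBP) as the certified robustness framework and improve its efficiency through better weight initialization and training dynamics.
Combine the architectural and optimization improvements to reach high certified robustness with a very short training schedule.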
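
The link between weight scale and bound tightness described above can be illustrated with a small interval bound propagation sketch in plain Python. It pushes an $\epsilon$-box through a stack of random linear + ReLU layers and compares the mean width of the final bounds under two Gaussian weight scales. This digest does not state the paper's initialization formula, so the `sqrt(2*pi) / fan_in` scale below is an assumption chosen for illustration. All function names are hypothetical, and the toy network (6 layers, 64 units, no biases) is not the paper's architecture.

```python
import math
import random


def ibp_linear(lower, upper, weights, bias):
    """Propagate the box [lower, upper] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps lower to lower; a negative weight swaps them.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def ibp_relu(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def final_bound_width(std_fn, depth=6, width=64, eps=0.01, seed=0):
    """Mean width of the output box for Gaussian weights with std = std_fn(fan_in)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 1.0) for _ in range(width)]
    lo = [v - eps for v in x]
    hi = [v + eps for v in x]
    for _ in range(depth):
        std = std_fn(width)
        weights = [[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]
        lo, hi = ibp_linear(lo, hi, weights, [0.0] * width)
        lo, hi = ibp_relu(lo, hi)
    return sum(u - l for l, u in zip(lo, hi)) / width


# Xavier-style variance 2 / (fan_in + fan_out) equals 1 / n for square layers.
xavier_width = final_bound_width(lambda n: math.sqrt(1.0 / n))
# Assumed illustrative scale that shrinks with fan_in instead of sqrt(fan_in).
small_width = final_bound_width(lambda n: math.sqrt(2.0 * math.pi) / n)

print(f"Xavier-style init, mean bound width: {xavier_width:.4f}")
print(f"1/fan_in-style init, mean bound width: {small_width:.4f}")
```

In this toy setting the interval width after a linear layer grows roughly in proportion to `fan_in * E|w|`, so a standard deviation proportional to `1/sqrt(fan_in)` lets the bounds widen layer by layer, while a scale proportional to `1/fan_in` keeps them at about the same size.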

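Experimental Results

Research Questions

RQ1: Why do standard weight initializations produce very loose certified bounds at initialization in IBP-based training?
RQ2: How does the imbalance in ReLU activation states affect certified robustness in IBP training, and can it be mitigated?
RQ3: Can a purpose-built initialization and regularization strategy substantially shorten the warmup stage without sacrificing performance?
RQ4: What role does batch normalization play in stabilizing certified bounds and improving the balance of ReLU activation states in certified training?
RQ5: Can the improved initialization, regularization, and architectural choices together reach SOTA certified robustness with far fewer training epochs?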

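Key Findings

The proposed IBP-specific initialization substantially tightens the certified bounds at initialization, reducing the need for a long warmup stage.
Principled regularizers applied during warmup stabilize the certified bounds, so the model can be trained effectively in fewer epochs.
Batch normalization proves crucial for stabilizing bound variance and balancing ReLU activation states, and the paper identifies it as a key element of the best-performing certified networks.
With only 160 total training epochs, the method reaches 65.03% verified error on CIFAR-10 ($\epsilon=8/255$), outperforming prior SOTA methods trained for hundreds or thousands of epochs.
On TinyImageNet ($\epsilon=1/255$), the method reaches 82.13% verified error with only 80 total epochs, a marked efficiency gain over existing methods.
The combination of improved initialization, regularization, and batch normalization delivers SOTA certified robustness with a very short training schedule.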
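
The notion of ReLU activation state balance used in these findings can be made concrete with a small sketch. Given lower and upper bounds on each pre-activation, a neuron is always active when its lower bound is non-negative, always inactive when its upper bound is non-positive, and unstable otherwise. The function name and the bound values below are made up for illustration and do not come from the paper.

```python
def relu_state_counts(lower, upper):
    """Count ReLU neurons by state, given bounds [l, u] on each pre-activation."""
    counts = {"active": 0, "inactive": 0, "unstable": 0}
    for l, u in zip(lower, upper):
        if l >= 0:
            counts["active"] += 1      # input passes through for every perturbation
        elif u <= 0:
            counts["inactive"] += 1    # output is zero for every perturbation
        else:
            counts["unstable"] += 1    # state depends on the perturbation
    return counts


# Hypothetical pre-activation bounds for a five-neuron layer.
lower = [-0.5, 0.1, -2.0, -0.3, -1.2]
upper = [-0.1, 0.4, -0.5, 0.2, -0.2]

counts = relu_state_counts(lower, upper)
print(counts)
```

A layer whose counts are dominated by a single state is what the digest refers to as imbalanced. Tracking these counts per layer during warmup is one simple way to observe the effect that the regularizers and batch normalization are meant to correct.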

This digest was generated by AI and reviewed by a human editor.