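[Paper Explained] RobustART: Benchmarking Robustness on Architecture Design and Training Techniques
RobustART provides the first comprehensive benchmark measuring how architecture design and general training techniques affect DNN robustness on ImageNet, covering adversarial, natural, and system noise.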
Deep neural networks (DNNs) are vulnerable to adversarial noises, which motivates the benchmarking of model robustness. Existing benchmarks mainly focus on evaluating defenses, but there are no comprehensive studies of how architecture design and training techniques affect robustness. Comprehensively benchmarking their relationships is beneficial for better understanding and developing robust DNNs. Thus, we propose RobustART, the first comprehensive Robustness investigation benchmark on ImageNet regarding ARchitecture design (49 human-designed off-the-shelf architectures and 1200+ networks from neural architecture search) and Training techniques (10+ techniques, e.g., data augmentation) towards diverse noises (adversarial, natural, and system noises). Extensive experiments substantiated several insights for the first time, e.g., (1) adversarial training is effective for robustness against all noise types for Transformers and MLP-Mixers; (2) given comparable model sizes and aligned training settings, CNNs > Transformers > MLP-Mixers on robustness against natural and system noises; Transformers > MLP-Mixers > CNNs on adversarial robustness; (3) for some light-weight architectures, increasing model sizes or using extra data cannot improve robustness. Our benchmark presents: (1) an open-source platform for comprehensive robustness evaluation; (2) a variety of pre-trained models to facilitate robustness evaluation; and (3) a new view to better understand the mechanism towards designing robust DNNs. We will continuously contribute to this ecosystem for the community.
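Motivation and Goals
- Study how architecture design affects robustness across diverse noise types.
- Evaluate how general training techniques affect robustness, independently of the defense-centric focus of existing benchmarks.
- Provide a large-scale, open-source benchmark that clarifies the robustness mechanisms of different model families.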
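Proposed Method
- Benchmark 1,000+ models, drawn from 49 human-designed architectures and 1,200+ NAS-derived subnets, under aligned training settings.
- Cover 10+ training techniques, spanning data augmentation, adversarial training, and model optimization.
- Evaluate robustness against adversarial noise, natural noise (ImageNet-C/P/A/O), and system noise (ImageNet-S).
- Report metrics such as adversarial robustness (AR), worst-case attack robustness (WCAR), mCE, NmFP, AUPR, and NSD.
- Provide an open-source framework comprising a model zoo, training interfaces, noise generators, and evaluation APIs.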
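To make two of the metrics above concrete, here is a minimal Python sketch. It assumes the standard ImageNet-C definition of mCE: for each corruption type, the model's error summed over severity levels is divided by a baseline model's summed error, and the ratios are averaged over corruption types. The second function is only one plausible reading of a worst-case metric (a sample counts as robust only if it is classified correctly under every attack) and is not necessarily the exact WCAR formula used in the paper. All function names and numbers are hypothetical and are not taken from the RobustART codebase or its results.

```python
from statistics import mean


def corruption_error(model_err, baseline_err):
    """CE for one corruption type: the model's error summed over severities,
    normalized by the baseline's error summed over the same severities."""
    if len(model_err) != len(baseline_err):
        raise ValueError("severity lists must have the same length")
    return sum(model_err) / sum(baseline_err)


def mean_corruption_error(model_errs, baseline_errs):
    """mCE: average CE over all corruption types (the dict keys)."""
    return mean(
        corruption_error(model_errs[c], baseline_errs[c]) for c in model_errs
    )


def worst_case_accuracy(correct_per_attack):
    """Fraction of samples classified correctly under every attack.

    correct_per_attack holds one list of booleans per attack, each with
    one entry per sample. This is an assumed reading of a worst-case
    metric, not the paper's confirmed WCAR definition.
    """
    n = len(correct_per_attack[0])
    survived = [all(attack[i] for attack in correct_per_attack) for i in range(n)]
    return sum(survived) / n


# Hypothetical error rates per severity level (illustration only).
model_errs = {"gaussian_noise": [0.2, 0.3, 0.4], "fog": [0.1, 0.2, 0.3]}
baseline_errs = {"gaussian_noise": [0.4, 0.6, 0.8], "fog": [0.2, 0.4, 0.6]}
mce = mean_corruption_error(model_errs, baseline_errs)

# Two attacks, four samples: only samples 0 and 3 survive both attacks.
wca = worst_case_accuracy([[True, True, False, True], [True, False, False, True]])
```

Normalizing by a baseline keeps hard corruptions from dominating the average, which is why an mCE below 1 reads as "more robust than the baseline" regardless of corruption type.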
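Experimental Results
Research Questions
- RQ1: How do different architecture families (CNNs, Transformers, MLP-Mixers) differ in robustness under different noise types?
- RQ2: How does increasing model size or capacity affect robustness, both within an architecture family and across families?
- RQ3: How do general training techniques affect robustness, independently of specific defense methods?
- RQ4: Are NAS-derived subnets more robust than human-designed architectures under adversarial, natural, and system noise?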
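Key Findings
- Adversarial training improves the robustness of Transformers and MLP-Mixers against all noise types.
- At comparable model sizes and with aligned training settings, CNNs are more robust to natural and system noise, while Transformers are the most adversarially robust.
- Increasing model capacity improves robustness for most architectures, but for some lightweight architectures neither a larger model nor extra data necessarily helps.
- Among NAS-sampled subnets, larger input sizes tend to reduce adversarial robustness, while deeper later stages and larger total kernel sizes can improve it.
- Sw...n Transformers show robustness patterns closer to those of CNNs under some noise types, possibly because of architectural features such as windowed attention and a hierarchical structure.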
This explainer was generated by AI and reviewed by human editors.