QUICK REVIEW

[Paper Digest] Robust Pre-Training by Adversarial Contrastive Learning

Ziyu Jiang, Tianlong Chen | arXiv (Cornell University) | Oct 26, 2020

Adversarial Robustness in Machine Learning | References: 46 | Cited by: 72

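One-Sentence Summary

This paper proposes Adversarial Contrastive Learning (ACL) in three variants and selects Dual-Stream (DS) as the best one. ACL pre-trains representations that are robust to both adversarial perturbations and data augmentations, improving robust accuracy and standard accuracy together in supervised and semi-supervised settings.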

ABSTRACT

Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. In this work, we improve robustness-aware self-supervised pre-training by learning representations that are consistent under both data augmentations and adversarial perturbations. Our approach leverages a recent contrastive learning framework, which learns representations by maximizing feature consistency under differently augmented views. This fits particularly well with the goal of adversarial robustness, as one cause of adversarial fragility is the lack of feature invariance, i.e., small input perturbations can result in undesirable large changes in features or even predicted labels. We explore various options to formulate the contrastive task, and demonstrate that by injecting adversarial perturbations, contrastive pre-training can lead to models that are both label-efficient and robust. We empirically evaluate the proposed Adversarial Contrastive Learning (ACL) and show it can consistently outperform existing methods. For example, on the CIFAR-10 dataset, ACL outperforms the previous state-of-the-art unsupervised robust pre-training approach by 2.99% on robust accuracy and 2.14% on standard accuracy. We further demonstrate that ACL pre-training can improve semi-supervised adversarial training, even when only a few labeled examples are available. Our codes and pre-trained models have been released at: https://github.com/VITA-Group/Adversarial-Contrastive-Learning.

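Motivation and Goals

Advance label-efficient and robust training of vision models by leveraging unlabeled data.
Combine contrastive self-supervision with adversarial training to enforce feature consistency under both data augmentations and perturbations.
Explore several ACL formulations (A2A, A2S, DS) and identify the best-performing setup.
Demonstrate gains in robustness and accuracy on CIFAR-10/100 under fully supervised and semi-supervised fine-tuning.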

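Proposed Method

Builds on SimCLR contrastive learning, which learns representations that are invariant under data augmentations.
Introduces Adversarial Contrastive Learning (ACL) in three variants: Adversarial-to-Adversarial (A2A), Adversarial-to-Standard (A2S), and Dual-Stream (DS).
In A2A, adversarial perturbations are generated for both augmented views, and the contrastive objective maximizes the similarity between the two perturbed views.
In A2S, one view is perturbed while the other stays standard, with dual batch normalization (BN) keeping their statistics separate.
In DS, a standard view pair and an adversarial view pair are combined; the two branches share weights but use separate BN, balancing the standard and robust objectives.
Provides a supervised fine-tuning pipeline and a semi-supervised training pipeline on top of ACL pre-training (with a dedicated loss that mixes cross-entropy, distillation, and a robustness regularizer).
Key training details: pre-training runs for 1000 epochs with PGD-based perturbations (5 steps); fine-tuning uses standard adversarial training (AT).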
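
The contrastive objective behind these variants can be sketched in a few lines. The block below is a minimal, standard-library-only illustration of a SimCLR-style NT-Xent loss and of a DS-style combination of a standard pair and an adversarial pair. The function names (`cosine`, `nt_xent`, `dual_stream_loss`), the temperature of 0.5, and the equal weighting of the two streams are assumptions made for illustration, not details taken from the paper or its released code. The adversarial features are simply passed in as inputs here; generating them with PGD needs automatic differentiation and is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(views_a, views_b, temperature=0.5):
    """NT-Xent loss averaged over all 2N views.

    views_a[k] and views_b[k] are features of two views of the same image
    (the positive pair); every other view in the batch acts as a negative.
    """
    feats = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Similarities to every view except the anchor itself.
        logits = [cosine(feats[i], feats[j]) / temperature
                  for j in range(2 * n) if j != i]
        pos_logit = cosine(feats[i], feats[pos]) / temperature
        # -log(exp(pos_logit) / sum_j exp(logit_j))
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / (2 * n)


def dual_stream_loss(std_a, std_b, adv_a, adv_b, temperature=0.5):
    """Average of the standard-pair and adversarial-pair contrastive losses.

    Assumption of this sketch: both streams are weighted equally.
    """
    return 0.5 * (nt_xent(std_a, std_b, temperature)
                  + nt_xent(adv_a, adv_b, temperature))


# Toy 2-D features: two images, two views each (values are made up).
std_a, std_b = [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]]
adv_a, adv_b = [[0.9, 0.2], [0.2, 0.9]], [[1.0, 0.3], [0.3, 1.0]]
print(dual_stream_loss(std_a, std_b, adv_a, adv_b))
```

In a real pipeline the feature lists would come from an encoder with a projection head, and in the DS setting the standard and adversarial inputs would pass through the same weights but different BN statistics.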

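Experimental Results

Research Questions

RQ1: Can contrastive pre-training be made robust to adversarial perturbations by introducing adversarial views into the contrastive objective?
RQ2: Which ACL variant (A2A, A2S, DS) achieves the best trade-off between standard accuracy and robustness?
RQ3: Does ACL pre-training improve label-efficient adversarial training in semi-supervised settings, including extremely low-label regimes?
RQ4: When both adversarial and standard branches are used, how does the dual-BN configuration affect robustness and standard accuracy?
RQ5: Does adversarial contrastive pre-training yield more linearly separable representations under adversarial evaluation?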

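Key Findings

Compared with prior robust pre-training methods, ACL Dual-Stream (DS) pre-training achieves state-of-the-art standard accuracy (TA) and robust accuracy (RA) on CIFAR-10 and CIFAR-100.
On CIFAR-10, DS reaches 82.19% TA and 52.82% RA, improving on the previous best by 2.14% TA and 2.99% RA.
On CIFAR-100, DS reaches 56.77% TA and 28.33% RA, improving on the previous best by 2.14% TA and 3.58% RA.
In the semi-supervised setting with 10% labels, ACL (DS) improves on the Selfie/UAT++ baselines by 4.50% TA and 0.72% RA, and the gap remains large with 1% labels (for example, DS reaches 75.66 TA and 50.67 RA, while other methods are markedly lower).
ACL pre-training yields higher pseudo-label accuracy (86.73%) than the baselines (e.g., 37.67% or 46.75%), which helps semi-supervised robustness.
Ablations show that DS balances robustness and feature quality better than A2A and A2S, whereas SimCLR (S2S) improves TA but offers limited gains in RA.
Using the adversarial branch of the dual BN (adv) at test time gives better robustness than a single BN, and DS is generally preferred for robust evaluation.
Fine-tuning from ACL (DS) pre-training improves robustness faster than random initialization, reaching higher RA very early in training; the observed robustness dynamics depend on the learning-rate schedule.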
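
The dual-BN idea referred to in these findings (shared weights, but separate normalization statistics for standard and adversarial inputs, with the adversarial statistics chosen for robust evaluation) can be illustrated with a toy example. The class below is a simplified, standard-library-only sketch: `DualBatchNorm1d`, the branch names `"std"` and `"adv"`, and the momentum value are hypothetical choices for illustration, and the learnable scale and shift of a real BN layer are omitted.

```python
import math


class DualBatchNorm1d:
    """Toy 1-D batch norm that keeps separate running statistics per branch."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # One [running_mean, running_var] pair per branch.
        self.stats = {"std": [0.0, 1.0], "adv": [0.0, 1.0]}

    def forward(self, batch, branch, training=True):
        """Normalize a list of floats using the statistics of `branch`."""
        if training:
            # Use batch statistics and update only this branch's running stats.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            running = self.stats[branch]
            running[0] = (1 - self.momentum) * running[0] + self.momentum * mean
            running[1] = (1 - self.momentum) * running[1] + self.momentum * var
        else:
            # At test time, use the stored statistics of the chosen branch.
            mean, var = self.stats[branch]
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


bn = DualBatchNorm1d()
bn.forward([0.0, 1.0, 2.0], branch="std")     # standard inputs
bn.forward([10.0, 11.0, 12.0], branch="adv")  # stand-in for perturbed inputs
# Robust evaluation would route test inputs through the "adv" statistics.
print(bn.forward([11.0], branch="adv", training=False))
```

Because each branch only updates its own running mean and variance, the differing feature statistics of standard and adversarial inputs do not get mixed, which is the motivation for dual BN summarized above.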

This digest was generated by AI and reviewed by a human editor.