QUICK REVIEW

[Paper Review] Content-based Unrestricted Adversarial Attack

Zhaoyu Chen, Bo Li | arXiv (Cornell University) | May 18, 2023

Adversarial Robustness in Machine Learning | Cited by 10

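ONE-SENTENCE SUMMARY

This paper proposes Adversarial Content Attack (ACA), an unrestricted adversarial attack that uses a diffusion model to operate on a low-dimensional manifold of natural images, producing photorealistic adversarial examples with varied content that transfer well across models and defenses.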

ABSTRACT

Unrestricted adversarial attacks typically manipulate the semantic content of an image (e.g., color or texture) to create adversarial examples that are both effective and photorealistic, demonstrating their ability to deceive human perception and deep neural networks with stealth and success. However, current works usually sacrifice unrestricted degrees and subjectively select some image content to guarantee the photorealism of unrestricted adversarial examples, which limits its attack performance. To ensure the photorealism of adversarial examples and boost attack performance, we propose a novel unrestricted attack framework called Content-based Unrestricted Adversarial Attack. By leveraging a low-dimensional manifold that represents natural images, we map the images onto the manifold and optimize them along its adversarial direction. Therefore, within this framework, we implement Adversarial Content Attack based on Stable Diffusion and can generate high transferable unrestricted adversarial examples with various adversarial contents. Extensive experimentation and visualization demonstrate the efficacy of ACA, particularly in surpassing state-of-the-art attacks by an average of 13.3-50.4% and 16.8-48.0% in normally trained models and defense methods, respectively.

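MOTIVATION AND OBJECTIVES

Motivate unrestricted adversarial attacks that allow diverse content modifications while preserving photorealism.
Propose a framework that maps images onto a low-dimensional manifold and optimizes them along the adversarial direction.
Develop Adversarial Content Attack (ACA) on top of Stable Diffusion, using Image Latent Mapping and Adversarial Latent Optimization.
Show that ACA transfers better than prior attacks to normally trained models and to a range of defenses.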

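PROPOSED METHOD

Map real images into the diffusion model's latent space with Image Latent Mapping (ILM).
Reconstruct the latent representation using null-text embeddings and semantic text embeddings while minimizing artifacts.
Optimize the latent representation along the adversarial direction in the diffusion latent space with Adversarial Latent Optimization (ALO).
Define an adversarial objective that maximizes the cross-entropy loss while minimizing the L2 distance to the original image, and approximate the gradient through the denoising process with a skip gradient.
Apply differentiable boundary processing to constrain latent values, and update the latent perturbation iteratively with momentum.
Use classifier-free guidance with a relatively high default guidance weight, together with null-text optimization, to preserve image realism during inversion.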
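
The ALO steps above can be sketched on a toy problem. The block below is a minimal illustration, not the authors' implementation. A small linear softmax classifier (the hypothetical matrix `W`) stands in for the Stable Diffusion decoder plus target model, so there is no denoising process and the skip gradient is not exercised. The L2 term is measured in latent space rather than image space, a plain clamp to a budget `kappa` stands in for the differentiable boundary processing, and the values of `steps`, `alpha`, `mu`, `beta`, and `kappa` are illustrative assumptions rather than the paper's settings.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(z, W):
    """Class predicted by the toy linear classifier (stand-in for decoder + target model)."""
    logits = [sum(w * v for w, v in zip(row, z)) for row in W]
    return max(range(len(logits)), key=lambda i: logits[i])


def objective_and_grad(z, z0, W, label, beta):
    """Return CE(softmax(W z), label) - beta * ||z - z0||^2 and its gradient w.r.t. z."""
    logits = [sum(w * v for w, v in zip(row, z)) for row in W]
    p = softmax(logits)
    ce = -math.log(p[label])
    l2 = sum((a - b) ** 2 for a, b in zip(z, z0))
    # d CE / d logits = p - onehot(label); chain it through the linear map W.
    dlogits = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    grad = [
        sum(dlogits[i] * W[i][j] for i in range(len(W))) - 2.0 * beta * (z[j] - z0[j])
        for j in range(len(z))
    ]
    return ce - beta * l2, grad


def sign(v):
    return (v > 0) - (v < 0)


def latent_attack(z0, W, label, steps=20, alpha=0.05, mu=1.0, beta=0.1, kappa=1.0):
    """Momentum-based iterative ascent on the objective (all defaults are assumed values)."""
    delta = [0.0] * len(z0)     # perturbation added to the latent
    momentum = [0.0] * len(z0)  # accumulated, L1-normalized gradient
    for _ in range(steps):
        z = [a + d for a, d in zip(z0, delta)]
        _, grad = objective_and_grad(z, z0, W, label, beta)
        norm1 = sum(abs(v) for v in grad) or 1.0
        momentum = [mu * m + v / norm1 for m, v in zip(momentum, grad)]
        # Signed step, then clamp to the budget (stand-in for boundary processing).
        delta = [
            max(-kappa, min(kappa, d + alpha * sign(m)))
            for d, m in zip(delta, momentum)
        ]
    return [a + d for a, d in zip(z0, delta)]


if __name__ == "__main__":
    z0 = [1.0, 0.0]               # toy "latent" of a clean image
    W = [[2.0, 0.0], [0.0, 2.0]]  # toy two-class classifier
    z_adv = latent_attack(z0, W, label=0)
    print(predict(z0, W), predict(z_adv, W))
```

With these toy values the perturbed latent flips the prediction from class 0 to class 1 while every coordinate of the perturbation stays within `kappa`. In a real pipeline the gradient would come from automatic differentiation through the decoder and the surrogate classifier instead of the closed-form expression used here.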

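EXPERIMENTAL RESULTS

Research questions

RQ1: Can unrestricted adversarial examples be generated on a well-aligned low-dimensional manifold while preserving photorealism and improving transferability?
RQ2: Does optimizing in the latent space of a diffusion model yield more diverse and more transferable adversarial content than existing unrestricted attacks?
RQ3: How does ACA perform against normally trained models and current adversarial defenses, across both CNNs and ViTs?
RQ4: Do the skip gradient and differentiable boundary processing improve the stability and realism of adversarial optimization in latent space?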

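Main findings

ACA transfers well: averaged over normally trained models, it outperforms state-of-the-art unrestricted attacks by 13.3% to 50.4%.
Against a range of defense methods, ACA's average gain over competing methods is 16.8% to 48.0%.
Experiments on an ImageNet-compatible dataset show that ACA is effective on both CNNs and ViTs across many surrogate-target pairs.
Image quality metrics indicate that ACA preserves or improves perceived image quality relative to the baselines.
The method uses the diffusion model's manifold to synthesize diverse adversarial content (shape, texture, color) while preserving photorealism.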

This review was generated by AI and checked by a human editor.