QUICK REVIEW

[Paper Review] Universal Adversarial Audio Perturbations

Sajjad Abdoli, Luiz G. Hafemann|arXiv (Cornell University)|Aug 8, 2019

Adversarial Robustness in Machine Learning | References: 63 | Cited by: 35

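One-Sentence Summary

The paper demonstrates that universal adversarial perturbations exist for audio and proposes two methods for crafting them: an iterative greedy method and a novel penalty-based method, which achieve high attack success rates across several audio classifiers.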

ABSTRACT

We demonstrate the existence of universal adversarial perturbations, which can fool a family of audio classification architectures, for both targeted and untargeted attack scenarios. We propose two methods for finding such perturbations. The first method is based on an iterative, greedy approach that is well-known in computer vision: it aggregates small perturbations to the input so as to push it to the decision boundary. The second method, which is the main contribution of this work, is a novel penalty formulation, which finds targeted and untargeted universal adversarial perturbations. Differently from the greedy approach, the penalty method minimizes an appropriate objective function on a batch of samples. Therefore, it produces more successful attacks when the number of training samples is limited. Moreover, we provide a proof that the proposed penalty method theoretically converges to a solution that corresponds to universal adversarial perturbations. We also demonstrate that it is possible to provide successful attacks using the penalty method when only one sample from the target dataset is available for the attacker. Experimental results on attacking various 1D CNN architectures have shown attack success rates higher than 85.0% and 83.1% for targeted and untargeted attacks, respectively using the proposed penalty method.

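Motivation and Objectives

Motivate and formalize universal perturbations for audio classification.
Present two methods for crafting universal audio perturbations: a greedy method and a penalty-based method.
Prove the theoretical convergence of the proposed penalty method.
Evaluate attacks on environmental sound classification and speech command recognition across several end-to-end audio architectures.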

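Proposed Method

Formalize the universal perturbation problem for audio inputs and define two constraints: a bound on the perturbation norm and a required fooling rate.
Adapt the iterative greedy method to the audio domain by aggregating minimal perturbations that push samples to the decision boundary.
Introduce a penalty formulation that minimizes an objective function over a batch of samples, using a tanh-space change of variables to enforce the audio box constraint.
Use sound pressure level (SPL, in dB) as the measure of perceived perturbation magnitude, and transform the optimization variable so that perturbed audio stays within the valid [0,1] range.
Define a differentiable objective that combines SPL(v) with a hinge-based penalty G(w,t) to enforce targeted or untargeted misclassification.
Prove that the penalty formulation converges to a solution of the constrained problem (Theorem 1).
Train on mini-batches with the Adam optimizer and compare the two methods across several audio architectures.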
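
The penalty objective described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tanh mapping follows the common 0.5 * (tanh(z) + 1) form, the SPL is computed from RMS amplitude against an illustrative reference `p_ref`, the hinge is a Carlini-Wagner-style margin on logits, and all function names, the weight `c`, and the toy linear classifier are hypothetical. A real attack would minimize this loss with Adam in an autodiff framework; the sketch only evaluates it.

```python
import numpy as np

def to_unit_range(z):
    """Map an unconstrained variable to (0, 1), so any z yields valid audio."""
    return 0.5 * (np.tanh(z) + 1.0)

def to_tanh_space(x, eps=1e-6):
    """Inverse map: audio in [0, 1] to the unconstrained tanh space."""
    x = np.clip(x, eps, 1.0 - eps)
    return np.arctanh(2.0 * x - 1.0)

def spl_db(v, p_ref=1.0, eps=1e-12):
    """Level of perturbation v in dB from its RMS amplitude (assumed definition)."""
    rms = np.sqrt(np.mean(v ** 2))
    return 20.0 * np.log10(rms / p_ref + eps)

def hinge_penalty(logits, label, targeted, kappa=0.0):
    """Margin-based hinge (assumed form): zero once the attack goal is met.

    Targeted: `label` is the target class and should have the top logit.
    Untargeted: `label` is the true class and should lose the top logit.
    """
    other = np.max(np.delete(logits, label))
    margin = other - logits[label] if targeted else logits[label] - other
    return max(float(margin) + kappa, 0.0)

def penalty_objective(w, batch, labels, logits_fn, c=1.0, targeted=False):
    """Mean SPL of the induced perturbation plus c times the mean hinge penalty."""
    total_spl, total_hinge = 0.0, 0.0
    for x, y in zip(batch, labels):
        # One shared variable w is added in tanh space to every sample.
        x_adv = to_unit_range(to_tanh_space(x) + w)
        total_spl += spl_db(x_adv - x)
        total_hinge += hinge_penalty(logits_fn(x_adv), y, targeted)
    n = len(batch)
    return total_spl / n + c * total_hinge / n

# Toy usage with a hypothetical linear classifier on 16-sample "audio" clips.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))
batch = rng.uniform(0.1, 0.9, size=(4, 16))
labels = np.argmax(batch @ W.T, axis=1)
w = 0.01 * rng.normal(size=16)
loss = penalty_objective(w, batch, labels, lambda x: W @ x, c=0.5)
```

Because the perturbation is applied in tanh space, every perturbed sample stays inside the valid range without clipping, which is what lets an unconstrained optimizer such as Adam be used directly.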

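Experimental Results

Research Questions

RQ1: Can a single fixed universal perturbation fool a family of audio classifiers under both targeted and untargeted objectives?
RQ2: How do the greedy and penalty-based methods compare in effectiveness, especially when training samples are limited?
RQ3: Does the penalty formulation converge to a universal perturbation, and does the perturbation transfer across models?
RQ4: How acceptable is the perceptual impact of the perturbation (SPL, SNR) while a high fooling rate is achieved?
RQ5: Is a single sample enough to craft an effective universal perturbation for audio tasks?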

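Key Findings

Universal adversarial perturbations exist in the audio domain and can fool a family of classifiers in both targeted and untargeted settings.
The penalty-based method generally achieves higher attack success rates (ASR) on the target models than the iterative greedy method.
On environmental sound models, the penalty method achieves a higher mean ASR while keeping SNR comparable to the greedy method.
On speech command recognition, the penalty method achieves competitive ASR and in some cases improves perceptual loudness metrics.
With the penalty method, attack success rates exceed 85.0% for targeted attacks and 83.1% for untargeted attacks (on the reported datasets).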
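
For readers who want to reproduce the two metrics quoted above, the following sketch shows one common way to compute them. The definitions are assumptions rather than the paper's exact protocol: SNR is taken as the power ratio of the clean signal to the perturbation in dB, targeted ASR as the fraction of perturbed samples classified as the target class, and untargeted ASR as the fraction of perturbed samples whose prediction differs from the true label. The function names are hypothetical.

```python
import numpy as np

def snr_db(x, v, eps=1e-12):
    """Signal-to-noise ratio in dB of clean signal x against perturbation v."""
    p_signal = np.mean(np.asarray(x) ** 2)
    p_noise = np.mean(np.asarray(v) ** 2)
    return 10.0 * np.log10((p_signal + eps) / (p_noise + eps))

def attack_success_rate(adv_preds, true_labels, target=None):
    """Fraction of successful attacks (assumed definitions, see lead-in).

    If `target` is given, success means the prediction equals the target class;
    otherwise success means the prediction differs from the true label.
    """
    adv_preds = np.asarray(adv_preds)
    true_labels = np.asarray(true_labels)
    if target is not None:
        return float(np.mean(adv_preds == target))
    return float(np.mean(adv_preds != true_labels))

# Toy usage: predictions on four perturbed clips.
adv_preds = [0, 1, 2, 2]
true_labels = [0, 1, 1, 0]
untargeted_asr = attack_success_rate(adv_preds, true_labels)
targeted_asr = attack_success_rate(adv_preds, true_labels, target=2)
clip_snr = snr_db(np.ones(8), np.full(8, 0.1))
```

A higher SNR means a quieter perturbation relative to the clip, so the two metrics pull in opposite directions and are best reported together.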


This review was generated by AI and checked by a human editor.