[Paper Walkthrough] Security analysis and enhancement of model compressed deep learning systems under adversarial attacks
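This paper studies the vulnerability of compressed deep learning models to adversarial attacks by jointly analyzing hash-based model reshaping and input perturbations. It proposes a gradient inhibition defense that reduces the average adversarial attack success rate on MNIST and CIFAR-10 from 87.99% and 86.74% to 4.77% and 4.64%, respectively, with marginal accuracy loss.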
Thanks to recent innovation in machine learning models and advances in computing hardware, state-of-the-art Deep Neural Networks (DNNs) now deliver human-level performance on many complex intelligent tasks in real-world applications. However, this progress also raises growing security concerns for such intelligent systems. For example, emerging adversarial attacks show that even very small and often imperceptible input perturbations can easily mislead the cognitive function of deep learning systems (DLS). Existing DNN adversarial studies are narrowly performed on ideal software-level DNN models and focus on a single uncertainty factor, namely input perturbations. The impact of DNN model reshaping on adversarial attacks, which is introduced by hardware-favorable techniques such as hash-based weight compression during modern DNN hardware implementation, has never been discussed. In this work, we investigate for the first time the multi-factor adversarial attack problem in practical model-optimized deep learning systems by jointly considering DNN model reshaping (e.g. HashNet-based deep compression) and input perturbations. We first augment the adversarial example generation method for compressed DNN models by combining software-based approaches with a mathematical model of DNN reshaping. We then conduct a comprehensive robustness and vulnerability analysis of deep compressed DNN models under the derived adversarial attacks. A defense technique named gradient inhibition is further developed to impede the generation of adversarial examples and thus effectively mitigate adversarial attacks on both software- and hardware-oriented DNNs. Simulation results show that gradient inhibition decreases the average success rate of adversarial attacks from 87.99% to 4.77% on MNIST and from 86.74% to 4.64% on CIFAR-10, with marginal accuracy degradation across various DNNs.
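Motivation and Goals
- Fill the security gap in compressed deep learning systems, where model optimization techniques such as weight compression are often overlooked in adversarial robustness analysis.
- Study how hash-based model reshaping (e.g. HashNet) and input perturbations affect adversarial vulnerability in practical DNN deployments.
- Develop a defense mechanism that is effective under adversarial conditions for both software-level and hardware-optimized DNNs.
- Evaluate robustness across multiple DNN architectures and benchmark datasets (MNIST, CIFAR-10) under realistic compression and attack scenarios.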
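Proposed Method
- Extend existing adversarial example generation methods by incorporating a mathematical model of DNN model reshaping (specifically hash-based compression such as HashNet), so that attacks reflect realistic hardware-optimized models.
- Build a joint attack framework that accounts for both input-level perturbations and the structural changes caused by model compression, giving a more realistic adversarial threat model.
- Propose a gradient inhibition technique that suppresses gradient propagation during adversarial example generation, reducing the model's sensitivity to adversarial inputs.
- Integrate gradient inhibition into both standard and compressed DNNs so that it applies broadly to software and hardware-optimized inference pipelines.
- Use an iterative optimization procedure with gradient masking to limit the effectiveness of gradient-based attacks while preserving model accuracy.
- Validate the method on standard benchmarks (MNIST, CIFAR-10) with multiple DNN architectures at different compression levels.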
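To make the idea of attacking a reshaped model more concrete, here is a minimal sketch in plain Python. It is not the paper's attack generator. It assumes a single linear softmax layer, a toy mixing function named `hash_index` standing in for the real hash, and a one-step FGSM perturbation as the software-based attack. All function names, sizes, and the value of `eps` are hypothetical. The point it illustrates is that the input gradient is taken through the hash-shared (virtual) weights, so the perturbation is tailored to the compressed model rather than to the original dense one.

```python
import math
import random


def hash_index(i, j, n_buckets):
    """Map virtual weight position (i, j) to a shared-parameter bucket.

    A simple deterministic mixing function stands in for the hash here;
    it is an assumption, not the hash used in the paper.
    """
    return ((i * 73856093) ^ (j * 19349663)) % n_buckets


def virtual_weights(shared, n_out, n_in):
    """Expand a short shared-parameter vector into an n_out x n_in matrix."""
    k = len(shared)
    return [[shared[hash_index(i, j, k)] for j in range(n_in)]
            for i in range(n_out)]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, x):
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def cross_entropy(W, x, label):
    return -math.log(predict_proba(W, x)[label])


def input_gradient(W, x, label):
    """Gradient of the cross-entropy loss w.r.t. x: W^T (p - onehot)."""
    p = predict_proba(W, x)
    err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [sum(err[k] * W[k][j] for k in range(len(W)))
            for j in range(len(x))]


def fgsm(W, x, label, eps):
    """One FGSM step: shift each feature by eps along the gradient sign."""
    grad = input_gradient(W, x, label)
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


random.seed(0)
shared = [random.uniform(-1.0, 1.0) for _ in range(8)]  # 8 stored parameters
W = virtual_weights(shared, n_out=3, n_in=16)           # 48 virtual weights
x = [random.random() for _ in range(16)]
x_adv = fgsm(W, x, label=0, eps=0.1)

print("loss before:", cross_entropy(W, x, 0))
print("loss after: ", cross_entropy(W, x_adv, 0))
```

For this linear softmax model the cross-entropy loss is convex in the input, so the FGSM step can never lower the loss. A deep network offers no such guarantee, which is one reason the sketch should be read as an illustration only.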
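Experimental Results
Research Questions
- RQ1: How does hash-based model compression (e.g. HashNet) affect the robustness of deep neural networks to adversarial attacks?
- RQ2: In practical DNN systems, to what extent does combining input perturbations with model reshaping worsen adversarial vulnerability?
- RQ3: Can a defense mechanism such as gradient inhibition effectively reduce the adversarial attack success rate in both uncompressed and compressed DNNs?
- RQ4: What is the trade-off between adversarial robustness and model accuracy when gradient inhibition is applied to compressed DNNs?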
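Main Findings
- The proposed gradient inhibition defense reduces the average adversarial attack success rate on the MNIST benchmark from 87.99% to 4.77%.
- On CIFAR-10, the attack success rate drops from 86.74% to 4.64% after gradient inhibition is applied.
- The defense maintains high accuracy across multiple DNN architectures and compression levels, with only marginal degradation.
- Model reshaping through hash-based compression significantly changes adversarial vulnerability, so conventional robustness analysis is insufficient for hardware-optimized models.
- Considering input perturbations and model compression together yields a more realistic and more severe adversarial threat model for practical DNN deployments.
- Gradient inhibition mitigates adversarial attacks on both software- and hardware-oriented DNNs, showing broad applicability and strong defensive performance.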
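The summary above does not spell out how gradient inhibition is implemented. As a hedged illustration of the general principle only, the sketch below shows one generic way to shrink the gradient signal available to an attacker: scaling up the weights of a linear softmax layer so that the output saturates on a confidently classified input. The helper `scale_weights`, the factor 10, and the toy weights and input are all hypothetical, and this should not be read as the paper's procedure.

```python
import math


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def input_gradient_l1(W, x, label):
    """L1 norm of d(cross-entropy)/dx for a linear softmax classifier."""
    p = softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])
    err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    grad = [sum(err[k] * W[k][j] for k in range(len(W)))
            for j in range(len(x))]
    return sum(abs(g) for g in grad)


def scale_weights(W, c):
    """Multiply every weight by c > 0 (hypothetical suppression step)."""
    return [[c * w for w in row] for row in W]


W = [[1.0, 0.0], [0.0, 1.0]]  # toy two-class layer
x = [2.0, 0.0]                # classified as class 0 with a margin of 2

baseline = input_gradient_l1(W, x, label=0)
suppressed = input_gradient_l1(scale_weights(W, 10.0), x, label=0)

print("gradient L1 norm, original weights:", baseline)
print("gradient L1 norm, scaled weights:  ", suppressed)
```

Scaling by a positive factor leaves the argmax of the logits unchanged, so the predicted class on clean inputs is preserved while the gradient magnitude on confidently classified inputs collapses. Whether this matches the mechanism evaluated in the paper is an assumption.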
This walkthrough was generated by AI and reviewed by human editors.