QUICK REVIEW

[Paper Review] MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses

Lior Sidi, Asaf Nadler|arXiv (Cornell University)|Feb 24, 2019

Adversarial Robustness in Machine Learning|References: 42|Cited by: 25

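ONE-SENTENCE SUMMARY

MaskDGA is a black-box adversarial evasion technique that perturbs the character-level representation of algorithmically generated domain (AGD) names to evade state-of-the-art DGA classifiers without knowledge of the classifier's architecture or parameters. On the DMD-2018 dataset, it lowers the average F1-score of four DGA classifiers from 0.977 to 0.495, demonstrating that it is highly effective and transferable when evading both standard and adversarially defended models.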

ABSTRACT

Domain generation algorithms (DGAs) are commonly used by botnets to generate domain names through which bots can establish a resilient communication channel with their command and control servers. Recent publications presented deep learning, character-level classifiers that are able to detect algorithmically generated domain (AGD) names with high accuracy, and correspondingly, significantly reduce the effectiveness of DGAs for botnet communication. In this paper we present MaskDGA, a practical adversarial learning technique that adds perturbation to the character-level representation of algorithmically generated domain names in order to evade DGA classifiers, without the attacker having any knowledge about the DGA classifier's architecture and parameters. MaskDGA was evaluated using the DMD-2018 dataset of AGD names and four recently published DGA classifiers, in which the average F1-score of the classifiers degrades from 0.977 to 0.495 when applying the evasion technique. An additional evaluation was conducted using the same classifiers but with adversarial defenses implemented: adversarial re-training and distillation. The results of this evaluation show that MaskDGA can be used for improving the robustness of the character-level DGA classifiers against adversarial attacks, but that ideally DGA classifiers should incorporate additional features alongside character-level features that are demonstrated in this study to be vulnerable to adversarial attacks.

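RESEARCH MOTIVATION AND GOALS

Develop a practical black-box evasion technique that bypasses character-level DGA classifiers without knowledge of their internal architecture or parameters.
Assess the robustness of state-of-the-art DGA classifiers to adversarial attacks in realistic deployment scenarios.
Assess how effective adversarial defenses (adversarial re-training and distillation) are against the proposed evasion technique.
Examine whether combining character-level features with contextual features (such as DNS traffic or WHOIS information) improves a classifier's robustness to adversarial attacks.
Explore how well the technique generalizes to character-level classification tasks beyond DGA detection, such as phishing URL detection or malware string evasion.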

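PROPOSED METHOD

Train a substitute model on a publicly available AGD dataset to approximate the behavior of the target DGA classifier.
Use the substitute model to compute, via backpropagation, the gradients with respect to the character-level input representation, and generate adversarial samples from them.
Construct a Jacobian-based saliency map (JSM) to identify the characters in each AGD that most influence misclassification.
Replace exactly half of the characters in each AGD, choosing the positions with the highest gradient values, so that the generated domain names stay unique and do not collide.
Generate adversarial domain names that are misclassified as benign using a single forward pass and a single backpropagation step.
Evaluate the technique against four state-of-the-art DGA classifiers and two adversarial defenses (adversarial re-training and distillation).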
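
The masking step described above can be sketched in a few lines of Python. This is a toy illustration only, not the authors' implementation: the substitute model here is a hypothetical logistic model with one weight per (position, character) pair, whereas the paper uses a trained neural substitute model. The names `make_toy_surrogate`, `malicious_probability`, and `mask_domain`, the alphabet, and the random weights are all assumptions made for this example.

```python
import math
import random

# Assumed alphabet for a second-level domain label (hyphen omitted for simplicity).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def make_toy_surrogate(max_len, seed=0):
    """Hypothetical stand-in for a trained substitute model.

    Returns one random weight per (position, character); a positive weight
    pushes the prediction toward "malicious".
    """
    rng = random.Random(seed)
    return [{c: rng.uniform(-1.0, 1.0) for c in ALPHABET} for _ in range(max_len)]


def malicious_probability(weights, name):
    """Forward pass: logistic score over the one-hot encoded characters."""
    score = sum(weights[i][c] for i, c in enumerate(name))
    return 1.0 / (1.0 + math.exp(-score))


def mask_domain(weights, name):
    """Replace exactly half of the characters, picked by saliency toward benign."""
    p = malicious_probability(weights, name)
    # For this logistic model, d(1 - p) / d(one_hot[i][c]) = -p * (1 - p) * w[i][c].
    slope = p * (1.0 - p)
    candidates = []
    for i, current in enumerate(name):
        # Best replacement at position i: the other character with the largest
        # gradient toward the benign class.
        best = max(
            (c for c in ALPHABET if c != current),
            key=lambda c: -slope * weights[i][c],
        )
        # Saliency of the swap: first-order gain in the benign output.
        gain = slope * (weights[i][current] - weights[i][best])
        candidates.append((gain, i, best))
    # Keep the half of the positions with the highest saliency.
    candidates.sort(reverse=True)
    chars = list(name)
    for _, i, best in candidates[: len(name) // 2]:
        chars[i] = best
    return "".join(chars)
```

Calling `mask_domain(make_toy_surrogate(12), "xkqzjvmwpbtr")` returns a 12-character name in which 6 positions differ from the input. Because the other half of the characters is left untouched, two different input names that differ in a kept position still map to different outputs, which is the intuition behind the uniqueness claim above.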

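EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a black-box adversarial evasion technique effectively bypass character-level DGA classifiers without access to their internal parameters or architecture?
RQ2: How much does MaskDGA degrade the performance of state-of-the-art DGA classifiers on the DMD-2018 dataset?
RQ3: To what extent do adversarial defenses (adversarial re-training and distillation) mitigate the effectiveness of MaskDGA?
RQ4: What limitations do current DGA classifiers show when exposed to gradient-based adversarial attacks on the character-level representation?
RQ5: Can MaskDGA generalize to other character-level classification tasks, such as phishing detection or malware signature evasion?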

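Key Findings

On the DMD-2018 dataset, MaskDGA lowers the average F1-score of four state-of-the-art DGA classifiers from 0.977 to 0.495, a substantial drop in detection performance.
Adversarial re-training improved robustness against the DeepDGA attack but reduced performance on non-adversarial AGD samples, indicating a trade-off in model robustness.
Distillation was ineffective against targeted misclassification, indicating limited resistance to the specific perturbation strategy used by MaskDGA.
Re-training the classifiers on adversarial samples generated by MaskDGA improved average robustness but did not fully defend against other attacks such as DeepDGA.
The study confirms that character-level DGA classifiers are vulnerable to gradient-based adversarial attacks, which underlines the need to combine character-level features with additional features.
By replacing only half of the characters and ensuring no duplicates, the technique preserves domain name uniqueness, which allows large-scale botnet communication to proceed undetected.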
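
To make the F1-scores quoted in the findings above easier to interpret, the short sketch below computes F1 from confusion counts. The function name `f1_score` and the counts in the usage note are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall.

    tp: malicious domains correctly flagged
    fp: benign domains wrongly flagged
    fn: malicious domains that evaded detection
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With toy counts, `f1_score(90, 10, 10)` describes a detector that catches 90 of 100 malicious names, while `f1_score(40, 10, 60)` describes the same detector after 50 more names evade it. Evasion raises the false-negative count, which lowers recall and therefore F1 even when the false-positive count is unchanged.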


This review was generated by AI and reviewed by human editors.