QUICK REVIEW

[Paper Review] Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks

Davide Maiorca, Battista Biggio|arXiv (Cornell University)|Nov 2, 2018

Advanced Malware Detection Techniques | References: 84 | Cited by: 40

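One-Sentence Summary

A survey of PDF malware detectors under adversarial threats, covering a taxonomy of PDF malware, learning-based detectors, attack vectors, and defense directions.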

ABSTRACT

Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code to the victim users, facilitating the use of social engineering techniques to infect their machines. Research showed that machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive taxonomy of the different approaches used to generate PDF malware, and of the corresponding learning-based detection systems. We then categorize threats specifically targeted against learning-based PDF malware detectors, using a well-established framework in the field of adversarial machine learning. This framework allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the impact of such threats. We conclude the paper by discussing how such findings highlight promising research directions towards tackling the more general challenge of designing robust malware detectors in adversarial settings.

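Research Motivation and Objectives

Describe how PDF files serve as a malware delivery vector and why they are challenging for detectors.
Survey state-of-the-art learning-based PDF malware detectors and their typical architectures.
Provide a taxonomy of adversarial attacks against learning-based PDF detectors and analyze their vulnerabilities.
Identify potential defenses and research directions for improving robustness in adversarial settings.
Advocate secure-by-design principles in malware detection systems.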

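Proposed Approach

Provide a comprehensive taxonomy of PDF malware generation approaches and the corresponding learning-based detectors.
Describe the three-component architecture of machine-learning-based detectors: preprocessing, feature extraction, and classifier.
Review third-party and custom preprocessing parsers and their capabilities.
Categorize detector features into three types (structural, JavaScript-based, and raw bytes) and map them to detectors.
Synthesize known adversarial attack strategies against PDF detectors and discuss their practical implementation.
Outline defense mechanisms and future research directions toward adversary-aware, robust malware detectors.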
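
The three-component architecture listed above (preprocessing, feature extraction, classifier) can be illustrated with a minimal sketch. This is not the implementation of any detector surveyed in the paper: the regex-based name scan, the feature list `FEATURES`, and the hand-set `WEIGHTS` and `BIAS` are all assumptions chosen for illustration, and a real detector would learn its weights from labeled data and use a proper PDF parser.

```python
import re
from collections import Counter
from typing import List

# Assumption: a naive scan for PDF name objects such as /JavaScript or /Page.
# It ignores compressed streams and hex-escaped names, which a real parser must handle.
NAME_RE = re.compile(rb"/([A-Za-z][A-Za-z0-9]*)")

# Hypothetical structural features and hand-set weights (not learned, not from the paper).
FEATURES = ["JavaScript", "JS", "OpenAction", "EmbeddedFile", "Page"]
WEIGHTS = {"JavaScript": 2.0, "JS": 2.0, "OpenAction": 1.5, "EmbeddedFile": 1.0, "Page": -0.5}
BIAS = -1.0


def preprocess(raw: bytes) -> List[str]:
    """Stage 1: turn raw bytes into a list of PDF name tokens."""
    return [name.decode("ascii") for name in NAME_RE.findall(raw)]


def extract_features(names: List[str]) -> List[int]:
    """Stage 2: count occurrences of each selected structural keyword."""
    counts = Counter(names)
    return [counts[key] for key in FEATURES]


def classify(vector: List[int]) -> bool:
    """Stage 3: toy linear decision rule; True means 'flag as malicious'."""
    score = BIAS + sum(WEIGHTS[key] * value for key, value in zip(FEATURES, vector))
    return score > 0


def detect(raw: bytes) -> bool:
    """Chain the three stages into one detector call."""
    return classify(extract_features(preprocess(raw)))


suspicious = (b"1 0 obj << /Type /Catalog /OpenAction 2 0 R >> endobj "
              b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj")
plain = b"1 0 obj << /Type /Page /Contents 2 0 R >> endobj"
print(detect(suspicious), detect(plain))
```

Keeping the stages as separate functions mirrors the point made in the review: each stage is a distinct component, so a weakness in any one of them (for example, a name the scan fails to recognize) affects everything downstream.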

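Experimental Results

Research Questions

RQ1: What are the main PDF-based malware techniques used in the wild, and how have detectors evolved to counter them?
RQ2: What vulnerabilities in learning-based PDF detectors make evasion attacks possible?
RQ3: How can detector design be improved to resist adversarial manipulation while preserving detection performance?
RQ4: Which defense strategies show promise in mitigating adversarial attacks against PDF malware detectors?
RQ5: What are the new research directions for robust malware detection in adversarial settings?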

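Key Findings

PDF malware exploits three main channels: JavaScript-based, ActionScript-based, and file embedding. Historically, JavaScript-based attacks have been the most common.
A wide range of detector architectures exists, combining static or dynamic preprocessing, various feature types, and classifiers, but all are built on machine learning.
There is a pronounced arms race between attackers and defenders, with attackers increasingly exploiting parser weaknesses and obfuscation to evade detection.
Adversarial attacks can target different components (the preprocessing parser, the feature extractor, and the classifier) to evade detection without major code changes.
Reliance on third-party parsers is widespread but raises security and robustness concerns; at the same time, no parser covers all PDF elements, which leaves exploitable gaps.
The work emphasizes secure-by-design as a guiding principle for building more robust malware detectors and outlines future defense directions.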
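
To make the finding about classifier-level evasion concrete, the following toy sketch shows how a linear scorer over keyword counts could be pushed below its threshold simply by padding a file with benign-looking objects, leaving the malicious content untouched. The weights, the bias, and the helper `pad_until_evasion` are hypothetical and are not taken from the paper or from any real detector; the sketch operates on feature counts only, not on actual PDF files.

```python
from typing import Dict, Optional

# Hypothetical linear model over structural keyword counts (illustrative values only).
WEIGHTS = {"JavaScript": 2.0, "OpenAction": 1.5, "Page": -0.5, "Font": -0.25}
BIAS = -1.0


def score(counts: Dict[str, int]) -> float:
    """Linear score; a value above 0 means the sample is flagged as malicious."""
    return BIAS + sum(WEIGHTS.get(key, 0.0) * value for key, value in counts.items())


def pad_until_evasion(counts: Dict[str, int], feature: str,
                      max_added: int = 100) -> Optional[int]:
    """Return how many extra `feature` objects bring the score to 0 or below.

    Returns None if the feature has no negative weight or the budget runs out.
    """
    if WEIGHTS.get(feature, 0.0) >= 0:
        return None
    padded = dict(counts)  # work on a copy; the original counts stay unchanged
    for added in range(max_added + 1):
        if score(padded) <= 0:
            return added
        padded[feature] = padded.get(feature, 0) + 1
    return None


malicious = {"JavaScript": 1, "OpenAction": 1}
print(score(malicious))                       # score before padding
print(pad_until_evasion(malicious, "Page"))   # number of added pages needed
```

In this toy setting the malicious features are never removed; only content with negative weight is added. That a model can be this sensitive to cheap additions is one reason the review's call for secure-by-design detectors matters.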

This review was generated by AI and reviewed by human editors.