QUICK REVIEW

[Paper Review] Not all bytes are equal: Neural byte sieve for fuzzing

Mohit Rajpal, William Blum|arXiv (Cornell University)|Nov 10, 2017

Software Testing and Debugging Techniques|References: 13|Cited by: 81

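One-sentence summary

This paper proposes Augmented-AFL, a neural-network-guided fuzzing approach that learns to predict which byte positions in an input are worth mutating, improving code coverage, unique code paths, and crashes on several parsers. It combines neural heatmaps with AFL to focus mutations.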

ABSTRACT

Fuzzing is a popular dynamic program analysis technique used to find vulnerabilities in complex software. Fuzzing involves presenting a target program with crafted malicious input designed to cause crashes, buffer overflows, memory errors, and exceptions. Crafting malicious inputs in an efficient manner is a difficult open problem and often the best approach to generating such inputs is through applying uniform random mutations to pre-existing valid inputs (seed files). We present a learning technique that uses neural networks to learn patterns in the input files from past fuzzing explorations to guide future fuzzing explorations. In particular, the neural models learn a function to predict good (and bad) locations in input files to perform fuzzing mutations based on the past mutations and corresponding code coverage information. We implement several neural models including LSTMs and sequence-to-sequence models that can encode variable length input files. We incorporate our models in the state-of-the-art AFL (American Fuzzy Lop) fuzzer and show significant improvements in terms of code coverage, unique code paths, and crashes for various input formats including ELF, PNG, PDF, and XML.

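Motivation and objectives

Motivate a learning-based approach that uses past fuzzing explorations and code coverage to drive fuzzing mutations.
Develop neural models that predict which input locations are most likely to yield new code coverage.
Integrate neural heatmaps into AFL to veto low-potential mutations and focus fuzzing effort.
Evaluate the benefits and limitations across several input formats (ELF, PNG, PDF, XML).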

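Proposed method

Define a heatmap function f that maps each location in an input file to the probability that mutating it yields input gain.
Train neural models (LSTM, bidirectional LSTM, and Seq2Seq variants with and without attention) to predict a per-byte usefulness heatmap from seed inputs and coverage data.
Represent input bytes as bit sequences to capture bit-level structure, and use RNN-based architectures to handle variable-length inputs.
Augment AFL by querying the neural model before a mutation is run and vetoing, via a threshold, mutations that target locations with low predicted usefulness.
Use a training objective based on the code-coverage difference between a seed and its mutants, which can be approximated by a bitwise strictly-less-than scoring function over coverage bitmaps.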
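
The scoring and veto steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`coverage_gain`, `changed_positions`, `should_execute`), the default threshold of 0.5, and the rule "keep a mutant if any modified byte scores at or above the threshold" are all assumptions made here for illustration, and the heatmap is a hard-coded stand-in for a trained model's output.

```python
from typing import List, Sequence


def coverage_gain(seed_bitmap: bytes, mutant_bitmap: bytes) -> int:
    """Count coverage bits that are 0 for the seed and 1 for the mutant.

    This is one reading of a "bitwise strictly-less-than" score:
    a bit contributes only when seed_bit < mutant_bit.
    """
    if len(seed_bitmap) != len(mutant_bitmap):
        raise ValueError("coverage bitmaps must have the same length")
    return sum(bin(~s & m & 0xFF).count("1")
               for s, m in zip(seed_bitmap, mutant_bitmap))


def changed_positions(seed: bytes, mutant: bytes) -> List[int]:
    """Byte offsets where the mutant differs from the seed.

    Offsets past the end of the shorter input count as changed.
    """
    shared = min(len(seed), len(mutant))
    diffs = [i for i in range(shared) if seed[i] != mutant[i]]
    diffs.extend(range(shared, max(len(seed), len(mutant))))
    return diffs


def should_execute(heatmap: Sequence[float], seed: bytes, mutant: bytes,
                   threshold: float = 0.5) -> bool:
    """Assumed veto rule: run the mutant only if it touches a promising byte."""
    return any(heatmap[i] >= threshold
               for i in changed_positions(seed, mutant)
               if i < len(heatmap))


# Toy stand-in for a model's per-byte heatmap on a 4-byte seed.
heatmap = [0.9, 0.1, 0.05, 0.8]
seed = b"abcd"
print(should_execute(heatmap, seed, b"Xbcd"))  # mutation at offset 0
print(should_execute(heatmap, seed, b"abXd"))  # mutation at offset 2
print(coverage_gain(b"\x01", b"\x03"))         # newly set coverage bits
```

In a real fuzzing loop the veto would sit between mutation and execution, so vetoed mutants never reach the target program; the cost of producing the heatmap is what must stay small relative to one execution.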

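Experimental results

Research questions

RQ1: Can neural models predict mutation locations that lead to new code coverage and crashes?
RQ2: How do different neural architectures (LSTM, bidirectional LSTM, Seq2Seq, Seq2Seq+Attention) perform at predicting useful mutation locations across formats?
RQ3: Compared with stock AFL, how does integrating neural heatmaps affect code coverage, the number of unique code paths, and crashes?
RQ4: Which formats (ELF, PNG, PDF, XML) benefit most from neural-guided fuzzing, and what limitations are observed?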

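Key findings

Compared with baseline AFL, Augmented-AFL generally improves code coverage and unique code paths on the ELF, PNG, and XML parsers.
Augmented-AFL produces more crashes than baseline AFL on the ELF and XML parsers.
Simple, lightweight neural models often outperform more complex architectures in terms of code-coverage improvement.
Some formats, such as PDF, benefit little because of the overhead of querying the model on large inputs, indicating a trade-off between model latency and fuzzing throughput.
In several cases across the tested targets, neural guidance consistently yields higher input gain (previously unseen behavior) than AFL.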

This review was generated by AI and reviewed by human editors.