QUICK REVIEW

[Paper Review] An Empirical Analysis of the Influence of Fault Space on Search-Based Automated Program Repair

Ming Wen, Junjie Chen|arXiv (Cornell University)|Jul 17, 2017

Software Testing and Debugging Techniques | References: 49 | Cited by: 26

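One-Sentence Summary

This paper empirically studies how fault-space accuracy influences search-based automated program repair (APR) and finds that a more accurate fault space improves both the effectiveness and the efficiency of APR tools such as GenProg. The study identifies negative mutation coverage as the most indicative measure for estimating the efficiency of search-based APR, and confirms that automatically generated test cases can improve fault-space accuracy and APR performance.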

ABSTRACT

Automated program repair (APR) has attracted great research attention, and various techniques have been proposed. Search-based APR is one of the most important categories among these techniques. Existing research focuses on the design of effective mutation operators and search algorithms to better find the correct patch. Despite these efforts, the effectiveness of these techniques is still limited by the search space explosion problem. One of the key factors contributing to this problem is the quality of fault spaces, as reported by existing studies. This motivates us to study the importance of the fault space to the success of finding a correct patch. Our empirical study aims to answer three questions. Does the fault space significantly correlate with the performance of search-based APR? If so, are there any indicative measurements to approximate the accuracy of the fault space before applying expensive APR techniques? Are there any automatic methods that can improve the accuracy of the fault space? We observe that the accuracy of the fault space affects the effectiveness and efficiency of search-based APR techniques, e.g., the failure rate of GenProg could be as high as 60% when the real fix location is ranked lower than 10, even though the correct patch is in the search space. Besides, GenProg is able to find more correct patches with fewer trials when given a fault space with higher accuracy. We also find that the negative mutation coverage, which is designed in this study to measure the capability of a test suite to kill the mutants created on the statements executed by failing tests, is the most indicative measurement to estimate the efficiency of search-based APR. Finally, we confirm that automatically generated test cases can help improve the accuracy of fault spaces, and further improve the performance of search-based APR techniques.

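Research Motivation and Objectives

Investigate the correlation between fault-space accuracy and the performance of search-based APR techniques.
Identify measurable indicators that can predict the efficiency of search-based APR before expensive repair attempts are made.
Evaluate whether automated test case generation can improve fault-space accuracy and thereby strengthen search-based APR performance.
Provide empirical evidence on how fault-space quality affects the search space explosion problem and the success rate of patch discovery.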

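Proposed Method

The study uses GenProg as the search-based APR engine and evaluates the fault spaces produced by different test suites.
Fault spaces are generated by fault localization techniques; the ranked list of suspicious code elements serves as the search space for mutation-based patch generation.
Negative mutation coverage is introduced as a new measurement of a test suite's ability to kill the mutants created on the statements executed by failing tests.
Automated test case generation, with tools such as EvoSuite, is used to augment test suites and improve fault-space quality.
An empirical evaluation compares fault spaces of differing accuracy in terms of patch discovery success rate and the number of search trials.
Statistical analysis is used to relate fault-space accuracy, test suite coverage, and APR performance metrics.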
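
To make the notion of a fault space and its accuracy concrete, the following is a minimal sketch of how a ranked list of suspicious statements could be built from test coverage, and how the rank of the real fix location could be read off it. The Ochiai formula is used here only as an assumed example of a spectrum-based fault localization technique; this review does not state which formula the study used, and the function names and data layout are hypothetical.

```python
import math


def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness (assumed formula, not confirmed by the source).

    ef: number of failing tests that execute the statement
    ep: number of passing tests that execute the statement
    total_failed: total number of failing tests
    """
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom > 0 else 0.0


def build_fault_space(coverage, failing):
    """Return (statement, score) pairs sorted from most to least suspicious.

    coverage: dict mapping test name -> set of executed statement ids
    failing: set of names of failing tests
    """
    statements = set().union(*coverage.values())
    total_failed = len(failing)
    scores = {}
    for stmt in statements:
        ef = sum(1 for t, cov in coverage.items() if t in failing and stmt in cov)
        ep = sum(1 for t, cov in coverage.items() if t not in failing and stmt in cov)
        scores[stmt] = ochiai(ef, ep, total_failed)
    # Ties are broken by statement id so that the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def rank_of(fault_space, stmt):
    """1-based rank of a statement in the fault space, or None if absent."""
    for rank, (candidate, _) in enumerate(fault_space, start=1):
        if candidate == stmt:
            return rank
    return None


if __name__ == "__main__":
    # Hypothetical coverage data: t3 is the only failing test.
    coverage = {"t1": {"s1", "s2"}, "t2": {"s2", "s3"}, "t3": {"s3"}}
    fault_space = build_fault_space(coverage, failing={"t3"})
    print(fault_space)
    print("rank of real fix location s3:", rank_of(fault_space, "s3"))
```

In this sketch, a lower rank for the real fix location corresponds to a more accurate fault space, which is the property the study relates to GenProg's success rate and number of trials.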

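Experimental Results

Research Questions

RQ1: Does the accuracy of the fault space significantly correlate with the performance of search-based APR?
RQ2: Are there indicative measurements that approximate the accuracy of the fault space before expensive APR techniques are applied?
RQ3: Can automatically generated test cases improve the accuracy of the fault space and strengthen the performance of search-based APR?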

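Key Findings

When the real fix location is ranked lower than 10 in the fault space, GenProg's failure rate can be as high as 60%, even though the correct patch is in the search space.
Given a more accurate fault space, GenProg finds more correct patches with fewer search trials, indicating a direct relationship between fault-space quality and APR efficiency.
Negative mutation coverage is the most indicative measurement for estimating the efficiency of search-based APR, and is reported to be more indicative than other coverage-based measurements.
Automatically generated test cases can help improve fault-space accuracy, which in turn improves the performance of search-based APR techniques.
Test suite adequacy, in particular the ability to kill mutants on the paths executed by failing tests, is strongly correlated with the success of search-based APR.
The study confirms that fault-space quality is a key factor in mitigating the search space explosion problem in search-based APR.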
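
The abstract describes negative mutation coverage as the capability of a test suite to kill the mutants created on the statements executed by failing tests. One plausible reading, sketched below, is the fraction of such mutants that the suite kills. The exact formula is an assumption made for illustration, and the function name and input layout are hypothetical rather than taken from the paper.

```python
def negative_mutation_coverage(mutants, failing_coverage):
    """Fraction of mutants on failing-test statements that the suite kills.

    mutants: iterable of (statement_id, killed) pairs, where killed is True
             if at least one test in the suite detects the mutant
    failing_coverage: set of statement ids executed by the failing tests

    Assumed definition: killed mutants / all mutants, counting only mutants
    located on statements in failing_coverage.
    """
    relevant = [killed for stmt, killed in mutants if stmt in failing_coverage]
    if not relevant:
        return 0.0  # no mutants on failing-test statements
    return sum(1 for killed in relevant if killed) / len(relevant)


if __name__ == "__main__":
    # Hypothetical data: s9 is not executed by any failing test, so its
    # mutant is ignored by the measurement.
    mutants = [("s1", True), ("s1", False), ("s2", True), ("s9", True)]
    print(negative_mutation_coverage(mutants, failing_coverage={"s1", "s2"}))
```

Under this reading, the measurement can be computed from mutation testing results alone, which fits the goal of estimating APR efficiency before running an expensive repair attempt.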

This review was generated by AI and reviewed by a human editor.