QUICK REVIEW

[Paper Review] Automatic Code Summarization: A Systematic Literature Review

Yuxiang Zhu, Minxue Pan | arXiv (Cornell University) | Sep 10, 2019

Natural Language Processing Techniques | References: 16 | Cited by: 23

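One-Sentence Summary

This systematic literature review analyzes 41 studies on automatic code summarization, examining their data extraction techniques, code description generation methods, evaluation methods, and relevant artifacts. It gives an overview of state-of-the-art approaches, identifies research gaps, and proposes future research directions for program comprehension and comment generation in software engineering.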

ABSTRACT

Background: During software maintenance and development, the comprehension of program code is key to success. High-quality comments can help us better understand programs, but they're often missing or outmoded in today's programs. Automatic code summarization is proposed to solve these problems. During the last decade, huge progress has been made in this field, but there is a lack of an up-to-date survey.

Aims: We studied publications concerning code summarization in the field of program comprehension to investigate state-of-the-art approaches. By reading and analyzing relevant articles, we aim at obtaining a comprehensive understanding of the current status of automatic code summarization.

Method: In this paper, we performed a systematic literature review over the automatic source code summarization field. Furthermore, we synthesized the obtained data and investigated different approaches.

Results: We successfully collected and analyzed 41 selected studies from the different research communities. We exhaustively investigated and described the data extraction techniques, description generation methods, evaluation methods and relevant artifacts of those works.

Conclusions: Our systematic review provides an overview of the state of the art, and we also discuss further research directions. By fully elaborating current approaches in the field, our work sheds light on future research directions of program comprehension and comment generation.

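Research Motivation and Objectives

Survey recent research progress in automatic code summarization for program comprehension.
Identify the code comprehension challenges caused by missing or outdated comments in software systems.
Synthesize the findings of 41 studies from different research communities in software engineering.
Assess the data extraction techniques, description generation methods, and evaluation strategies used in code summarization research.
Propose future research directions based on the identified research gaps and trends.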

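Proposed Method

Conduct a systematic literature review across academic databases and code repositories using predefined search criteria.
Select 41 relevant studies using inclusion and exclusion criteria related to code summarization and program comprehension.
Classify the studies by research area, data source, model architecture, and evaluation metrics.
Analyze data extraction techniques, including abstract syntax tree (AST)-based, embedding-based, and natural language processing approaches.
Synthesize findings on code description generation methods, such as sequence-to-sequence models, attention mechanisms, and pretrained models.
Assess the quality and consistency of evaluation protocols, including BLEU, ROUGE, and human evaluation.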
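
To make the AST-based data extraction step concrete, here is a minimal sketch that uses Python's standard `ast` module to collect (function name, summary) pairs from source code, the kind of code/comment pairing that summarization datasets are typically built from. It is an illustration under stated assumptions, not the procedure of any reviewed study: the `SOURCE` snippet and the `extract_pairs` helper are hypothetical, and treating the first docstring line as the summary is one possible convention.

```python
import ast

# Hypothetical input: one documented and one undocumented function.
SOURCE = '''
def add(a, b):
    """Return the sum of a and b.

    Longer explanation that a summarizer would normally ignore.
    """
    return a + b

def undocumented(x):
    return x * 2
'''

def extract_pairs(source):
    """Return (function name, first docstring line) for documented functions."""
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)  # None when no docstring exists
            if doc:
                # Assumption: the first docstring line serves as the summary.
                pairs.append((node.name, doc.splitlines()[0]))
    return pairs

pairs = extract_pairs(SOURCE)
print(pairs)
```

Functions without a docstring are skipped, which mirrors the missing-comment problem the review describes: undocumented code contributes no training pair.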

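Experimental Results

Research Questions

RQ1: What are the predominant data sources and preprocessing techniques in automatic code summarization research?
RQ2: How do different model architectures perform at code summarization across programming languages?
RQ3: Which evaluation metrics are most commonly used, and how well do they correlate with human judgments of summary quality?
RQ4: What are the main challenges and limitations facing current code summarization approaches?
RQ5: Based on a synthesis of the existing literature, what are the directions for future research?

Main Findings

The review identifies 41 studies, which represent the significant progress made in automatic code summarization over the past decade.
Sequence-to-sequence models with attention mechanisms are the most widely adopted architecture for code summarization.
Pretrained models such as CodeBERT and GraphCodeBERT perform strongly on code summarization benchmarks.
BLEU and ROUGE remain the most frequently used automatic evaluation metrics, but their correlation with human judgment is limited.
Human evaluation is essential for assessing the semantic quality of generated summaries, yet it remains underused.
The lack of standardized datasets and evaluation protocols hinders reproducibility and cross-study comparison.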
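
The point about n-gram metrics and human judgment can be illustrated with a simplified, sentence-level BLEU-style score. The sketch below is not the official BLEU implementation and not taken from the review: `simple_bleu` is a hypothetical helper that uses only unigrams and bigrams, a single reference, and no smoothing, and the example sentences are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total

def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of n-gram precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)

# Invented example: one near-copy and one paraphrase of the reference.
reference = "returns the sum of two numbers".split()
near_copy = "returns the sum of two values".split()
paraphrase = "adds two numbers together".split()

print(round(simple_bleu(near_copy, reference), 3))
print(round(simple_bleu(paraphrase, reference), 3))
```

Under these assumptions the paraphrase scores far lower than the near-copy, although a human reader would likely accept both as summaries of the same function. That gap is one reason to pair automatic metrics with human evaluation.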


This review was generated by AI and reviewed by a human editor.