QUICK REVIEW

[Paper Review] Towards an Understanding of Large Language Models in Software Engineering Tasks

Zibin Zheng, Kaiwen Ning|arXiv (Cornell University)|Aug 22, 2023
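Topic Modeling | Cited by 10
One-Sentence Summary

This paper is the first systematic survey of how large language models (LLMs) are applied in software engineering, classifying the work into seven task types and assessing where LLMs perform well. It analyzes 123 studies drawn from six databases to map trends and effectiveness.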

ABSTRACT

Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in text generation and reasoning tasks. Derivative products, like ChatGPT, have been extensively deployed and highly sought after. Meanwhile, the evaluation and optimization of LLMs in software engineering tasks, such as code generation, have become a research focus. However, there is still a lack of systematic research on applying and evaluating LLMs in software engineering. Therefore, this paper comprehensively investigates and collates the research and products combining LLMs with software engineering, aiming to answer two questions: (1) What are the current integrations of LLMs with software engineering? (2) Can LLMs effectively handle software engineering tasks? To find the answers, we have collected related literature as extensively as possible from seven mainstream databases and selected 123 timely papers published starting from 2022 for analysis. We have categorized these papers in detail and reviewed the current research status of LLMs from the perspective of seven major software engineering tasks, hoping this will help researchers better grasp the research trends and address the issues when applying LLMs. Meanwhile, we have also organized and presented papers with evaluation content to reveal the performance and effectiveness of LLMs in various software engineering tasks, guiding researchers and developers to optimize.

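Research Motivation and Objectives

  • Survey the current landscape of work that integrates LLMs with software engineering tasks.
  • Classify existing work into seven types of software engineering task.
  • Assess whether LLMs improve performance on software engineering tasks, and why.
  • Give researchers guidance for handling the challenges of applying LLMs to software engineering.

Proposed Method

  • Literature search across six databases: ACM DL, IEEE Xplore, dblp, Elsevier Science Direct, Google Scholar, and arXiv.
  • Closed card sorting to separate relevant papers from irrelevant ones.
  • Exclusion of non-English studies, theses, keynote papers, work unrelated to LLMs, work unrelated to software engineering, duplicates, and studies published before 2022.
  • Data analysis by reading the selected papers to answer two research questions, one on applications and one on performance.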
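The exclusion step above can be sketched as a simple filter. This is only an illustration of the listed criteria, not the authors' actual tooling: the `Paper` record, its field names, the `screen` function, and the sample entries are all hypothetical, and in the survey itself relevance was judged by card sorting rather than by boolean flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one search hit (fields are assumptions)."""
    title: str
    year: int
    language: str   # e.g. "en"
    kind: str       # e.g. "article", "thesis", "keynote"
    about_llm: bool
    about_se: bool


def screen(papers):
    """Apply the exclusion criteria from the summary; return the kept papers."""
    seen_titles = set()
    kept = []
    for p in papers:
        key = p.title.strip().lower()
        if key in seen_titles:           # duplicate of an earlier hit
            continue
        seen_titles.add(key)
        if p.language != "en":           # non-English study
            continue
        if p.kind in {"thesis", "keynote"}:
            continue
        if not (p.about_llm and p.about_se):
            continue
        if p.year < 2022:                # published before 2022
            continue
        kept.append(p)
    return kept


# Made-up sample entries, used only to exercise the filter.
candidates = [
    Paper("LLM-based code generation", 2023, "en", "article", True, True),
    Paper("llm-based code generation ", 2023, "en", "article", True, True),
    Paper("A thesis on LLM testing", 2023, "en", "thesis", True, True),
    Paper("Neural program repair", 2021, "en", "article", True, True),
    Paper("LLMs for protein folding", 2023, "en", "article", True, False),
]
print([p.title for p in screen(candidates)])
```

Deduplicating on a normalized title is a simplification; a real screening pass would more likely match on DOI or arXiv identifier where one is available.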

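Experimental Results

Research Questions

  • RQ1: What research currently focuses on combining LLMs with software engineering?
  • RQ2: Can LLMs really help complete current software engineering tasks better?

Key Findings

  • LLMs show strengths in syntax-related tasks such as code summarization and code repair.
  • They remain unsatisfactory on semantics-heavy tasks such as code generation and vulnerability detection, although successive model iterations have brought progress.
  • A total of 123 relevant papers were identified and classified into seven software engineering tasks.
  • Code generation is the most-studied category (24 papers), while code translation is the least studied (3 papers).
  • The paper provides a structured view of the current state, applications, and evaluation content to guide optimization.
  • Abilities that emerge with scale (such as in-context learning and instruction following) are discussed as factors affecting LLM performance on SE tasks.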


This review was generated by AI and checked by a human editor.