QUICK REVIEW

[Paper Review] InCoder: A Generative Model for Code Infilling and Synthesis

Daniel Fried, Armen Aghajanyan|arXiv (Cornell University)|Apr 12, 2022

Software Engineering Research | Cited by 140

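One-Sentence Summary

InCoder is a unified 6.7B-parameter Transformer trained with a causal masking objective so that a single model handles both code synthesis and code infilling. It uses bidirectional context to fill in arbitrary code regions zero-shot while retaining standard left-to-right generation.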

ABSTRACT

Code is seldom written in a single left-to-right pass and is instead repeatedly edited and refined. We introduce InCoder, a unified generative model that can perform program synthesis (via left-to-right generation) as well as editing (via infilling). InCoder is trained to generate code files from a large corpus of permissively licensed code, where regions of code have been randomly masked and moved to the end of each file, allowing code infilling with bidirectional context. Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming. We find that the ability to condition on bidirectional context substantially improves performance on these tasks, while still performing comparably on standard program synthesis benchmarks in comparison to left-to-right only models pretrained at similar scale. The InCoder models and code are publicly released. https://sites.google.com/view/incoder-code-models

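Motivation and Goals

Build a single unified model that can perform both program synthesis and code editing (infilling).
Develop a training objective that lets the model use bidirectional context when infilling an arbitrary code span.
Evaluate zero-shot infilling on diverse tasks (type inference, docstring generation, variable renaming) and compare it with left-to-right generation.
Show that bidirectional infilling improves task performance without degrading left-to-right synthesis.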

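Proposed Method

Use a causal masking objective: randomly mask spans of code, move them to the end of the sequence, and train the model to predict the complete resulting sequence.
Train InCoder-6.7B on a large corpus of permissively licensed code in 28 languages plus StackOverflow content, with Python as the primary focus.
At inference time, support both left-to-right generation and infilling: insert a mask at the desired position and generate the replacement conditioned on the left and right context.
Evaluate zero-shot infilling on type inference, comment generation, and variable renaming, comparing causal-masked infilling against left-to-right generation and reranking baselines.
Compare against masked-language-model approaches and large left-to-right models to show the benefit of infilling without sacrificing synthesis ability.
Provide analysis, including ablations over model size, data, and objective, and report the main results on infilling benchmarks.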
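
The masking and infilling steps above can be sketched in a few lines of Python. This is a minimal character-level illustration, not the released implementation: the sentinel strings `<Mask:k>` and `<EOM>`, and the helper names `causal_mask`, `unmask`, `build_infill_prompt`, and `splice_infill`, are assumptions made for this example. The real model works on tokens, samples its spans randomly, and may format its special tokens and prompts differently.

```python
# Placeholder sentinels (assumption): the released tokenizer's special tokens may differ.
EOM = "<EOM>"


def mask_token(k: int) -> str:
    return f"<Mask:{k}>"


def causal_mask(doc: str, spans: list[tuple[int, int]]) -> str:
    """Replace each (start, end) span with a sentinel and move the removed
    text to the end of the sequence, tagged with the same sentinel."""
    kept, moved, cursor = [], [], 0
    for k, (start, end) in enumerate(sorted(spans)):
        if start < cursor or end > len(doc) or start >= end:
            raise ValueError("spans must be non-empty, in range, and non-overlapping")
        kept.append(doc[cursor:start] + mask_token(k))
        moved.append(mask_token(k) + doc[start:end] + EOM)
        cursor = end
    kept.append(doc[cursor:])
    return "".join(kept) + "".join(moved)


def unmask(sequence: str) -> str:
    """Invert causal_mask, assuming the document contains no sentinel strings."""
    first = sequence.index(mask_token(0))
    tail_start = sequence.index(mask_token(0), first + 1)
    context, tail = sequence[:tail_start], sequence[tail_start:]
    for k, chunk in enumerate(tail.split(EOM)[:-1]):
        sentinel = mask_token(k)
        context = context.replace(sentinel, chunk[len(sentinel):], 1)
    return context


def build_infill_prompt(left: str, right: str) -> str:
    """Inference-time prompt for one gap: the model continues after the
    trailing sentinel and is expected to stop at EOM."""
    return left + mask_token(0) + right + mask_token(0)


def splice_infill(left: str, right: str, generated: str) -> str:
    """Keep the generated text up to the first EOM and place it in the gap."""
    return left + generated.split(EOM)[0] + right


doc = "def add(a, b):\n    return a + b\n"
start = doc.index("a + b")
print(causal_mask(doc, [(start, start + 5)]))
```

With a single span, the inference prompt built by `build_infill_prompt` is a prefix of the training sequence produced by `causal_mask`, which is why an ordinary left-to-right decoder can fill the gap while still seeing the code on both sides of it.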

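Experimental Results

Research Questions

RQ1: Can causal masking enable zero-shot infilling of arbitrary code regions conditioned on bidirectional context?
RQ2: Does bidirectional context during infilling outperform left-to-right baselines on tasks such as type inference, docstring generation, and variable renaming?
RQ3: Does the infilling objective hurt standard left-to-right code synthesis, or at least leave it without significant degradation?
RQ4: How do model size and the training data mix affect infilling and synthesis ability?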

Key Findings

Method	Pass rate (%)	Exact match (%)
L-R single	48.2	38.7
L-R reranking	54.9	44.1
CM infilling	69.0	56.3
PLBART	41.6	—
code-cushman-001	53.1	42.0
code-davinci-001	63.0	56.0
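
The gap between causal-masked (CM) infilling and the left-to-right (L-R) baselines can be read off the table with simple subtraction. The sketch below does that arithmetic, assuming both columns are percentages; the dictionary and the helper name `gain_over` are introduced here only for illustration.

```python
# Figures copied from the table above: (pass rate, exact match).
RESULTS = {
    "L-R single": (48.2, 38.7),
    "L-R reranking": (54.9, 44.1),
    "CM infilling": (69.0, 56.3),
}


def gain_over(baseline: str, method: str = "CM infilling") -> tuple[float, ...]:
    """Percentage-point gain of `method` over `baseline` on both metrics."""
    base, new = RESULTS[baseline], RESULTS[method]
    return tuple(round(n - b, 1) for n, b in zip(new, base))


for name in ("L-R single", "L-R reranking"):
    print(name, gain_over(name))
```

On these figures, CM infilling leads the single-sample L-R baseline by 20.8 points in pass rate and 17.6 points in exact match, and leads the reranking baseline by 14.1 and 12.2 points.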

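Zero-shot causal-masked infilling significantly outperforms the left-to-right baselines on infilling tasks, both single-line and multi-line.
Bidirectional context performs strongly on type inference and return-type prediction, where causal-masked infilling yields substantial gains over the baselines.
Docstring generation in the zero-shot setting approaches the performance of supervised, fine-tuned models.
InCoder remains competitive with left-to-right models on standard synthesis benchmarks despite being trained with the infilling objective.
Training on multilingual and StackOverflow data improves performance relative to Python-only data, although the multilingual data may slightly lower Python-specific results.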
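
Return-type prediction, mentioned in the findings above, can be framed as an infilling problem by treating the missing annotation as the gap between a left and a right context. The sketch below shows one way to build that split. The helper `return_type_infill_context` is hypothetical, and its plain string search is a simplification that assumes a one-line signature with no existing return annotation; a sturdier version would use Python's `ast` or `tokenize` modules.

```python
def return_type_infill_context(source: str) -> tuple[str, str]:
    """Split a function so that the return annotation is the region to infill.

    Naive assumption: the first "):" in the source closes the signature.
    """
    header_end = source.index("):")
    left = source[: header_end + 1] + " -> "  # text before the gap
    right = source[header_end + 1:]           # ":" plus the function body
    return left, right


source = "def count(xs):\n    return len(xs)\n"
left, right = return_type_infill_context(source)

# An infilling model would generate the text between `left` and `right`.
# "int" stands in for a model prediction in this example.
print(left + "int" + right)
```

Because the function body sits in the right context, the model can condition on how the value is computed and returned, which a purely left-to-right prompt ending at the signature cannot see.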


This review was generated by AI and checked by a human editor.