QUICK REVIEW

[Paper Review] To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

Yitong Zhang, Chengze Li|arXiv (Cornell University)|Mar 16, 2026

Software Engineering Research | Cited by 0

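One-Sentence Summary

PriCoder automatically synthesizes high-quality training data through a graph-based generation process to teach large language models to invoke private-library APIs, substantially improving private-library-oriented code generation while having minimal impact on general coding tasks.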

ABSTRACT

Large Language Models (LLMs) have shown strong potential for code generation, yet they remain limited in private-library-oriented code generation, where the goal is to generate code using APIs from private libraries. Existing approaches mainly rely on retrieving private-library API documentation and injecting relevant knowledge into the context at inference time. However, our study shows that this is insufficient: even given accurate required knowledge, LLMs still struggle to invoke private-library APIs effectively. To address this limitation, we propose PriCoder, an approach that teaches LLMs to invoke private-library APIs through automatically synthesized data. Specifically, PriCoder models private-library data synthesis as the construction of a graph, and alternates between two graph operators: (1) Progressive Graph Evolution, which improves data diversity by progressively synthesizing more diverse training samples from basic ones, and (2) Multidimensional Graph Pruning, which improves data quality through a rigorous filtering pipeline. To support rigorous evaluation, we construct two new benchmarks based on recently released libraries that are unfamiliar to the tested models. Experiments on three mainstream LLMs show that PriCoder substantially improves private-library-oriented code generation, yielding gains of over 20% in pass@1 in many settings, while causing negligible impact on general code generation capability. Our code and benchmarks are publicly available at https://github.com/eniacode/PriCoder.

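Motivation and Objectives

Show that supplying API knowledge alone is not enough for LLMs to invoke private-library APIs effectively in code generation
Introduce PriCoder, a graph-based data synthesis framework for training LLMs to invoke private libraries
Demonstrate that PriCoder improves private-library-oriented code generation on unseen libraries with almost no impact on general coding tasks
Provide rigorous benchmarks, built on libraries unfamiliar to the evaluated models, for assessing private-library capability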

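Proposed Method

Model private-library data synthesis as the construction of a graph containing API nodes and sample nodes
Expand the graph with Progressive Graph Evolution to generate diverse training samples
Remove low-quality samples with Multidimensional Graph Pruning, which applies syntactic, execution, and functional checks
Fine-tune the LLM on the synthesized dataset with a maximum-likelihood training objective
Optionally combine PriCoder with retrieval-augmented generation by adding retrieved API knowledge to the prompt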
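The graph construction, evolution, and pruning steps listed above can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the class and function names (SampleNode, SynthesisGraph, evolve, prune) are hypothetical, Python built-ins stand in for private-library APIs, and simple string concatenation stands in for the LLM-driven synthesis the paper describes. Only the syntactic and execution checks are shown; the functional check is left out because the summary gives no details about it.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class SampleNode:
    """A synthesized training sample linked to the API nodes it exercises."""
    code: str
    apis: frozenset
    depth: int = 0  # 0 = basic sample; higher = evolved from earlier samples


@dataclass
class SynthesisGraph:
    """Toy graph holding API nodes (names) and sample nodes."""
    api_nodes: set = field(default_factory=set)
    sample_nodes: list = field(default_factory=list)

    def add_sample(self, sample):
        self.api_nodes |= sample.apis
        self.sample_nodes.append(sample)


def evolve(graph, combine):
    """Hypothetical evolution step: merge pairs of existing samples into
    new samples that cover the union of their APIs."""
    current = list(graph.sample_nodes)  # snapshot, so new nodes are not re-paired
    for i, a in enumerate(current):
        for b in current[i + 1:]:
            if a.apis != b.apis:
                merged = SampleNode(
                    code=combine(a.code, b.code),
                    apis=a.apis | b.apis,
                    depth=max(a.depth, b.depth) + 1,
                )
                graph.add_sample(merged)


def passes_syntax(sample):
    """Syntactic check: the sample must parse as Python."""
    try:
        ast.parse(sample.code)
        return True
    except SyntaxError:
        return False


def passes_execution(sample):
    """Execution check: the sample must run without raising.
    A real pipeline would sandbox this instead of calling exec directly."""
    try:
        exec(compile(sample.code, "<sample>", "exec"), {})
        return True
    except Exception:
        return False


def prune(graph, checks):
    """Keep only the sample nodes that pass every check."""
    graph.sample_nodes = [
        s for s in graph.sample_nodes if all(check(s) for check in checks)
    ]


graph = SynthesisGraph()
graph.add_sample(SampleNode("x = abs(-3)", frozenset({"abs"})))
graph.add_sample(SampleNode("y = max(1, 2)", frozenset({"max"})))
graph.add_sample(SampleNode("z = min(1,", frozenset({"min"})))  # broken on purpose
evolve(graph, lambda a, b: a + "\n" + b)
prune(graph, [passes_syntax, passes_execution])
print(len(graph.sample_nodes))  # 3: two basic samples and one evolved sample survive
```

Alternating evolve and prune in a loop would grow the graph while discarding broken samples at each round, which is the general shape the method list suggests; the actual operators in PriCoder may differ substantially from this sketch.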

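Experimental Results

Research Questions

RQ1: How does PriCoder perform on private-library-oriented code generation compared with baselines?
RQ2: Does PriCoder degrade general code generation capability on public benchmarks?
RQ3: What does each PriCoder component (Progressive Graph Evolution and Multidimensional Graph Pruning) contribute?
RQ4: How do data synthesis scale and model choice affect PriCoder's effectiveness?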

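Key Findings

PriCoder substantially improves private-library-oriented code generation, with pass@1 gains of over 20% in many settings across three mainstream LLMs.
On the new benchmarks NdonnxEval and NumbaEval, pass@k gains exceed 30% in many settings.
These improvements come with almost no impact on general code generation capability, as evidenced on public benchmarks.
Ablation studies confirm that both Progressive Graph Evolution and Multidimensional Graph Pruning are essential to effectiveness.
The two new private-library benchmarks, NdonnxEval and NumbaEval, show that models without PriCoder still struggle with API invocation, underscoring the value of the approach.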
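For readers unfamiliar with the pass@1 and pass@k figures quoted above, the sketch below computes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of generations sampled per problem and c is the number that pass the tests. That the paper uses exactly this estimator is an assumption; the summary does not say how pass@k was computed, and the function name pass_at_k is illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k samples is correct,
    given that c of n generated samples passed the tests."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


# With 10 samples per problem and 3 passing, pass@1 is about 0.3.
print(round(pass_at_k(10, 3, 1), 4))
```

A benchmark-level score is then the mean of this value over all problems, so a 20-point pass@1 gain means roughly one in five additional first attempts succeeds.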

This review was generated by AI and reviewed by human editors.