QUICK REVIEW

[Paper Review] Retrieval-Augmented Generation for Code Summarization via Hybrid GNN

Shangqing Liu, Yu Chen|arXiv (Cornell University)|Jun 9, 2020

Natural Language Processing Techniques | References: 41 | Cited by: 68

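One-Sentence Summary

The paper proposes a Retrieval-Augmented Hybrid Graph Neural Network (HGNN) for code summarization. It combines retrieval augmentation with a hybrid GNN that fuses static and dynamic graphs to generate natural language summaries of C code, and it achieves state-of-the-art results on the new CCSD benchmark.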

ABSTRACT

Source code summarization aims to generate natural language summaries from structured code snippets for better understanding code functionalities. However, automatic code summarization is challenging due to the complexity of the source code and the language gap between the source code and natural language summaries. Most previous approaches either rely on retrieval-based (which can take advantage of similar examples seen from the retrieval database, but have low generalization performance) or generation-based methods (which have better generalization performance, but cannot take advantage of similar examples). This paper proposes a novel retrieval-augmented mechanism to combine the benefits of both worlds. Furthermore, to mitigate the limitation of Graph Neural Networks (GNNs) on capturing global graph structure information of source code, we propose a novel attention-based dynamic graph to complement the static graph representation of the source code, and design a hybrid message passing GNN for capturing both the local and global structural information. To evaluate the proposed approach, we release a new challenging benchmark, crawled from diversified large-scale open-source C projects (total 95k+ unique functions in the dataset). Our method achieves the state-of-the-art performance, improving existing methods by 1.42, 2.44 and 1.29 in terms of BLEU-4, ROUGE-L and METEOR.

Research Motivation and Objectives

Motivate automatic code summarization, which is hard because of the complexity of source code and the language gap between code and natural language summaries.
Develop a retrieval-augmented generation framework that leverages similar code and summaries from a database.
Introduce a Hybrid GNN that couples a static code-property graph with a dynamically constructed global-attention graph for global information flow.
Release a large C code summarization benchmark (CCSD) and demonstrate state-of-the-art performance.
Provide ablations and human evaluation to validate the contributions and robustness.

Proposed Method

Construct a Code Property Graph (CPG) from the AST with multiple edge types, and encode nodes with BiLSTM-based representations.
Introduce a retrieval-based augmentation to inject retrieved code semantics via attention between current and retrieved CPGs, then merge augmented and original node representations.
Compute a structure-aware dynamic graph A_dyn to enable global attention-based message passing between any node pair.
Perform Hybrid Message Passing (HMP) that fuses static (augmented) and dynamic graph information using a gated fusion mechanism and GRU updates, followed by graph-level max-pooling to obtain a graph representation.
Decode summaries with an attention-based LSTM that attends over the final graph representation and the retrieved summary features, trained with cross-entropy loss and scheduled sampling.
Evaluate against retrieval-based, sequence-based, and graph-based baselines on the CCSD dataset of 95k+ C function-summary pairs, including in-domain and out-of-domain splits.
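
The dynamic graph and hybrid message passing steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the attention scores are plain scaled dot products (the source describes a structure-aware variant whose details are not reproduced here), the gate is a single scalar per node, and a simple average stands in for the GRU update. All function names and the toy weights are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_adjacency(h):
    """Dense attention graph: scaled dot-product scores between all node pairs."""
    scale = math.sqrt(len(h[0]))
    keys_t = [list(col) for col in zip(*h)]
    return [softmax([s / scale for s in row]) for row in matmul(h, keys_t)]

def row_normalize(adj):
    """Turn a 0/1 static adjacency matrix into row-wise averaging weights."""
    return [[x / sum(row) if sum(row) else 0.0 for x in row] for row in adj]

def gated_fuse(a, b, w_gate):
    """Per-node gate z = sigmoid(w . [a; b]); returns z * a + (1 - z) * b."""
    fused = []
    for va, vb in zip(a, b):
        z = 1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(w_gate, va + vb))))
        fused.append([z * x + (1.0 - z) * y for x, y in zip(va, vb)])
    return fused

def hybrid_step(h, a_static, w_gate):
    """One hop: aggregate over both graphs, fuse, then update node states."""
    static_msg = matmul(row_normalize(a_static), h)
    dynamic_msg = matmul(dynamic_adjacency(h), h)
    fused = gated_fuse(static_msg, dynamic_msg, w_gate)
    # Assumption: a plain average stands in for the GRU update named in the text.
    return [[0.5 * (old + new) for old, new in zip(ho, fo)] for ho, fo in zip(h, fused)]

def max_pool(h):
    """Graph-level readout: element-wise max over all nodes."""
    return [max(col) for col in zip(*h)]

# Toy graph: 3 nodes with 2-dimensional states, chain edges 0-1-2 plus self-loops.
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_static = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
w_gate = [0.1, -0.2, 0.3, 0.05]  # hypothetical values; these would be learned

h1 = hybrid_step(h0, a_static, w_gate)
graph_vector = max_pool(h1)
```

The static adjacency only connects neighbours in the code graph, while each row of the dynamic adjacency is a softmax over all nodes, so a single hop can already move information between distant nodes. That is the local-plus-global split the method description refers to.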

Experimental Results

Research Questions

RQ1: Can a retrieval-augmented framework improve code summarization by leveraging similar existing code and summaries?
RQ2: Does combining a static, retrieval-augmented code graph with a dynamic global-attention graph capture both local and global code semantics for better summaries?
RQ3: What is the contribution of code-based vs. summary-based augmentation to summary quality?
RQ4: Does the proposed HGNN generalize across in-domain and out-of-domain code, especially in C language?
RQ5: How does HGNN compare to state-of-the-art baselines in automatic and human evaluations?

Key Findings

Method	BLEU-4 (In-domain)	ROUGE-L (In-domain)	METEOR (In-domain)	BLEU-4 (Out-of-domain)	ROUGE-L (Out-of-domain)	METEOR (Out-of-domain)	BLEU-4 (Overall)	ROUGE-L (Overall)	METEOR (Overall)
HGNN w/o augmentation	12.33	29.99	13.78	5.45	22.07	12.32	10.26	27.17	12.32
HGNN w/o static	15.93	33.67	15.67	7.72	24.69	10.63	13.44	30.47	13.98
HGNN w/o dynamic	15.77	33.84	15.67	7.64	24.72	10.73	13.31	30.59	14.01
HGNN w/o augmentation & static	11.75	29.59	13.86	5.57	22.14	9.41	9.98	26.94	12.05
HGNN w/o augmentation & dynamic	11.85	29.51	13.54	5.45	21.89	9.59	9.93	26.80	12.21
HGNN	16.72	34.29	16.25	7.85	24.74	11.05	14.01	30.89	14.50
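
One way to read the ablation table is to compute how many points each variant loses relative to the full HGNN on the Overall columns. The snippet below copies those numbers from the table; the variant labels are English renderings of the table rows, and the variable and function names are illustrative.

```python
# Overall (BLEU-4, ROUGE-L, METEOR) scores copied from the ablation table.
overall = {
    "HGNN": (14.01, 30.89, 14.50),
    "w/o augmentation": (10.26, 27.17, 12.32),
    "w/o static": (13.44, 30.47, 13.98),
    "w/o dynamic": (13.31, 30.59, 14.01),
    "w/o augmentation & static": (9.98, 26.94, 12.05),
    "w/o augmentation & dynamic": (9.93, 26.80, 12.21),
}

def drops_vs_full(scores, full="HGNN"):
    """Return how many points each variant loses relative to the full model."""
    base = scores[full]
    return {
        name: tuple(round(b - v, 2) for b, v in zip(base, vals))
        for name, vals in scores.items()
        if name != full
    }

drops = drops_vs_full(overall)
for name, (bleu, rouge, meteor) in drops.items():
    print(f"{name}: BLEU-4 -{bleu}, ROUGE-L -{rouge}, METEOR -{meteor}")
```

On these numbers, removing augmentation costs 3.75 BLEU-4 points overall, while removing either graph alone costs between 0.57 and 0.70, which is why augmentation reads as the largest single contributor in this table.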

HGNN outperforms baselines on BLEU-4, ROUGE-L, and METEOR across in-domain and out-of-domain data (overall gains 1.42, 2.44, 1.29 over Rencos for BLEU-4, ROUGE-L, METEOR).
Retrieval augmentation contributes to performance, with additional gains when combined with static and dynamic graphs.
Static and dynamic graphs both contribute, and the two ablations are close: overall, removing the static graph costs slightly more ROUGE-L and METEOR (30.47 and 13.98 vs. 30.59 and 14.01), while removing the dynamic graph costs slightly more BLEU-4 (13.31 vs. 13.44).
Summary-based augmentation yields larger gains than code-based augmentation, and combining both yields the best overall results.
Human evaluation shows HGNN achieving higher relevance and similarity scores compared to NNGen, Transformer, Rencos, and SeqGNN.
A new CCSD benchmark (C language) is released for code summarization.
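
For readers unfamiliar with the metrics reported above, ROUGE-L scores a generated summary by its longest common subsequence (LCS) with the reference. The sketch below is a simplified sentence-level version that assumes whitespace tokenization and a balanced F1 score; the evaluation scripts behind the reported numbers may tokenize and weight precision and recall differently. The function names and example sentences are made up for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise carry the best neighbour.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate, reference):
    """Balanced F1 of LCS-based precision and recall (assumed weighting)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge_l_f1("returns the sum of two numbers",
                   "return the sum of two integers")
```

Here the LCS is "the sum of two" (4 of 6 tokens on each side), so precision and recall are both 4/6 and the score is about 0.667.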


This review was generated by AI and checked by human editors.