QUICK REVIEW

[Paper Review] LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward

Nafis Tanveer Islam, Joseph Khoury|arXiv (Cornell University)|Jan 7, 2024

Software Engineering Research | Citations: 5

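One-Line Summary

SecRepair uses CodeGen2-7B with reinforcement learning and a semantic reward to automatically identify, repair, and explain vulnerabilities and to generate concise code comments. The authors also provide InstructVul, an instruction-based vulnerability dataset.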

ABSTRACT

In software development, the predominant emphasis on functionality often supersedes security concerns, a trend gaining momentum with AI-driven automation tools like GitHub Copilot. These tools significantly improve developers' efficiency in functional code development. Nevertheless, it remains a notable concern that such tools are also responsible for creating insecure code, predominantly because of pre-training on publicly available repositories with vulnerable code. Moreover, developers are called the "weakest link in the chain" since they have very minimal knowledge of code security. Although existing solutions provide a reasonable solution to vulnerable code, they must adequately describe and educate the developers on code security to ensure that the security issues are not repeated. Therefore we introduce a multipurpose code vulnerability analysis system SecRepair, powered by a large language model, CodeGen2, assisting the developer in identifying and generating fixed code along with a complete description of the vulnerability with a code comment. Our innovative methodology uses a reinforcement learning paradigm to generate code comments augmented by a semantic reward mechanism. Inspired by how humans fix code issues, we propose an instruction-based dataset suitable for vulnerability analysis with LLMs. We further identify zero-day and N-day vulnerabilities in 6 Open Source IoT Operating Systems on GitHub. Our findings underscore that incorporating reinforcement learning coupled with semantic reward augments our model's performance, thereby fortifying its capacity to address code vulnerabilities with improved efficacy.

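Motivation and Objectives

Promote secure code repair, beyond functionality alone, in AI-assisted development tools.
Develop an end-to-end system that identifies, repairs, and explains vulnerabilities in C/C++ code.
Create an instruction-based vulnerability dataset (InstructVul) tailored to security concerns.
Enable generation of concise code comments that can also serve as commit messages.
Demonstrate zero-day and N-day vulnerability analysis on real open-source IoT operating systems.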

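Proposed Method

Use a CodeGen2-based LLM fine-tuned for security analysis to identify, repair, and explain vulnerabilities.
Train on instruction-based data, including InstructVul, that covers vulnerability identification, repair, explanation, and code comment generation.
Modify the encoder-decoder architecture by removing the encoder so that longer code sequences fit, and train on input and output as a single language-model-style sequence.
Fine-tune vulnerability explanation (code-to-text) with a causal decoder, which preserves token order and context awareness.
Apply reinforcement learning with PPO and a semantics-aware reward (based on BERTScore) to optimize code comments that are concise yet preserve meaning.
Evaluate with BLEU, Rouge-L, and human evaluation. Use cross-entropy for vulnerability detection, with the aim of stabilizing repair quality.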
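
The semantics-aware reward described above can be illustrated with a minimal sketch. BERTScore greedily matches each token of one text to its most similar token in the other (by cosine similarity of embeddings) and combines the resulting precision and recall into an F1. The sketch below reproduces only that matching step; `toy_embed` is a hypothetical stand-in for contextual BERT embeddings, and `semantic_reward` is an illustrative name, not the authors' implementation.

```python
import math
import zlib

DIM = 64  # size of the toy embedding space (assumption for this sketch)


def toy_embed(token: str) -> list[float]:
    """Hypothetical stand-in for a BERT embedding: hashes the token's
    character trigrams into a fixed-size unit vector."""
    padded = f"##{token.lower()}##"
    vec = [0.0] * DIM
    for i in range(len(padded) - 2):
        vec[zlib.crc32(padded[i:i + 3].encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    # Inputs are unit vectors, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def semantic_reward(candidate: str, reference: str) -> float:
    """BERTScore-style F1: greedy best-match similarity in both directions."""
    cand = [toy_embed(t) for t in candidate.split()]
    ref = [toy_embed(t) for t in reference.split()]
    if not cand or not ref:
        return 0.0
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a PPO setup, a scalar like this would be computed for each generated comment against its reference and passed to the policy update as the reward. The details of how SecRepair wires this into PPO are not given in this review.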

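Experimental Results

Research Questions

RQ1: Can the system automatically identify vulnerabilities and repair the code accurately?
RQ2: Can the system provide developers with a comprehensive explanation of the vulnerability?
RQ3: Can the system optimize and summarize the explanation into a concise code comment?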

Key Findings

Model	Parameters	BLEU	Rouge-L	F1	Precision	Recall	Accuracy
Devign	<1M	0.56	0.55	0.56	0.55	0.55	0.56
VELVET	<1M	0.62	0.61	0.59	0.61	0.59	0.68
PFGCN	110M	0.64	0.64	0.61	0.64	0.61	0.62
CodeT5	770M	0.68	0.62	0.59	0.62	0.59	0.68
CodeGen2	1B	0.72	0.70	0.68	0.70	0.68	0.79
CodeGen2	3.7B	0.75	0.77	0.73	0.77	0.73	0.85
SecRepair	7B	0.82	0.80	0.70	0.80	0.70	0.88
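
For readers unfamiliar with the Rouge-L column, the metric is built on the longest common subsequence (LCS) between a generated text and its reference. The following is a minimal sketch assuming whitespace tokenization and the F1 form of the score; the exact tokenizer and Rouge-L variant used in the paper are not stated in this review, and the function names are illustrative.

```python
def lcs_length(a, b) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Rouge-L as the harmonic mean of LCS-based precision and recall."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_l_f1("check bounds before copy", "check buffer bounds before memcpy")` shares the subsequence "check bounds before", which gives precision 3/4, recall 3/5, and an F1 of about 0.67.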

On the vulnerability identification and repair task, SecRepair (7B) achieves F1 0.70, Precision 0.80, Recall 0.70, and Accuracy 0.88 (Table 1).
On the InstructVul dataset, SecRepair (7B) achieves BLEU 0.82 and Rouge-L 0.80 on the repair task, ahead of every baseline in Table 1, including CodeGen2 3.7B (BLEU 0.75, Rouge-L 0.77).
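For developer-facing vulnerability explanations, SecRepair (7B) achieves BLEU 0.76, Rouge-L 0.98, and a Human score of 5.
Reinforcement learning with the semantic reward improves code comment generation over plain fine-tuning (SecRepair 7B: BLEU 0.60, Rouge-L 0.72, Human 5).
The ablation study shows that temperature and beam size both affect performance: a temperature near 0.5 works best, and larger beam sizes yield gains at a higher inference cost.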
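
To make the temperature finding concrete, here is a small sketch of how temperature rescales next-token probabilities during decoding. It is a generic illustration of temperature-scaled softmax, not the paper's decoding code; the logits in the usage note are made-up values.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With the made-up logits `[2.0, 1.0, 0.0]`, a temperature of 1.0 gives the top token a probability of about 0.67, while 0.5 raises it to about 0.87. Lower temperatures therefore make generation more deterministic, which is one plausible reason a mid-range value can balance precision against diversity in generated fixes and comments.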

This review was generated by AI and checked by a human editor.