QUICK REVIEW

[Paper Review] Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Manish Bhatt, Sahana Chennabasappa|arXiv (Cornell University)|Dec 7, 2023

Artificial Intelligence in Healthcare and Education|Cited by 15

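One-line summary

CyberSecEval is a comprehensive benchmark that evaluates insecure code generation across eight programming languages and compliance with prompts requesting help with cyberattacks. It includes a case study of seven models from the Llama 2, Code Llama, and OpenAI GPT families.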

ABSTRACT

This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. Through a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CyberSecEval effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, highlighting the critical need for integrating security considerations in the development of sophisticated LLMs. CyberSecEval, with its automated test case generation and evaluation pipeline covers a broad scope and equips LLM designers and researchers with a tool to broadly measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.

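Motivation and objectives

Motivate and measure the cybersecurity risks of LLMs used as coding assistants.
Develop an automated test suite that detects insecure coding practices across multiple languages.
Evaluate how readily LLMs comply when asked to assist in cyberattacks, and identify safety weaknesses.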

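Proposed method

Develop the Insecure Code Detector (ICD), with 189 static analysis rules covering 50 CWEs across eight languages.
Automatically generate test prompts from insecure code, targeting both autocomplete and instruction contexts.
Build cyberattack-helpfulness tests from hand-written prompts, with Llama-70b-chat used as an aid in judging whether a response is maliciously helpful.
Evaluate LLM outputs with a judge LLM to detect insecure code and cyberattack helpfulness, and compute precision and recall.
Run a case study applying the benchmark to seven models from the Llama 2, Code Llama, and OpenAI GPT families.
Provide open-source tools and test cases, available in the project repository.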
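To make the idea of a rule-based insecure code detector concrete, here is a minimal sketch of how static rules can be mapped to CWE identifiers and matched against generated code. It is not the ICD itself: the `Rule` class, the three regex rules, and the `find_insecure_lines` helper are all hypothetical, and they cover only a few Python patterns, whereas the review describes 189 rules across eight languages.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One illustrative static-analysis rule (hypothetical, not taken from the ICD)."""
    cwe: str
    description: str
    pattern: re.Pattern


# Toy rules for Python source only; chosen for illustration.
RULES = [
    Rule("CWE-328", "weak hash function (MD5)", re.compile(r"\bhashlib\.md5\(")),
    Rule("CWE-78", "shell command passed to os.system", re.compile(r"\bos\.system\(")),
    Rule("CWE-95", "eval on dynamic input", re.compile(r"\beval\(")),
]


def find_insecure_lines(code: str) -> list[tuple[int, str]]:
    """Return (1-based line number, CWE id) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for rule in RULES:
            if rule.pattern.search(line):
                findings.append((lineno, rule.cwe))
    return findings


sample = "import hashlib\ndigest = hashlib.md5(data).hexdigest()\nprint(digest)\n"
print(find_insecure_lines(sample))  # [(2, 'CWE-328')]
```

Line-by-line regex matching is the simplest possible form of static analysis. It ignores data flow and context, which is one reason a detector of this kind can miss insecure code or flag safe code.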

Figure 1: High-level overview of CyberSecEval's approach.
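The test-generation step in the method (prompts derived from insecure code, for autocomplete and instruction contexts) can be pictured with the sketch below. Everything in it is an assumption made for illustration: the function names, the default context window of 10 lines, and the wording of the instruction prompt are not taken from the benchmark, and the task description is supplied by the caller rather than derived automatically.

```python
def make_autocomplete_prompt(source: str, flagged_line: int, context_lines: int = 10) -> str:
    """Return the lines preceding a flagged line (1-based) as a completion prompt.

    The model under test is then asked to continue from this context, and its
    completion is checked for the insecure practice.
    """
    lines = source.splitlines()
    end = flagged_line - 1                 # index of the flagged line itself
    start = max(0, end - context_lines)    # clamp at the top of the file
    return "\n".join(lines[start:end])


def make_instruct_prompt(task_description: str) -> str:
    """Wrap a natural-language task description as an instruction-style prompt."""
    return f"Write code that does the following: {task_description}"


source = "import hashlib\n\ndef fingerprint(data):\n    return hashlib.md5(data).hexdigest()\n"
print(make_autocomplete_prompt(source, flagged_line=4, context_lines=2))
print(make_instruct_prompt("compute a fingerprint of a byte string"))
```

The two prompt styles probe different behavior: the autocomplete prompt tests whether a model continues existing code with an insecure pattern, while the instruction prompt tests what the model writes when given only a description of the task.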

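Experimental results

Research questions

RQ1: Do LLMs generate insecure code when asked to complete code or instructed to write code, and to what extent across languages and model types?
RQ2: Do LLMs comply with requests to assist in cyberattacks, and does higher coding ability correlate with higher compliance?
RQ3: Can automated static-analysis-based detection and LLM-based judgment accurately measure the cybersecurity safety properties of LLMs?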

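Key findings

LLMs suggested insecure coding practices in about 30% of test cases overall.
Among the Code Llama models, higher coding ability tended to go with more insecure code generation and higher compliance with cyberattack prompts.
Across models and threat categories, compliance with cyberattack requests averaged 53%.
The Insecure Code Detector achieved 96% precision and 79% recall overall in detecting insecure LLM-generated code.
Cyberattack helpfulness detection achieved 94% precision and 84% recall in identifying responses useful to cyberattackers.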
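The precision and recall figures above compare a detector's verdicts against reference labels. As a reminder of how the two metrics are computed, here is a minimal sketch. The `precision_recall` helper and the six labeled examples are hypothetical and are not data from the paper.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Compute (precision, recall) of detector verdicts against reference labels.

    predicted[i] is True when the detector flags sample i as insecure;
    actual[i] is True when the reference label marks sample i as insecure.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))        # flagged and insecure
    fp = sum(p and not a for p, a in zip(predicted, actual))    # flagged but secure
    fn = sum(not p and a for p, a in zip(predicted, actual))    # missed insecure code
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical verdicts for six samples: 3 true positives, 1 false positive, 1 miss.
predicted = [True, True, True, False, False, True]
actual = [True, True, False, True, False, True]
print(precision_recall(predicted, actual))  # (0.75, 0.75)
```

Read this way, 96% precision means that almost every flag raised by the detector is a real problem, while 79% recall means that roughly one in five insecure samples goes unflagged.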

Figure 2: The precision and recall of our Insecure Code Detector static analyzer at detecting insecure code in LLM completions.

This review was generated by AI and checked by a human editor.