QUICK REVIEW

[Paper Review] Using LLMs to Facilitate Formal Verification of RTL

Marcelo Orenes-Vera, Margaret Martonosi | arXiv (Cornell University) | Sep 18, 2023

Formal Methods in Verification | Citations: 13

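One-Line Summary

This paper investigates using GPT-4 to generate correct SystemVerilog Assertions (SVA) directly from RTL without a predefined specification, integrates this flow into AutoSVA to improve FPV coverage, and examines whether the generated SVA can also help GPT-4 generate RTL.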

ABSTRACT

Formal property verification (FPV) has existed for decades and has been shown to be effective at finding intricate RTL bugs. However, formal properties, such as those written as SystemVerilog Assertions (SVA), are time-consuming and error-prone to write, even for experienced users. Prior work has attempted to lighten this burden by raising the abstraction level so that SVA is generated from high-level specifications. However, this does not eliminate the manual effort of reasoning and writing about the detailed hardware behavior. Motivated by the increased need for FPV in the era of heterogeneous hardware and the advances in large language models (LLMs), we set out to explore whether LLMs can capture RTL behavior and generate correct SVA properties. First, we design an FPV-based evaluation framework that measures the correctness and completeness of SVA. Then, we evaluate GPT4 iteratively to craft the set of syntax and semantic rules needed to prompt it toward creating better SVA. We extend the open-source AutoSVA framework by integrating our improved GPT4-based flow to generate safety properties, in addition to facilitating their existing flow for liveness properties. Lastly, our use cases evaluate (1) the FPV coverage of GPT4-generated SVA on complex open-source RTL and (2) using generated SVA to prompt GPT4 to create RTL from scratch. Through these experiments, we find that GPT4 can generate correct SVA even for flawed RTL, without mirroring design errors. Particularly, it generated SVA that exposed a bug in the RISC-V CVA6 core that eluded the prior work's evaluation.

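Motivation and Objectives

Address the problem that formal properties are time-consuming and error-prone to write.
Explore whether an LLM can capture RTL behavior and generate correct SVA from the RTL alone.
Develop an iterative rule-refinement workflow that prompts GPT-4 to generate valid and complete SVA properties.
Extend AutoSVA with a GPT-4-based SVA generation flow and evaluate it on complex RTL modules.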

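Proposed Method

Design an FPV-based evaluation framework that measures the correctness and completeness of SVA.
Iteratively refine the prompt rules given to GPT-4 so that it generates syntactically and semantically correct SVA from RTL.
Integrate the improved GPT-4 flow into the extended AutoSVA framework (AutoSVA2) so that it generates safety properties alongside AutoSVA's existing liveness properties.
Evaluate GPT-4-generated SVA on complex RTL modules (the PTW and TLB of CVA6) and compare the resulting RTL coverage.
Generate RTL from scratch with GPT-4 using an iterative RTL/SVA loop guided by FPV feedback.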
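The rule-refinement loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `FpvReport`, `refine_rules`, and the three callables standing in for the LLM, the FPV tool, and the engineer are all hypothetical names, and the toy stand-ins at the bottom exist only so the sketch runs end to end.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FpvReport:
    """Hypothetical summary of one FPV run (not a real tool's output format)."""
    compiled: bool
    num_properties: int
    failures: List[str] = field(default_factory=list)  # error or CEX descriptions

    @property
    def fully_proven(self) -> bool:
        # A run counts as a full proof when the SVA compiles and nothing fails.
        return self.compiled and not self.failures


def refine_rules(
    rtl: str,
    generate_sva: Callable[[str, List[str]], str],  # stands in for the LLM call
    run_fpv: Callable[[str, str], FpvReport],       # stands in for the FPV tool
    write_rule: Callable[[FpvReport], str],         # stands in for the engineer
    max_iterations: int = 30,
) -> List[str]:
    """Grow the prompt rule set until the generated SVA compiles and is proven."""
    rules: List[str] = []
    for _ in range(max_iterations):
        sva = generate_sva(rtl, rules)
        report = run_fpv(rtl, sva)
        if report.fully_proven:
            break
        # The engineer reads the errors or CEXs and adds one new guiding rule.
        rules.append(write_rule(report))
    return rules


# Toy stand-ins (assumptions for illustration only).
RULE = "use |=> for next-cycle checks"


def fake_llm(rtl: str, rules: List[str]) -> str:
    return "|=>" if RULE in rules else "|->"


def fake_fpv(rtl: str, sva: str) -> FpvReport:
    if sva == "|=>":
        return FpvReport(compiled=True, num_properties=1)
    return FpvReport(True, 1, failures=["WT: wrong timing semantics"])


def fake_engineer(report: FpvReport) -> str:
    return RULE


rules = refine_rules("module fifo; endmodule", fake_llm, fake_fpv, fake_engineer)
print(rules)
```

In this sketch the rule set only ever grows, and each rule is written by a person reading the FPV report; the LLM is never retrained.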

Figure 1: FPV-based evaluation framework. The FPV tool returns whether the assertions generated by the LLM are correct or not—for a given RTL. Hinted by the errors or CEXs of the FPV report, the engineer manually writes or refines the rules that guide the LLM toward generating better SVA. The rule s

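Experimental Results

Research Questions

RQ1: Can an LLM generate correct SVA properties from RTL without an explicit high-level specification?
RQ2: How can prompt rules be designed to teach an LLM the semantics and timing of SVA?
RQ3: Does integrating an LLM-based SVA flow into AutoSVA improve property coverage of the RTL and bug detection?
RQ4: Can GPT-4 generate RTL from scratch when prompted with SVA, and can FPV feedback improve the quality of that RTL?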

Key Findings

Iteration	Compiles	# Properties (#Prop)	# Failing Properties (#Fail)	Main Issue
1	✗	4	-	IN: undeclared variable (no module prefix)
2	✗	6	-	SY: wrong assertion keyword
3	✗	8	-	SY: foreach used as an assertion loop
4	✗	4	-	IN: undeclared buffer_head_r
5	✗	9	-	SY: errors in include name and assert name
6	✗	6	-	SY: duplicate assertion names
7	✓	6	4	WT: wrong timing semantics for |->
8	✗	6	3	IN: undeclared variable (no module prefix)
9	✗	5	-	IN: module prefix; ignored earlier rules
10	✓	9	5	WT: wrong timing semantics for |=>
11	✓	7	1	WT: missing $past in postcondition
12	✓	10	7	WT: overuse of $past
13	✗	9	-	SY: forgot the foreach rule from T3
14	✓	12	4	WS: increment without wrap; wrong signal
15	✓	8	2	WT: missing $past in postcondition
16	✓	9	4	WT/WS: wrong bitwise operation
17	✓	8	3	WT: wrong timing semantics for |=>
18	✗	10	0	SY: array-indexed assertion names as_name[i]
19	✗	12	2	SY/WS: empty precondition; wrong bitwise operation
20	✗	10	-	SY: wrong bit width in constant usage
21	✓	7	1	WT: missing $past for registers
22	✓	9	1	WT: incorrect $past in precondition
23	✓	8	0	Full proof
24	✓	8	1	Assumed wrong behavior for out_rdy
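Several of the WT rows above involve the difference between `|->`, `|=>`, and `$past`. The sketch below models these on a finite, cycle-by-cycle trace of sampled values. It is a simplified illustration and not a simulator: the signal names `push` and `valid_q` are hypothetical, and the first cycle is skipped for `$past` because no earlier sample exists in the trace.

```python
from typing import List, Sequence


def overlapped_failures(pre: Sequence[bool], post: Sequence[bool]) -> List[int]:
    """Cycles where `pre |-> post` fails: post is checked in the same cycle as pre."""
    return [t for t in range(len(pre)) if pre[t] and not post[t]]


def non_overlapped_failures(pre: Sequence[bool], post: Sequence[bool]) -> List[int]:
    """Cycles where `pre |=> post` fails: post is checked one cycle after pre.

    The reported index is the cycle in which pre held. A pre in the final cycle
    is left unchecked because the trace ends before post can be sampled.
    """
    return [t for t in range(len(pre) - 1) if pre[t] and not post[t + 1]]


def past_mismatches(reg: Sequence[bool], d: Sequence[bool]) -> List[int]:
    """Cycles t >= 1 where `reg == $past(d)` fails, i.e. reg[t] != d[t - 1]."""
    return [t for t in range(1, len(reg)) if reg[t] != d[t - 1]]


# Hypothetical registered signal: valid_q follows push by exactly one cycle.
push = [False, True, False, True, False]
valid_q = [False, False, True, False, True]

print(overlapped_failures(push, valid_q))      # same-cycle check
print(non_overlapped_failures(push, valid_q))  # next-cycle check
print(past_mismatches(valid_q, push))          # register equals previous input
```

For this trace the same-cycle check fails at cycles 1 and 3, while the next-cycle check and the `$past` check both pass. A property about a registered output written with `|->` and no `$past` therefore fails even when the RTL is correct.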

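GPT-4 can generate correct SVA from buggy RTL without mirroring the design errors.
SVA quality improves as the rule set guiding GPT-4 is refined; on the FIFO module, iteration 23 (T23) was the first in which all generated properties compiled and were fully proven.
AutoSVA2 substantially improves RTL behavioral coverage over AutoSVA alone, with toggle coverage improving by up to 6x on certain modules.
Using multiple batches of GPT-4-generated SVA increases RTL coverage (e.g., PTW: about 1.25x property coverage with six batches; TLB: about 6x).
Combining AutoSVA assertions with GPT-4-generated assertions provides complementary coverage on several modules.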

Figure 2: Overview of AutoSVA2. Our additions to the original AutoSVA flow are shown with thick boxes and arrows; the original flow is shown with thin boxes and arrows. The green boxes indicate automatically generated artifacts. The green arrows indicate the SVA generation flow and the blue arrows t


This review was generated by AI and checked by a human editor.