QUICK REVIEW

[Paper Review] Lemma Discovery in Agentic Program Verification

Huan Zhao, Haoxin Tu|arXiv (Cornell University)|Mar 23, 2026

Logic, programming, and type systems | Citations: 0

One-Line Summary

The paper introduces LemmaNet, an LLM-based agent that discovers helper lemmas offline from source code and online during proving to bridge semantics-aware and proof-targeted verification conditions, improving deductive verification on real-world software.

ABSTRACT

Deductive verification provides strong correctness guarantees for code by extracting verification conditions (VCs) and writing formal proofs for them. The expertise-intensive task of VC proving is the main bottleneck in this process, and has been partly automated owing to recent advances in Large Language Model (LLM) agents. However, existing proof agents are not able to discover helper lemmas - auxiliary lemmas that aid in proving - and thus fall short as programs grow in size and complexity. In this paper, we argue that VC proving for program verification is more than a purely mathematical task, and benefits considerably from program comprehension. Our key insight is that human proof engineers often discover and apply helper lemmas based on their understanding of the program semantics, which are not directly reflected in the VCs produced by VC generators. Inspired by this insight, we propose an LLM agent, LemmaNet, that discovers helper lemmas in two ways. Specifically, the agent first synthesizes lemmas offline by directly analyzing the source code and specifications, and then relating this semantic understanding to the mechanical, verbose encoding produced by VC generators. As the proof unfolds, LemmaNet then adapts existing helper lemmas online to accommodate evolving proof states, enabling the agent to effectively discharge complex VCs on-the-fly. We evaluate LemmaNet on SV-COMP and established real-world subjects, including modules of the Linux kernel, Contiki OS, standard C++ library, and X.509 parser. Our experimental results demonstrate that LemmaNet significantly outperforms state-of-the-art approaches, highlighting the importance of program comprehension-aided lemma discovery in agentic program verification.

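Motivation and Objectives

Demonstrate that discovering helper lemmas through program comprehension helps discharge verification conditions.
Propose LemmaNet, a two-stage architecture that combines offline lemma synthesis with online lemma adaptation.
Show that unifying semantics-aware VCs with proof-targeted VCs improves the discharge of complex verification conditions.
Evaluate LemmaNet on real-world software and benchmarks to establish its effectiveness and robustness.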

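Proposed Method

Offline lemma synthesis analyzes the source code and specifications and generates semantically aligned helper lemmas that bridge semantics-aware VCs and proof-targeted VCs.
The online lemma adapter refines and adapts helper lemmas during the proof, based on feedback from the proof assistant.
The system is built on top of an existing tactic-by-tactic proof agent and adds a program semantics analyzer and obligation-aligned lemma synthesis.
Staged workflow: generate semantics-aware VCs; synthesize bridging lemmas; discharge the VCs produced by a trusted VC generator; adapt lemmas online as the proof state evolves.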
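
The control flow of the staged workflow above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not LemmaNet's actual implementation: the names `Lemma`, `prove_vc`, `toy_synthesize`, `toy_prover`, and `toy_adapt` are hypothetical, lemmas and goals are plain strings, and the LLM and the proof assistant are replaced by trivial stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Lemma:
    """Toy helper lemma: a name plus a statement encoded as a string."""
    name: str
    statement: str


# Hypothetical interfaces; the review does not describe the real ones.
Synthesizer = Callable[[str, str], List[Lemma]]       # (source, spec) -> lemmas
Prover = Callable[[str, List[Lemma]], Optional[str]]  # None if proved, else feedback
Adapter = Callable[[Lemma, str], Lemma]               # (lemma, feedback) -> new lemma


def prove_vc(goal: str, source: str, spec: str,
             synthesize: Synthesizer, prover: Prover, adapt: Adapter,
             max_rounds: int = 3) -> bool:
    """Offline synthesis once, then up to max_rounds online adaptations."""
    lemmas = synthesize(source, spec)  # offline phase: read code and spec
    for round_index in range(max_rounds + 1):
        feedback = prover(goal, lemmas)
        if feedback is None:
            return True  # the VC was discharged
        if round_index < max_rounds:
            # online phase: adapt every lemma using the prover's feedback
            lemmas = [adapt(lemma, feedback) for lemma in lemmas]
    return False


def toy_synthesize(source: str, spec: str) -> List[Lemma]:
    # Stand-in for an LLM: returns one lemma phrased in source-level terms.
    return [Lemma("len_nonneg", "len(buf) >= 0")]


def toy_prover(goal: str, lemmas: List[Lemma]) -> Optional[str]:
    # Stand-in for a proof assistant: succeeds only on a verbatim match,
    # otherwise reports the unproved goal as feedback.
    if any(lemma.statement == goal for lemma in lemmas):
        return None
    return goal


def toy_adapt(lemma: Lemma, feedback: str) -> Lemma:
    # Crude textual rewrite into the VC's vocabulary; ignores the feedback text.
    adapted = lemma.statement.replace("len(buf)", "buf_len")
    return Lemma(lemma.name + "_adapted", adapted)


if __name__ == "__main__":
    print(prove_vc("buf_len >= 0", "", "", toy_synthesize, toy_prover, toy_adapt))
```

In this toy, the lemma synthesized offline is stated in source-level terms (`len(buf)`), while the goal uses the VC generator's encoding (`buf_len`), so the first attempt fails and one round of adaptation is needed. Passing the three stages in as functions keeps the loop independent of any particular LLM or proof assistant.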

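Experimental Results

Research Questions

RQ1: Can helper lemmas improve the discharge of verification conditions produced by conventional VC generators?
RQ2: How can offline semantic analysis and online adaptation work together to bridge the gap between semantics-aware VCs and proof-targeted VCs?
RQ3: Does an agent with lemma discovery enabled outperform state-of-the-art proof agents on real-world software verification tasks?
RQ4: Which kinds of real-world software (e.g., Linux kernel modules, Contiki OS, standard libraries) benefit most from semantic lemma discovery?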

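Key Findings

LemmaNet substantially outperforms the state-of-the-art proof agents AutoRocq and Copra on the benchmarks, by margins ranging from 26.8% to 51.7%.
Offline synthesis derives helper lemmas aligned with program semantics, bridging semantics-aware VCs and proof-targeted VCs.
Online lemma adaptation refines lemmas to handle the evolving proof state, improving the discharge of complex VCs.
The evaluation subjects include Linux kernel modules, Contiki OS, an X.509 parser, and the standard C++ library.
The approach highlights the importance of program comprehension for lemma discovery in agentic program verification.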


This review was generated by AI and checked by a human editor.