QUICK REVIEW

[Paper Review] Applying Formal Methods Tools to an Electronic Warfare Codebase (Experience report)

Letitia W. Li, Denley Lam|arXiv (Cornell University)|Jan 16, 2026

Software Testing and Debugging Techniques | Cited by 0

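ONE-SENTENCE SUMMARY

This paper surveys open-source formal methods tools for C/C++, compares how well they cover the vulnerability classes relevant to an EW system, reports usability barriers and the experience of integrating the tools into a real CI/CD workflow, and offers practical recommendations.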

ABSTRACT

While using formal methods offers advantages over unit testing, their steep learning curve can be daunting to developers and can be a major impediment to widespread adoption. To support integration into an industrial software engineering workflow, a tool must provide useful information and must be usable with relatively minimal user effort. In this paper, we discuss our experiences associated with identifying and applying formal methods tools on an electronic warfare (EW) system with stringent safety requirements and present perspectives on formal methods tools from EW software engineers who are proficient in development yet lack formal methods training. In addition to a difference in mindset between formal methods and unit testing approaches, some formal methods tools use terminology or annotations that differ from their target programming language, creating another barrier to adoption. Input/output contracts, objects in memory affected by a function, and loop invariants can be difficult to grasp and use. In addition to usability, our findings include a comparison of vulnerabilities detected by different tools. Finally, we present suggestions for improving formal methods usability including better documentation of capabilities, decreased manual effort, and improved handling of library code.

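RESEARCH MOTIVATION AND OBJECTIVES

Assess the feasibility of applying formal methods to a safety-critical EW codebase written in C++.
Survey open-source C++ verification tools with formal underpinnings and classify them by method and usability.
Identify which tools address the vulnerability classes relevant to EW systems.
Assess the challenges of adopting annotation-based tools versus static analysis tools in an industrial workflow.
Provide actionable recommendations for improving tool usability and integration into the development process.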

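PROPOSED APPROACH

Evaluate open-source formal methods tools applicable to C/C++, classifying them by underlying formal method (e.g., SAT solving, abstract interpretation, Hoare logic).
Assess the usability of each tool, including required annotations, integration effort, and maintenance status.
Apply the selected tools to a real EW codebase to assess coverage of the specified security properties (memory, concurrency, parsing, etc.).
Compare the vulnerabilities detected by different tools and analyze false positives and discrepancies.
Evaluate CI/CD integration requirements and performance impact.
Propose recommendations for improving tool usability, documentation, and handling of library code.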

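EXPERIMENTAL RESULTS

Research Questions

RQ1: Which open-source formal methods tools cover the vulnerability classes identified in the EW codebase (memory safety, concurrency, parsing, etc.)?
RQ2: How usable are these tools for ordinary developers who lack formal methods training?
RQ3: What barriers arise in an industrial workflow (annotation burden, documentation, library code handling, etc.), and how can they be mitigated?
RQ4: In the EW setting, how do the results of different tools differ in coverage and false positives?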

Key Findings

Tool	Type of Formal Method	Type of Interaction	Last update	Security Classes
CBMC	SAT solving	Command line interface with compilation	07/09/2025	Memory leak, Overflow, Dangling pointer
ESBMC	SAT solving	Command line interface	10/12/2025	Memory leak, Overflow, Dangling pointer, Deadlock, Race condition
Clang Analyzer	Symbolic execution	Command line interface with compilation	10/07/2025	Memory leak, Overflow, Dangling pointer, Unsafe function, Deadlock
CN	Separation logic	Code annotation	10/08/2025	Memory leak, Overflow, Dangling pointer
Crux	Symbolic execution	Code annotation	03/24/2025	Overflow
Faial	SMT	Command line interface or Git action	04/13/2024	Race condition
Frama-Clang	Hoare logic (WP Plugin), Abstract interpretation (Eva Plugin)	Code annotation	06/25/2025	Memory leak, Overflow, Dangling pointer, Deadlock, Race condition
IKOS	Abstract interpretation	Command line interface	12/31/2024	Overflow, Dangling pointer
Infer	Separation logic	Command line interface, Code annotation	06/21/2024	Memory leak, Overflow, Dangling pointer, Deadlock, Race condition
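
To make the "Overflow" entry in the Security Classes column concrete, here is a minimal sketch of the kind of property a bounded model checker such as CBMC can check for all inputs rather than for a handful of unit-test values. The helper `checked_add` and the `harness` function are hypothetical and are not taken from the paper or the EW codebase; the `__CPROVER__` guard and the `nondet_` naming convention reflect CBMC's documented conventions as I understand them and should be confirmed against the installed version.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds two ints and reports failure instead of
 * overflowing (signed overflow is undefined behavior in C). */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;               /* the sum would not fit in an int */
    }
    *out = a + b;
    return true;
}

#ifdef __CPROVER__
/* Compiled only under CBMC. A declared-but-undefined function whose name
 * starts with nondet_ is treated as returning an arbitrary value, so the
 * assertion below is checked for every pair of ints, not just samples. */
int nondet_int(void);

void harness(void)
{
    int a = nondet_int();
    int b = nondet_int();
    int sum;
    if (checked_add(a, b, &sum)) {
        /* Widen to long long so the reference sum cannot overflow. */
        assert((long long)sum == (long long)a + (long long)b);
    }
}
#endif
```

An ordinary compiler skips the harness entirely, so the file still builds as normal C. Under CBMC, an invocation along the lines of `cbmc file.c --function harness --signed-overflow-check` would explore both inputs symbolically; treat the exact flags as an assumption to verify locally.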

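No single tool covers all targeted vulnerabilities; different tools address different classes (memory leaks, overflows, dangling pointers, deadlocks, race conditions, etc.).
Annotation-based tools offer stronger property verification but require extensive manual annotation, which raises the barrier to adoption.
Static analysis tools provide low-cost checks, but coverage varies and results are inconsistent across tools, especially for library code.
Library code can make results noisy; findings need to be filtered, or source code and library code need to be analyzed under separate assumptions.
Standardized terminology and better documentation would simplify tool comparison and adoption.
Automatically generating annotations would reduce manual effort and make the tools more practical for engineers.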
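
The annotation burden mentioned in the findings is easier to judge with an example. The sketch below annotates a small C function in ACSL, the specification language read by Frama-C, and shows the three concepts the abstract calls out as hard to grasp: an input/output contract (`requires`/`ensures`), the memory a function may modify (`assigns`), and loop invariants. The function `max_of` and its contract are illustrative assumptions, not code from the paper, and no claim is made here that a particular prover discharges every clause automatically.

```c
#include <stddef.h>

/* ACSL annotations live in special comments, so an ordinary C compiler
 * ignores them while a verifier reads them. Illustrative example only. */

/*@ requires n > 0;
  @ requires \valid_read(a + (0 .. n-1));
  @ assigns \nothing;
  @ ensures \forall integer k; 0 <= k < n ==> \result >= a[k];
  @ ensures \exists integer k; 0 <= k < n && \result == a[k];
  @*/
int max_of(const int *a, size_t n)
{
    int best = a[0];
    size_t i;
    /*@ loop invariant 1 <= i <= n;
      @ loop invariant \forall integer k; 0 <= k < i ==> best >= a[k];
      @ loop invariant \exists integer k; 0 <= k < i && best == a[k];
      @ loop assigns i, best;
      @ loop variant n - i;
      @*/
    for (i = 1; i < n; i++) {
        if (a[i] > best) {
            best = a[i];            /* keep the largest value seen so far */
        }
    }
    return best;
}
```

The specification is longer than the eight lines of executable code it describes, and it uses notation (`\forall`, `\valid_read`, `integer`) that does not exist in C itself. That ratio and that vocabulary gap are the kind of cost the findings attribute to annotation-based tools.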


This review was generated by AI and checked by a human editor.