QUICK REVIEW

[Paper Review] Understanding and Finding JIT Compiler Performance Bugs

Zijian Yi, Cheng Ding|arXiv (Cornell University)|Mar 6, 2026

Software System Performance and Reliability | Cited by 0

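One-Sentence Summary

The paper presents an empirical study of JIT compiler performance bugs across four engines, proposes Jittery, a tool for layered differential performance testing, and reports 12 new bugs found with the tool (11 confirmed, 6 fixed).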

ABSTRACT

Just-in-time (JIT) compilers are key components for many popular programming languages with managed runtimes (e.g., Java and JavaScript). JIT compilers perform optimizations and generate native code at runtime based on dynamic profiling data, to improve the execution performance of the running application. Like other software systems, JIT compilers might have software bugs, and prior work has developed a number of automated techniques for detecting functional bugs (i.e., generated native code does not semantically match that of the original code). However, no prior work has targeted JIT compiler performance bugs, which can cause significant performance degradation while an application is running. These performance bugs are challenging to detect due to the complexity and dynamic nature of JIT compilers. In this paper, we present the first work on demystifying JIT performance bugs. First, we perform an empirical study across four popular JIT compilers for Java and JavaScript. Our manual analysis of 191 bug reports uncovers common triggers of performance bugs, patterns in which these bugs manifest, and their root causes. Second, informed by these insights, we propose layered differential performance testing, a lightweight technique to automatically detect JIT compiler performance bugs, and implement it in a tool called Jittery. We incorporate practical optimizations into Jittery such as test prioritization, which reduces testing time by 92.40% without compromising bug-detection capability, and automatic filtering of false-positives and duplicates, which substantially reduces manual inspection effort. Using Jittery, we discovered 12 previously unknown performance bugs in the Oracle HotSpot and Graal JIT compilers, with 11 confirmed and 6 fixed by developers.

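Motivation and Goals

Study real-world JIT compiler performance bugs across mainstream engines (HotSpot, Graal, V8, SpiderMonkey).
Characterize triggers, symptoms, and root causes to guide testing and debugging.
Develop and evaluate a lightweight detection tool (Jittery) based on layered differential testing.
Provide a publicly available dataset of JIT performance bugs to support future research.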

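Proposed Approach

Manually analyze, in depth, 191 bug reports from four JIT engines (HotSpot, Graal, V8, SpiderMonkey).
Identify common triggers, manifestation patterns, and root causes of performance bugs.
Design and implement Jittery, a layered differential performance testing tool that runs many small programs under two JIT configurations and compares their performance.
Incorporate optimizations such as test prioritization and pruning to reduce testing time and the false-positive rate.
Evaluate Jittery by running it on real engines, uncovering previously unknown bugs (12 in total; 11 confirmed; 6 fixed by developers).
Release the dataset and scripts in the project repository.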
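The differential comparison described above can be sketched roughly as follows. This is a minimal illustration, not Jittery's implementation: the function names, the use of medians, and the 1.5x threshold are all assumptions made for this example. It assumes that a configuration with more JIT optimization enabled should not be markedly slower than one with less, and flags a program when that expectation is violated.

```python
import statistics
import subprocess
import time
from typing import List, Sequence


def time_command(cmd: Sequence[str], runs: int = 5) -> List[float]:
    """Run `cmd` `runs` times and return wall-clock durations in seconds.

    Wall-clock time includes process startup, so this is only a coarse signal.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        durations.append(time.perf_counter() - start)
    return durations


def slowdown(lower_tier: Sequence[float], higher_tier: Sequence[float]) -> float:
    """Ratio of median times; a value above 1 means the higher tier was slower."""
    return statistics.median(higher_tier) / statistics.median(lower_tier)


def is_suspicious(lower_tier: Sequence[float],
                  higher_tier: Sequence[float],
                  threshold: float = 1.5) -> bool:
    """Flag a program whose higher-tier run is slower by more than `threshold`.

    The threshold is an arbitrary value chosen for illustration.
    """
    return slowdown(lower_tier, higher_tier) > threshold


# Hypothetical usage on HotSpot; "Micro" is a placeholder class name.
# -XX:TieredStopAtLevel=1 restricts compilation to the C1 compiler.
# c1_only = time_command(["java", "-XX:TieredStopAtLevel=1", "Micro"])
# full_jit = time_command(["java", "Micro"])
# print(is_suspicious(c1_only, full_jit))
```

Using the median rather than the mean makes the ratio less sensitive to a single noisy run, which matters because timing noise is a likely source of false positives in any such comparison.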

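Experimental Results

Research Questions

RQ1: What input artifacts trigger JIT performance bugs?
RQ2: What are the common symptoms of JIT performance bugs?
RQ3: What are the common root causes of JIT performance bugs?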

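Key Findings

Nearly half of the bugs can be exposed with small micro-benchmarks rather than full benchmarks.
Bugs are often detected through comparative signals, such as performance regressions or differences between equivalent executions.
JIT-specific features, such as speculation and runtime interaction, are a significant source of bugs beyond traditional optimization issues.
Through layered testing and test prioritization, Jittery reduces testing time by 92.40% without compromising bug-detection capability.
Using Jittery, the authors discovered 12 previously unknown performance bugs in the Oracle HotSpot and Graal JIT compilers; 11 were confirmed and 6 were fixed by developers.
The study provides a public dataset of JIT performance bugs to support future research.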
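The summary does not say how Jittery ranks its tests, so the following is only one plausible scheme, offered as an assumption: run a cheap screening pass over every program, then spend the expensive repeated measurements on the programs whose screening slowdown was largest. The function name `prioritize` and the `budget` parameter are hypothetical.

```python
from typing import Dict, List


def prioritize(screening_slowdowns: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` program names, largest screening slowdown first.

    `screening_slowdowns` maps a program name to the slowdown ratio seen in a
    single cheap run. This ranking rule is an assumption for illustration.
    """
    ranked = sorted(screening_slowdowns,
                    key=screening_slowdowns.get,
                    reverse=True)
    return ranked[:budget]


# Example: only the two most suspicious programs get full measurement.
candidates = {"loop_sum": 1.02, "string_concat": 3.40, "array_copy": 1.75}
print(prioritize(candidates, budget=2))
```

A scheme like this trades completeness for time: programs that look unremarkable in screening are never measured in depth, so whether it preserves bug-detection capability depends on how reliable the screening signal is.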

This review was generated by AI and checked by a human editor.