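[Paper Explained] Unpacking Security Scanners for GitHub Actions Workflows
A systematic comparison of 9 state-of-the-art static analysis tools for GitHub Actions workflows: the study maps their rules onto a common taxonomy of 10 weaknesses and evaluates coverage, consistency, and performance on a dataset of 596 real-world workflows.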
GitHub Actions is a widely used platform to automate the build and deployment of software projects through configurable workflows. As the platform's popularity grows, it also becomes a target of choice for software supply chain attacks. These attacks exploit excessive permissions, ambiguous versions, or the absence of artifact integrity checks to compromise the workflows. In response to these attacks, several security scanners have emerged to help developers harden their workflows. In this paper, we perform the first systematic comparison of 9 GitHub Actions Workflows security scanners. We compare them regarding scope (which security weaknesses they target), detection capabilities (how many weaknesses they detect), and performance (how long they take to scan a workflow). In order to compare the scanners on common ground, we first establish a classification of 10 common security weaknesses that can be found in GitHub Actions Workflows. Then, we run the scanners against a curated set of 2722 workflows. Our study reveals that the landscape of GitHub Actions Workflows security scanners is very diverse, with both general-purpose and focused scanners. More importantly, we provide evidence that these scanners implement fundamentally different analysis strategies, leading to major gaps regarding the nature and the number of reported security weaknesses. Based on this empirical evidence, we make actionable recommendations for developers to harden their GitHub Actions Workflows.
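Research Motivation and Objectives
- Define a taxonomy of 10 high-level security weaknesses in GitHub Actions workflows to enable a fair comparison across tools.
- Curate a set of actively maintained, locally executable scanners for benchmarking.
- Empirically compare the scanners' coverage, consistency, and runtime on a dataset of 596 real-world workflows from 77 repositories.
- Give developers actionable recommendations for hardening their GitHub Actions workflows.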
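Proposed Method
- Select 9 state-of-the-art, locally executable GitHub Actions workflow scanners from an initial pool of 30 candidates.
- Build the 10-weakness taxonomy by aggregating and clustering 84 rules across tools.
- Map each tool's detection rules onto the taxonomy to assess which weakness categories it covers.
- Benchmark the scanners on 596 real-world workflows from 77 repositories to study detection consistency and runtime.
- Measure runtime per workflow file with the Unix time command, repeating each measurement twice for stability.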
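The per-file timing step can be approximated in a few lines. The following is a minimal sketch, not the authors' harness: it substitutes Python's `time.perf_counter` for the Unix time command, the function name `time_scan` is hypothetical, and the command shown is a harmless placeholder rather than a real scanner invocation.

```python
import statistics
import subprocess
import sys
import time


def time_scan(command, repeats=2):
    """Run `command` `repeats` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: a scanner may exit non-zero simply because it found issues.
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command (assumption): swap in a scanner call on one workflow file.
    placeholder = [sys.executable, "-c", "pass"]
    runs = time_scan(placeholder, repeats=2)
    print(f"runs: {runs}, mean: {statistics.mean(runs):.3f}s")
```

Repeating each measurement and comparing the runs gives a rough check that a single slow run is not just noise from process start-up or disk caching.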
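Experimental Results
Research Questions
- RQ1: To what extent do the scanners cover the same workflow weakness categories?
- RQ2: How consistent are the scanners in the weaknesses they detect on the same workflows?
- RQ3: What are the runtime characteristics and reporting features of the scanners?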
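One way to make the consistency question (RQ2) concrete is to compare the sets of findings that two scanners report. The sketch below uses Jaccard similarity as an illustrative measure; this summary does not state which metric the paper uses, so treat the choice as an assumption. The scanner names, file names, and category labels are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_agreement(findings):
    """Map each scanner pair to the Jaccard similarity of their findings."""
    return {
        (s1, s2): jaccard(findings[s1], findings[s2])
        for s1, s2 in combinations(sorted(findings), 2)
    }


# Hypothetical findings: (workflow file, taxonomy category) pairs per scanner.
example_findings = {
    "scanner_a": {("ci.yml", "W1"), ("ci.yml", "W2"), ("release.yml", "W1")},
    "scanner_b": {("ci.yml", "W1"), ("release.yml", "W3")},
    "scanner_c": set(),
}

agreement = pairwise_agreement(example_findings)
for pair, score in agreement.items():
    print(pair, round(score, 2))
```

Comparing findings at the level of taxonomy categories, rather than raw rule identifiers, is what makes results from different tools comparable in the first place.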
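Key Findings
- No single scanner covers all identified weakness categories; some are highly specialized, while others have broad coverage.
- Specialized and general-purpose scanners complement each other, so combining them yields more comprehensive security coverage.
- Most scanners are fast enough for CI integration; a few outliers add moderate latency.
- Among the evaluated tools, zizmor and poutine offer broad weakness coverage.
- Scanners interpret weaknesses differently, which leads to diverging reported results.
- An open-science repository provides access to all data and results.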
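The coverage findings follow from mapping each tool's rules onto the shared taxonomy and then checking which categories remain uncovered for a given combination of tools. A minimal sketch of that bookkeeping is shown here; the tool names, rule names, and the category labels W1 to W10 are placeholders, since this summary does not list the paper's actual rules or categories.

```python
# Hypothetical rule-to-category mapping; neither the rule names nor the
# category labels (W1..W10) come from the paper.
RULE_TO_CATEGORY = {
    "tool_a": {"rule-unpinned-action": "W1", "rule-broad-permissions": "W2"},
    "tool_b": {"rule-injection": "W3"},
    "tool_c": {"rule-pin": "W1", "rule-inject": "W3", "rule-secrets": "W4"},
}
ALL_CATEGORIES = {f"W{i}" for i in range(1, 11)}


def coverage(rule_map):
    """Return the set of taxonomy categories each tool has at least one rule for."""
    return {tool: set(rules.values()) for tool, rules in rule_map.items()}


def uncovered(rule_map, tools, categories=ALL_CATEGORIES):
    """Return the categories that none of the selected tools covers."""
    per_tool = coverage(rule_map)
    covered = set().union(*(per_tool[t] for t in tools))
    return categories - covered


print(coverage(RULE_TO_CATEGORY))
print(sorted(uncovered(RULE_TO_CATEGORY, ["tool_a", "tool_b"])))
```

Running `uncovered` over different tool combinations shows directly how a specialized scanner can close a gap left by a broader one.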
This summary was generated by AI and reviewed by a human editor.