QUICK REVIEW

[Paper Review] Unpacking Security Scanners for GitHub Actions Workflows

Madjda Fares, Yogya Gamage|arXiv (Cornell University)|2026. 01. 20.

Scientific Computing and Data Management|Citations: 0

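One-Line Summary

The paper systematically compares 9 state-of-the-art static analyzers for GitHub Actions workflows, maps their detection rules to 10 weakness categories, and evaluates coverage, consistency, and performance on a dataset of 596 real-world workflows.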

ABSTRACT

GitHub Actions is a widely used platform to automate the build and deployment of software projects through configurable workflows. As the platform's popularity grows, it also becomes a target of choice for software supply chain attacks. These attacks exploit excessive permissions, ambiguous versions or the absence of artifact integrity checks to compromise the workflows. In response to these attacks, several security scanners have emerged to help developers harden their workflows. In this paper, we perform the first systematic comparison of 9 GitHub Actions Workflows security scanners. We compare them regarding scope (which security weaknesses they target), detection capabilities (how many weaknesses they detect), and performance (how long they take to scan a workflow). In order to compare the scanners on a common ground, we first establish a classification of 10 common security weaknesses that can be found in GitHub Actions Workflows. Then, we run the scanners against a curated set of 2722 workflows. Our study reveals that the landscape of GitHub Actions Workflows security scanners is very diverse, with both general purpose and focused scanners. More importantly, we provide evidence that these scanners implement fundamentally different analysis strategies, leading to major gaps regarding the nature and the number of reported security weaknesses. Based on these empirical evidence we make actionable recommendations for developers to harden their GitHub Actions Workflows.
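
To make one of the weaknesses named in the abstract concrete, the following is a minimal sketch of a check for "ambiguous versions": it flags `uses:` references that are not pinned to a full 40-character commit SHA. The function name `unpinned_actions`, the regular expressions, and the sample workflow are assumptions made for illustration. This is not how any of the evaluated scanners is implemented, and a line-based regex is much cruder than real YAML parsing.

```python
import re

# Matches `uses: owner/repo[/path]@ref` on a single line (illustrative, not a YAML parser).
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)@([^\s'\"#]+)")
# A full Git commit SHA-1: exactly 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (line_number, action, ref) for each `uses:` not pinned to a full commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match is None:
            continue  # no `uses:` with an `@ref` here (local `./path` actions land here too)
        action, ref = match.groups()
        if action.startswith("docker://"):
            continue  # container image references are out of scope for this sketch
        if not FULL_SHA_RE.match(ref):
            findings.append((lineno, action, ref))
    return findings


# Hypothetical workflow used only to exercise the function.
EXAMPLE = """\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567
"""

for lineno, action, ref in unpinned_actions(EXAMPLE):
    print(f"line {lineno}: {action} is referenced by '{ref}', not by a commit SHA")
```

A tag such as `v4` can be moved to a different commit by the action's maintainer, which is why a reference by tag is treated as ambiguous here.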

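Research Motivation and Goals

Define a classification of 10 high-level security weaknesses in GitHub Actions workflows so that tools can be compared fairly.
Select a set of actively maintained, locally runnable scanners for benchmarking.
Empirically compare the scanners on coverage, consistency, and runtime using a large dataset of real-world workflows.
Provide actionable guidance that helps developers harden their GitHub Actions workflows.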

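Proposed Method

Select 9 state-of-the-art, locally runnable GitHub Actions workflow scanners from 30 initial candidates.
Collect 84 rules across the tools and derive a 10-category weakness classification from them.
Map each tool's detection rules onto the classification to assess which weakness classes it covers.
Benchmark the scanners on 596 real-world workflows from 77 repositories to study detection consistency and runtime.
For stability, measure runtime per workflow file with the Unix time command and repeat each measurement twice.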
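
The per-file timing step can be sketched as follows. The review states that the Unix time command was used. This sketch swaps in Python's `time.perf_counter` around a subprocess call instead, so it approximates wall-clock time only. The function names and the stand-in command are assumptions. A real run would replace `STAND_IN_SCANNER` with an actual scanner invocation, whose flags are not given in this review.

```python
import subprocess
import sys
import time


def time_scanner(command, workflow_path, repeats=2):
    """Run `command + [workflow_path]` `repeats` times; return wall-clock seconds per run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: scanners often exit non-zero when they report findings.
        subprocess.run(
            command + [workflow_path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Collapse repeated measurements into a small summary record."""
    return {
        "runs": len(durations),
        "mean_s": sum(durations) / len(durations),
        "max_s": max(durations),
    }


# Stand-in for a scanner CLI (assumption): a Python process that does nothing.
STAND_IN_SCANNER = [sys.executable, "-c", "pass"]

durations = time_scanner(STAND_IN_SCANNER, "workflow.yml", repeats=2)
print(summarize(durations))
```

Repeating each measurement, as the method describes, makes it possible to spot a single slow run caused by cold caches or background load rather than by the scanner itself.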

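Experimental Results

Research Questions

RQ1: How similarly do the scanners cover the same workflow weakness classes?
RQ2: How consistently do the scanners detect weaknesses in the same workflows?
RQ3: What are the runtime characteristics and reporting capabilities of the scanners?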
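
One way to quantify the consistency asked about in RQ2 is pairwise set overlap between the findings of two scanners. The sketch below uses the Jaccard index over (workflow, weakness category) pairs. This metric is an assumption chosen for illustration, since the review does not say which measure the paper uses. The tool names, category labels, and findings are invented toy data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets are treated as fully agreeing."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_agreement(findings):
    """Map each unordered tool pair to the Jaccard index of their finding sets.

    `findings` maps a tool name to a set of (workflow, category) pairs.
    """
    return {
        (tool_a, tool_b): jaccard(findings[tool_a], findings[tool_b])
        for tool_a, tool_b in combinations(sorted(findings), 2)
    }


# Toy data (hypothetical tools and findings, not results from the paper).
TOY_FINDINGS = {
    "scanner_a": {
        ("ci.yml", "excessive-permissions"),
        ("ci.yml", "ambiguous-version"),
    },
    "scanner_b": {
        ("ci.yml", "ambiguous-version"),
        ("release.yml", "ambiguous-version"),
    },
}

for pair, score in pairwise_agreement(TOY_FINDINGS).items():
    print(pair, round(score, 3))
```

In the toy data the two scanners share one finding out of three distinct ones, so their agreement is one third. Comparing at category level rather than rule level matters because different tools name and scope their rules differently.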

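Key Findings

No single scanner covers every identified weakness class; some are highly specialized, while others have a broad scope.
Specialized and general-purpose scanners complement each other for comprehensive security coverage.
Most scanners are fast enough for CI integration, although a few outlier cases introduce some delay.
Zizmor and poutine provide broad weakness coverage among the evaluated tools.
Scanners interpret weaknesses differently, which leads to differences in the reported findings.
An open-science repository makes all data and results accessible.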
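
The complementarity finding can be illustrated with a small coverage computation: map each tool's rules to weakness categories, then look at what a combination of tools covers and which categories depend on a single tool. Everything in this sketch is hypothetical, including the tool names, rule IDs, and function names. The three category labels are paraphrased from the weaknesses named in the abstract, and the paper's full 10-category list is not reproduced in this review.

```python
def coverage_by_tool(rule_map):
    """Turn {tool: {rule_id: category}} into {tool: set of covered categories}."""
    return {tool: set(rules.values()) for tool, rules in rule_map.items()}


def combined_coverage(coverage, tools):
    """Union of the categories covered by the selected tools."""
    covered = set()
    for tool in tools:
        covered |= coverage[tool]
    return covered


def uniquely_covered(coverage):
    """Return {category: tool} for categories that exactly one tool covers."""
    owners = {}
    for tool, categories in coverage.items():
        for category in categories:
            owners.setdefault(category, []).append(tool)
    return {cat: tools[0] for cat, tools in owners.items() if len(tools) == 1}


# Hypothetical rule-to-category mapping (not taken from the paper's data).
RULE_MAP = {
    "broad_scanner": {
        "B001": "excessive-permissions",
        "B002": "ambiguous-version",
    },
    "focused_scanner": {
        "F001": "ambiguous-version",
        "F002": "missing-integrity-check",
    },
}

coverage = coverage_by_tool(RULE_MAP)
print(sorted(combined_coverage(coverage, ["broad_scanner", "focused_scanner"])))
print(uniquely_covered(coverage))
```

In this toy mapping neither tool covers all three categories alone, while the pair does, which mirrors the shape of the finding rather than its actual numbers.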


This review was generated by AI and reviewed by a human editor.