QUICK REVIEW

[Paper Review] Automatically Mitigating Vulnerabilities in Binary Programs via Partially Recompilable Decompilation

Pemma Reiter, Hui Jun Tay | arXiv (Cornell University) | 2022-02-24

Advanced Malware Detection Techniques | Citations: 2

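One-Line Summary

This paper proposes partially recompilable decompilation (PRD), a new method that extracts only the suspect vulnerable functions from a binary program and decompiles them into recompilable source code, so that source-level repair tools can patch the binary automatically. When decompilation succeeded, PRD produced test-equivalent binaries 92.9% of the time, and it decompiled and recompiled individual functions with a 70–89% success rate, compared with 1.7% for full-binary decompilation. As a result, APR tools operating on CGC binaries perform on par with the same tools run on full source.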

ABSTRACT

Decompilation is the process of translating compiled code into high-level code. Control flow recovery is a challenging part of the process. "Misdecompilations" can occur, whereby the decompiled code does not accurately represent the semantics of the compiled code, despite it being syntactically valid. This is problematic because it can mislead users who are trying to reason about the program. We present CFG-based program generation: a novel approach to randomised testing that aims to improve the control flow recovery of decompilers. CFG-based program generation involves randomly generating control flow graphs (CFGs) and paths through each graph. Inspired by prior work in the domain of GPU computing, (CFG, path) pairs are "fleshed" into test programs. Each program is decompiled and recompiled. The test oracle verifies whether the actual runtime path through the graph matches the expected path. Any difference in the execution paths after recompilation indicates a possible misdecompilation. A key benefit of this approach is that it is largely independent of the source and target languages in question because it is focused on control flow. The approach is therefore applicable to numerous decompilation settings. The trade-off resulting from the focus on control flow is that misdecompilation bugs that do not relate to control flow (e.g. bugs that involve specific arithmetic operations) are out of scope. We have implemented this approach in FuzzFlesh, an open-source randomised testing tool. FuzzFlesh can be easily configured to target a variety of low-level languages and decompiler toolchains because most of the CFG and path generation process is language-independent. At present, FuzzFlesh supports testing decompilation of Java bytecode, .NET assembly and x86 machine code. 
In addition to program generation, FuzzFlesh also includes an automated test-case reducer that operates on the CFG rather than the low-level program, which means that it can be applied to any of the target languages. We present a large experimental campaign applying FuzzFlesh to a variety of decompilers, leading to the discovery of 12 previously-unknown bugs across two language formats, six of which have been fixed. We present experiments comparing our generic FuzzFlesh tool to two state-of-the-art decompiler testing tools targeted at specific languages. As expected, the coverage our generic FuzzFlesh tool achieves on a given decompiler is lower than the coverage achieved by a tool specifically designed for the input format of that decompiler. However, due to its focus on control flow, FuzzFlesh is able to cover sections of control flow recovery code that the targeted tools cannot reach, and identify control flow related bugs that the targeted tools miss.

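Motivation and Objectives

To address the difficulty of patching software vulnerabilities in binaries when source code is unavailable, particularly after deployment.
To overcome the scalability and recompilation problems that cause full-binary decompilation to fail.
To make high-precision source-level automated program repair (APR) tools applicable to binary programs through partial, recompilable decompilation.
To verify whether decompiled source for only a small number of functions can support effective, test-equivalent binary patches.
To show that PRD lets APR tools working from binary input perform on par with the same tools working from full source.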

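Proposed Method

Use coarse-grained fault localization (CGFL) on the binary to identify the functions most likely to contain the vulnerability.
Apply a decompiler to lift only the suspicious functions to high-level, recompilable C/C++ source code, focusing on type recovery and function boundaries.
Construct an interface between the decompiled source and the binary so that the original binary's execution semantics are preserved.
Apply source-level APR tools (e.g., Prophet, GenProg) to generate patches on the decompiled source.
Integrate the patched source into the original binary using binary rewriting and recompilation, ensuring test equivalence.
Rely on minimal type inference that recovers only offsets and referenced types, reducing dependence on complete and sound type inference.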
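
The fault localization step at the start of this pipeline can be illustrated with a small function-level ranking sketch. The review does not say which suspiciousness metric CGFL uses, so the Ochiai formula below is a stand-in from the general spectrum-based fault localization literature, not the authors' method. The test names, function names, and coverage data are all hypothetical.

```python
import math
from typing import Dict, List, Set


def ochiai_scores(coverage: Dict[str, Set[str]], failing: Set[str]) -> Dict[str, float]:
    """Score each function by how strongly it is associated with failing tests.

    coverage maps a test name to the set of functions that test executed.
    Ochiai is an assumed metric here; the paper's CGFL may rank differently.
    """
    total_failed = sum(1 for test in coverage if test in failing)
    functions: Set[str] = set()
    for executed in coverage.values():
        functions |= executed

    scores: Dict[str, float] = {}
    for fn in functions:
        # ef / ep: failing / passing tests that executed this function
        ef = sum(1 for t, fns in coverage.items() if fn in fns and t in failing)
        ep = sum(1 for t, fns in coverage.items() if fn in fns and t not in failing)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[fn] = ef / denom if denom else 0.0
    return scores


def rank_suspicious(coverage: Dict[str, Set[str]], failing: Set[str], top_k: int = 3) -> List[str]:
    """Return the top_k most suspicious functions (ties broken by name)."""
    scores = ochiai_scores(coverage, failing)
    return sorted(scores, key=lambda fn: (-scores[fn], fn))[:top_k]


# Hypothetical function-level coverage collected from a binary under test.
coverage = {
    "pos_1": {"main", "parse_header"},
    "pos_2": {"main", "parse_header", "render"},
    "neg_1": {"main", "parse_header", "copy_payload"},
    "neg_2": {"main", "copy_payload"},
}
failing = {"neg_1", "neg_2"}

suspects = rank_suspicious(coverage, failing, top_k=2)
```

Only the functions in `suspects` would then be handed to the decompiler, which is what keeps the decompilation problem small compared with lifting the whole binary.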

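Experimental Results

Research Questions

RQ1: When individual functions are decompiled from a binary, can the output be recompilable source code that does not violate syntax or compilation constraints?
RQ2: How effectively can source-level APR tools be applied to binary programs through partial decompilation?
RQ3: How well does PRD preserve behavioral equivalence between the original binary and the patched binary?
RQ4: Can PRD enable APR tools working on binaries to perform on par with the same tools targeting full source code?
RQ5: How well does PRD generalize to real-world binaries, multiple programming languages (C/C++), and different vulnerability types?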

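Key Results

Given sufficient type recovery, PRD successfully decompiled and recompiled 70–89% of individual functions, whereas full C-binary decompilation succeeded only 1.7% of the time.
When decompilation succeeded, PRD produced test-equivalent binaries 92.9% of the time, confirming behavioral consistency.
APR tools integrated with PRD mitigated 85 of 148 vulnerabilities in DARPA CGC binaries, matching or exceeding the performance of full-source APR tools.
APR tools enabled by PRD often produced higher-quality patches than those generated by the top CGC teams, demonstrating competitive repair quality.
The method generalizes across datasets including CGC, Rode0Day, and MITRE CVE, and supports C++ and stripped binaries (when the decompiler supports them).
By focusing only on function offsets and referenced types, the method reduces reliance on full type inference, making it more scalable and practical.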
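
To make the idea of recovering only offsets and referenced types concrete, here is a minimal sketch of how a placeholder C struct could be emitted from the field offsets a decompiled function actually touches. This is an illustration under stated assumptions, not PRD's implementation: the helper `emit_struct_stub`, the field naming scheme, the example offsets, and the 32-bit type sizes are all hypothetical.

```python
from typing import Dict

# Assumed type sizes for a 32-bit target; adjust for the actual platform.
SIZES_32BIT = {"int": 4, "unsigned int": 4, "char *": 4}


def emit_struct_stub(name: str, fields: Dict[int, str], sizes: Dict[str, int]) -> str:
    """Emit a C struct that places each accessed field at its observed offset.

    Bytes the function never touches are covered by char padding, so no
    knowledge of the full original type is required.
    """
    lines = [f"struct {name} {{"]
    cursor = 0  # next unassigned byte offset
    pad_id = 0
    for offset in sorted(fields):
        ctype = fields[offset]
        if offset < cursor:
            raise ValueError(f"overlapping field at offset {offset}")
        if offset > cursor:
            lines.append(f"    char pad{pad_id}[{offset - cursor}];")
            pad_id += 1
        lines.append(f"    {ctype} f_{offset:x};")
        cursor = offset + sizes[ctype]
    lines.append("};")
    return "\n".join(lines)


# Hypothetical accesses observed in one decompiled function: offset -> type.
stub = emit_struct_stub(
    "obj_t",
    {0: "int", 8: "char *", 16: "unsigned int"},
    SIZES_32BIT,
)
print(stub)
```

The design point is that the stub only has to be layout-compatible at the offsets the lifted function uses; everything else stays opaque, which is why this kind of inference can succeed where full type recovery fails.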

This review was generated by AI and reviewed by a human editor.