QUICK REVIEW

[Paper Review] ReSIM: Re-ranking Binary Similarity Embeddings to Improve Function Search Performance

Gianluca Capozzi, Anna Paola Giancaspro|arXiv (Cornell University)|2026. 02. 10.

Advanced Malware Detection Techniques | Citations: 0

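One-Line Summary

ReSIM adds a neural re-ranker on top of embedding-based binary function similarity search that scores query-candidate pairs jointly, improving Recall and nDCG across multiple embedding models and datasets.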

ABSTRACT

Binary Function Similarity (BFS), the problem of determining whether two binary functions originate from the same source code, has been extensively studied in recent research across security, software engineering, and machine learning communities. This interest arises from its central role in developing vulnerability detection systems, copyright infringement analysis, and malware phylogeny tools. Nearly all binary function similarity systems embed assembly functions into real-valued vectors, where similar functions map to points that lie close to each other in the metric space. These embeddings enable function search: a query function is embedded and compared against a database of candidate embeddings to retrieve the most similar matches. Despite their effectiveness, such systems rely on bi-encoder architectures that embed functions independently, limiting their ability to capture cross-function relationships and similarities. To address this limitation, we introduce ReSIM, a novel and enhanced function search system that complements embedding-based search with a neural re-ranker. Unlike traditional embedding models, our reranking module jointly processes query-candidate pairs to compute ranking scores based on their mutual representation, allowing for more accurate similarity assessment. By re-ranking the top results from embedding-based retrieval, ReSIM leverages fine-grained relation information that bi-encoders cannot capture. We evaluate ReSIM across seven embedding models on two benchmark datasets, demonstrating consistent improvements in search effectiveness, with average gains of 21.7% in terms of nDCG and 27.8% in terms of Recall.

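Motivation and Objectives

Address the limitation of bi-encoder BFS systems, which embed each function independently and therefore cannot capture cross-function relationships.
Propose a two-stage function search pipeline that combines fast embedding-based retrieval with a cross-encoder re-ranker.
Show that jointly processing query-candidate pairs yields better ranking accuracy than embedding-based search alone.
Demonstrate that the ReSIM approach is robust across embedding models and transfers across datasets and toolchains.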

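Proposed Method

A two-stage pipeline in which an embedding model first retrieves the top-w candidates (the window W).
A neural re-ranker (cross-encoder) jointly processes each (query, candidate) pair to score its similarity, then re-ranks the window W to produce the top-k results.
The re-ranker is trained with a pairwise contrastive/margin ranking objective, using hard negatives drawn from multiple models.
Fine-tuning is performed on DeepSeek-R1-Qwen3-8B (8B parameters) with LoRA adapters and 4-bit QLoRA quantization.
Preprocessing normalizes the two assembly functions before they are concatenated and tokenized.
The approach is agnostic to the underlying embedding model φ and can ensemble multiple φ.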
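The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `two_stage_search`, `jaccard_tokens`, `TOY_DB`, and the toy vectors and assembly strings are all hypothetical, cosine similarity is assumed for stage one, and a simple token-overlap score stands in for the fine-tuned cross-encoder, which the review does not specify in code form.

```python
import math
from typing import Callable, List, Sequence, Tuple

# (function id, embedding vector, normalized assembly text); hypothetical layout
Entry = Tuple[str, Sequence[float], str]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_tokens(query_asm: str, candidate_asm: str) -> float:
    """Stand-in pair scorer: token-set overlap. A real system would call the
    cross-encoder on the concatenated (query, candidate) pair instead."""
    q, c = set(query_asm.split()), set(candidate_asm.split())
    return len(q & c) / len(q | c) if q | c else 0.0


def two_stage_search(
    query_vec: Sequence[float],
    query_asm: str,
    db: List[Entry],
    pair_score: Callable[[str, str], float],
    w: int,
    k: int,
) -> List[str]:
    """Retrieve the top-w window by embedding similarity, then re-rank that
    window with a joint pair scorer and return the top-k function ids."""
    # Stage 1: bi-encoder style retrieval, each function embedded independently.
    window = sorted(db, key=lambda e: cosine(query_vec, e[1]), reverse=True)[:w]
    # Stage 2: joint scoring of (query, candidate) pairs inside the window only.
    reranked = sorted(window, key=lambda e: pair_score(query_asm, e[2]), reverse=True)
    return [func_id for func_id, _, _ in reranked[:k]]


# Toy data, made up for illustration.
TOY_DB: List[Entry] = [
    ("f_a", [1.0, 0.0], "mov eax ebx ret"),
    ("f_b", [0.9, 0.1], "push rbp mov rbp rsp xor eax eax pop rbp ret"),
    ("f_c", [0.0, 1.0], "push rbp mov rbp rsp xor eax eax pop rbp ret"),
]
TOY_QUERY_VEC = [1.0, 0.0]
TOY_QUERY_ASM = "push rbp mov rbp rsp xor eax eax ret"
```

In this toy setup, `two_stage_search(TOY_QUERY_VEC, TOY_QUERY_ASM, TOY_DB, jaccard_tokens, w=2, k=2)` promotes `f_b` above `f_a`, while `f_c` is never scored because it falls outside the window. That is the trade-off the window size w controls: the re-ranker can only reorder what stage one retrieves.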

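Experimental Results

Research Questions

RQ1: How does ReSIM perform across different BFS embedding models and toolchains?
RQ2: How does the window size w affect ReSIM's performance and efficiency?
RQ3: Does combining ReSIM with an ensemble of embedding models provide additional gains over single-model configurations?
RQ4: Does a pre-trained re-ranker model offer transferable benefits when applied to assembly function search?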

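Key Findings

ReSIM consistently improves nDCG@k and Recall@k across seven embedding models and two datasets.
Reported average gains across the evaluated settings: 21.7% in nDCG and 27.8% in Recall.
Older embedding models (e.g., Gemini, SAFE) see the largest gains, and transformer-based models also improve notably.
Combining ReSIM with an ensemble of embedding models yields an additional Recall increase (about 3%) on the multi-toolchain dataset.
Knowledge transfer from the pre-trained re-ranker (DeepSeek-R1-Qwen3-8B) is observed, even though that model was not trained on assembly language.
ReSIM supports a range of k values (5, 10, 15, 20, 25, 30) and shows robust improvements across datasets.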
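For readers unfamiliar with the two metrics reported above, the following sketch shows one common way to compute Recall@k and nDCG@k with binary relevance (a candidate either is or is not a true match for the query). The function names and the example ranking are hypothetical, and the paper's exact metric definitions may differ from this formulation.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant functions that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for func_id in ranked[:k] if func_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: a hit at 0-based position i contributes
    1 / log2(i + 2), normalized by the best achievable score at cutoff k."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, func_id in enumerate(ranked[:k])
        if func_id in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0
```

As a usage example, with the ranking `["a", "b", "c", "d"]` and relevant set `{"a", "c"}`, `recall_at_k(..., k=2)` is 0.5 because only one of the two matches is in the top two. nDCG additionally rewards placing matches higher, so moving `"c"` from third to second place raises nDCG@4 while leaving Recall@4 unchanged.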


This review was generated by AI and reviewed by a human editor.