QUICK REVIEW

[Paper Review] ReSIM: Re-ranking Binary Similarity Embeddings to Improve Function Search Performance

Gianluca Capozzi, Anna Paola Giancaspro|arXiv (Cornell University)|Feb 10, 2026
Advanced Malware Detection Techniques | Cited by 0
ONE-SENTENCE SUMMARY

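ReSIM adds a neural re-ranker on top of embedding-based binary function similarity search. The re-ranker evaluates query-candidate pairs jointly and improves Recall and nDCG across multiple embedding models and datasets.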

ABSTRACT

Binary Function Similarity (BFS), the problem of determining whether two binary functions originate from the same source code, has been extensively studied in recent research across security, software engineering, and machine learning communities. This interest arises from its central role in developing vulnerability detection systems, copyright infringement analysis, and malware phylogeny tools. Nearly all binary function similarity systems embed assembly functions into real-valued vectors, where similar functions map to points that lie close to each other in the metric space. These embeddings enable function search: a query function is embedded and compared against a database of candidate embeddings to retrieve the most similar matches. Despite their effectiveness, such systems rely on bi-encoder architectures that embed functions independently, limiting their ability to capture cross-function relationships and similarities. To address this limitation, we introduce ReSIM, a novel and enhanced function search system that complements embedding-based search with a neural re-ranker. Unlike traditional embedding models, our reranking module jointly processes query-candidate pairs to compute ranking scores based on their mutual representation, allowing for more accurate similarity assessment. By re-ranking the top results from embedding-based retrieval, ReSIM leverages fine-grained relation information that bi-encoders cannot capture. We evaluate ReSIM across seven embedding models on two benchmark datasets, demonstrating consistent improvements in search effectiveness, with average gains of 21.7% in terms of nDCG and 27.8% in terms of Recall.

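MOTIVATION AND OBJECTIVES

  • Identify and address the limitation of bi-encoder BFS systems, which embed each function independently.
  • Propose a two-stage function search pipeline that combines fast embedding retrieval with cross-encoder re-ranking.
  • Show that jointly processing query-candidate pairs yields more accurate rankings than embedding-only retrieval.
  • Demonstrate that ReSIM is robust and transferable across embedding models, datasets, and toolchains.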

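PROPOSED METHOD

  • A two-stage pipeline in which an embedding model retrieves the top w candidates (w is the window size).
  • A neural re-ranker (cross-encoder) jointly processes each (query, candidate) pair to score similarity, re-orders the w candidates, and returns the top k results.
  • The re-ranker is trained with a contrastive (margin-ranking) loss objective, using hard negatives drawn from multiple models.
  • Fine-tuning uses DeepSeek-R1-Qwen3-8B (8B parameters) with LoRA adapters and 4-bit QLoRA quantization.
  • Preprocessing normalizes both assembly functions before they are concatenated and tokenized as cross-encoder input.
  • The method is agnostic to the underlying embedding model φ and can be applied on top of an ensemble of several φ.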
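
The two stages and the margin-ranking objective can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy embeddings, and the toy assembly strings are hypothetical, cosine similarity is assumed for stage 1, and `toy_score_pair` (token overlap) is only a stand-in for the fine-tuned cross-encoder.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embeddings (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: Vector, db_vecs: Sequence[Vector], w: int) -> List[int]:
    """Stage 1: indices of the w database embeddings most similar to the query."""
    order = sorted(range(len(db_vecs)),
                   key=lambda i: cosine(query_vec, db_vecs[i]),
                   reverse=True)
    return order[:w]


def rerank(query_asm: str, cand_ids: Sequence[int], db_asm: Sequence[str],
           score_pair: Callable[[str, str], float], k: int) -> List[int]:
    """Stage 2: score each (query, candidate) pair jointly and keep the top k."""
    order = sorted(cand_ids,
                   key=lambda i: score_pair(query_asm, db_asm[i]),
                   reverse=True)
    return order[:k]


def toy_score_pair(query_asm: str, cand_asm: str) -> float:
    """Hypothetical stand-in for the cross-encoder: Jaccard overlap of tokens."""
    q, c = set(query_asm.split()), set(cand_asm.split())
    return len(q & c) / len(q | c) if q | c else 0.0


def margin_ranking_loss(pos_score: float, neg_score: float,
                        margin: float = 1.0) -> float:
    """Hinge-style loss: zero once the positive outscores the negative by `margin`."""
    return max(0.0, margin - (pos_score - neg_score))


# Toy database (assumed data): embeddings for stage 1, normalized text for stage 2.
db_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
db_asm = ["xor eax eax ret", "push rbp mov eax 1 pop rbp ret", "nop", "mov eax 1 ret"]

window = retrieve([1.0, 0.0], db_vecs, w=3)
top_k = rerank("push rbp mov eax 1 ret", window, db_asm, toy_score_pair, k=2)
print(window, top_k)
```

In this toy run the embedding stage ranks function 0 first, while the pairwise scorer promotes functions 1 and 3 because it sees both token sequences at once. The re-ranker can only reorder what stage 1 returns, so a true match that falls outside the window w is never recovered.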

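EXPERIMENTAL RESULTS

Research Questions

  • RQ1: How does ReSIM perform across diverse BFS embedding models and toolchains?
  • RQ2: How does the window size w affect ReSIM's performance and efficiency?
  • RQ3: Does combining an ensemble of embedding models with ReSIM yield additional gains over a single-model setup?
  • RQ4: Does a pre-trained re-ranking model bring transferable benefits to the assembly function retrieval task?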

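Key Findings

  • ReSIM consistently improves nDCG@k and Recall@k across seven embedding models and two datasets.
  • Average gains across the evaluated settings: 21.7% in nDCG and 27.8% in Recall.
  • Older embedding models (e.g., Gemini, SAFE) show larger gains; transformer-based models also benefit, with notable improvements.
  • On the multi-toolchain dataset, ensembling embedding models together with ReSIM further improves Recall (by about 3%).
  • Knowledge transfer is observed even though the pre-trained re-ranking model (DeepSeek-R1-Qwen3-8B) was not trained on assembly language.
  • ReSIM supports multiple values of k (5, 10, 15, 20, 25, 30) and shows robust improvements across datasets.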
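
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute Recall@k and nDCG@k with binary relevance. It is an assumption that the paper uses exactly these definitions (some works normalize Recall@k differently), and the function names and toy rankings are hypothetical.

```python
import math
from typing import Hashable, Sequence, Set


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant items that appear in the top k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Binary-relevance nDCG@k: DCG of the ranking divided by the ideal DCG."""
    # A hit at 0-based position pos contributes 1 / log2(pos + 2).
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    # The ideal ranking places every relevant item first, up to k of them.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (assumed data): the same candidates before and after re-ranking.
relevant = {"a", "b"}
before = ["a", "x", "b", "y"]
after = ["a", "b", "x", "y"]
for name, ranking in (("before", before), ("after", after)):
    print(name, recall_at_k(ranking, relevant, 2), ndcg_at_k(ranking, relevant, 3))
```

Recall@k only checks whether relevant functions reach the top k, while nDCG@k also rewards placing them higher, which is why a re-ranker that merely reorders the window can move both numbers.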


This review was generated by AI and reviewed by a human editor.