QUICK REVIEW

[Paper Review] Detecting Cryptographically Relevant Software Packages with Collaborative LLMs

Eduard Hirsch, Kristina Raab|arXiv (Cornell University)|Mar 7, 2026

Advanced Malware Detection Techniques|Citations: 0

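One-Sentence Summary

This paper proposes an offline, collaborative large language model (LLM) framework that uses majority voting across multiple locally hosted models to heuristically identify cryptographically relevant software packages, and evaluates it on more than 65,000 Fedora Linux packages.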

ABSTRACT

IT systems are facing an increasing number of security threats, including advanced persistent attacks and future quantum-computing vulnerabilities. The move towards crypto-agility and post-quantum cryptography (PQC) requires a reliable inventory of cryptographic assets across heterogeneous IT environments. Due to the sheer amount of packages, it is infeasible to manually detect cryptographically relevant software. Further, static code analysis pipelines often fail to address the diversity of modern ecosystems. Our research explores the use of large language models (LLMs) as heuristic tools for cryptographic asset discovery. We propose a collaborative framework that employs multiple LLMs to assess software relevance and aggregates their outputs through majority voting. To preserve data privacy, the approach operates on-premises without reliance on external servers. Using over 65,000 Fedora Linux packages, we evaluate the reliability of this method through statistical analysis, inter-model agreement, and manual validation. Preliminary results suggest that LLM ensembles can serve as an efficient first-pass filter for identifying cryptographic software, resulting in reduced manual workload and assisting PQC transition. The study also compares on-premises and online LLM configurations, highlighting key advantages, limitations, and future directions for automated cryptographic asset discovery.

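Motivation and Objectives

Determine whether LLMs can heuristically detect cryptographic functionality in software packages.
Evaluate whether aggregating multiple locally hosted LLMs improves detection quality.
Demonstrate an offline workflow suited to cryptographic asset discovery in enterprise environments.
Provide guidance and open-source artifacts for reproducible cryptographic asset discovery.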

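Proposed Method

Collect a baseline package list from the Fedora package manager, including each package's name, description, and first-level dependencies.
Query multiple local LLMs with a carefully designed prompt that requests JSON output to assess each package's cryptographic relevance.
Aggregate the LLM outputs through a majority-voting scheme to produce the final cryptographic-relevance decision.
Validate model selection and the majority-vote results using labeled samples and cross-validation.
Compare the offline LLM configuration with an online one, analyzing response quality and dependencies.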
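
The JSON-output and majority-voting steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the JSON field name `crypto_relevant`, the helper names, the rule of dropping unparsable replies, the tie handling, and the sample replies are all assumptions made for this example.

```python
import json
from collections import Counter
from typing import Optional


def parse_verdict(raw_reply: str) -> Optional[bool]:
    """Return a model's verdict, or None if the reply is not usable.

    Assumption: each model was asked to answer with a JSON object of the
    form {"crypto_relevant": true} or {"crypto_relevant": false}.
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return None  # reply was not valid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object
    verdict = data.get("crypto_relevant")
    return verdict if isinstance(verdict, bool) else None


def majority_vote(replies: dict[str, str]) -> Optional[bool]:
    """Aggregate per-model replies into one decision.

    Assumption: invalid replies are ignored, and a tie (or no valid reply
    at all) yields None so the package can be flagged for manual review.
    """
    verdicts = [v for v in map(parse_verdict, replies.values()) if v is not None]
    counts = Counter(verdicts)
    if counts[True] == counts[False]:
        return None
    return counts[True] > counts[False]


# Hypothetical replies for a single package, keyed by model name.
sample_replies = {
    "phi": '{"crypto_relevant": true}',
    "deepseek": '{"crypto_relevant": true}',
    "llama": '{"crypto_relevant": false}',
    "mistral": "not json at all",
    "gpt4all": '{"crypto_relevant": true}',
}
decision = majority_vote(sample_replies)
print(decision)
```

Returning None for ties keeps undecided packages visible instead of silently assigning them to one class; the paper may resolve such cases differently.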

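Experimental Results

Research Questions

RQ1: How can LLMs be used to heuristically identify software packages that implement or depend on cryptographic functionality?
RQ2: Can aggregation across multiple LLMs improve the quality of cryptographic-relevance decisions?

Key Findings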

model	size	valid	invalid	error-rate
phi	2.1 GB	65,222	72	0.11%
deepseek	5.2 GB	65,199	95	0.15%
llama	4.4 GB	65,094	200	0.31%
mistral	3.9 GB	64,974	320	0.49%
gpt4all	6.9 GB	64,157	1,137	1.74%
agg	—	63,529	1,765	2.70%
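
The error-rate column is consistent with invalid / (valid + invalid), and every row sums to 65,294 responses. The short check below recomputes the column from the counts in the table. The formula is inferred from the numbers rather than stated in the review, so treat it as an assumption; the function name is illustrative.

```python
# (valid, invalid) response counts copied from the table above.
rows = {
    "phi": (65_222, 72),
    "deepseek": (65_199, 95),
    "llama": (65_094, 200),
    "mistral": (64_974, 320),
    "gpt4all": (64_157, 1_137),
    "agg": (63_529, 1_765),
}


def error_rate_percent(valid: int, invalid: int) -> float:
    """Share of invalid responses among all responses, in percent."""
    return 100.0 * invalid / (valid + invalid)


for model, (valid, invalid) in rows.items():
    total = valid + invalid
    rate = error_rate_percent(valid, invalid)
    print(f"{model:9s} total={total:,} error-rate={rate:.2f}%")
```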

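LLM ensembles can serve as an efficient first-pass filter for identifying cryptographic software assets.
A majority-voting strategy across five local models yields a robust classification of packages.
Manual validation on 390 packages supports iterative refinement and model selection.
The study provides open-source code and data to enable reproducibility.
The offline (on-premises) LLM setup demonstrates the practical feasibility of PQC-related asset discovery.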
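
Comparing majority-vote decisions against a manually labeled sample can be sketched as below. The review does not say which metrics the authors report, so precision and recall are assumed here as common choices, and the votes and labels are made-up toy values, not data from the paper.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Precision and recall of boolean predictions against manual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against division by zero when a class is absent.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Toy example: six packages with ensemble votes and manual labels.
toy_votes = [True, True, False, True, False, False]
toy_labels = [True, False, False, True, True, False]

precision, recall = precision_recall(toy_votes, toy_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For a first-pass filter, recall is usually the more important of the two, since a missed cryptographic package never reaches manual review while a false positive only costs reviewer time.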

This review was generated by AI and checked by a human editor.