QUICK REVIEW

[Paper Review] Detecting Cryptographically Relevant Software Packages with Collaborative LLMs

Eduard Hirsch, Kristina Raab|arXiv (Cornell University)|2026. 03. 07.

Advanced Malware Detection Techniques | Citations: 0

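One-Line Summary

This paper proposes an offline collaborative LLM framework that heuristically identifies cryptographically relevant software packages through majority voting across multiple locally hosted models, and evaluates it on roughly 65,000 Fedora packages.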

ABSTRACT

IT systems are facing an increasing number of security threats, including advanced persistent attacks and future quantum-computing vulnerabilities. The move towards crypto-agility and post-quantum cryptography (PQC) requires a reliable inventory of cryptographic assets across heterogeneous IT environments. Due to the sheer number of packages, it is infeasible to manually detect cryptographically relevant software. Further, static code analysis pipelines often fail to address the diversity of modern ecosystems. Our research explores the use of large language models (LLMs) as heuristic tools for cryptographic asset discovery. We propose a collaborative framework that employs multiple LLMs to assess software relevance and aggregates their outputs through majority voting. To preserve data privacy, the approach operates on-premises without reliance on external servers. Using over 65,000 Fedora Linux packages, we evaluate the reliability of this method through statistical analysis, inter-model agreement, and manual validation. Preliminary results suggest that LLM ensembles can serve as an efficient first-pass filter for identifying cryptographic software, resulting in reduced manual workload and assisting PQC transition. The study also compares on-premises and online LLM configurations, highlighting key advantages, limitations, and future directions for automated cryptographic asset discovery.

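Research Motivation and Objectives

Determine how LLMs can be used to heuristically detect cryptographic functionality in software packages.
Evaluate whether aggregating multiple locally hosted LLMs improves detection quality.
Demonstrate an offline workflow suited to cryptographic asset discovery in enterprise environments.
Provide guidance and open-source artifacts for reproducible cryptographic asset discovery.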

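Proposed Method

Collect a base package list from the package manager (Fedora), including each package's name, description, and first-level dependency information.
For each package, prompt multiple local LLMs with a carefully designed prompt that requests JSON output in order to assess cryptographic relevance.
Aggregate the LLM outputs by majority vote to derive the final cryptographic relevance verdict.
Validate the model selection and the majority-vote results using labeled samples and cross-validation.
Compare offline and online LLM configurations, analyzing response quality and dependencies.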
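The prompting and aggregation steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the JSON field name `crypto_relevant`, the helper names `parse_verdict` and `majority_vote`, and the choice to vote only over valid replies and to leave ties undecided are all assumptions, since the review does not describe those details.

```python
import json
from collections import Counter
from typing import Optional


def parse_verdict(raw: str) -> Optional[bool]:
    """Return the model's verdict, or None if the reply is not usable JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # "crypto_relevant" is a hypothetical field name chosen for this sketch.
    verdict = data.get("crypto_relevant")
    return verdict if isinstance(verdict, bool) else None


def majority_vote(replies: dict[str, str]) -> Optional[bool]:
    """Aggregate raw replies (model name -> reply text) by simple majority.

    Assumption: invalid replies are ignored, and a tie or an empty set of
    valid replies yields None so the package can be reviewed manually.
    """
    verdicts = [parse_verdict(raw) for raw in replies.values()]
    valid = [v for v in verdicts if v is not None]
    if not valid:
        return None
    counts = Counter(valid)
    yes, no = counts[True], counts[False]
    if yes == no:
        return None
    return yes > no


# Toy replies for one package; the contents are invented for illustration.
example_replies = {
    "phi": '{"crypto_relevant": true}',
    "llama": '{"crypto_relevant": true}',
    "mistral": '{"crypto_relevant": false}',
    "deepseek": "not json at all",
    "gpt4all": '{"crypto_relevant": "maybe"}',
}
final_verdict = majority_vote(example_replies)
```

Here three of the five replies are usable, two of them vote yes, and the package is marked as cryptographically relevant. A stricter variant could reject the package whenever any model fails to return valid JSON.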

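Experimental Results

Research Questions

RQ1: How can LLMs be used to heuristically identify software packages that implement or depend on cryptographic functionality?
RQ2: Can aggregating across multiple LLMs improve the quality of cryptographic relevance judgments?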

Key Results

Model	Size	Valid	Invalid	Error rate
phi	2.1 GB	65,222	72	0.11%
deepseek	5.2 GB	65,199	95	0.15%
llama	4.4 GB	65,094	200	0.31%
mistral	3.9 GB	64,974	320	0.49%
gpt4all	6.9 GB	64,157	1,137	1.74%
agg	—	63,529	1,765	2.70%
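The error-rate column can be re-derived from the valid and invalid counts. The short check below assumes the error rate is the number of invalid responses divided by the total number of responses, which is an interpretation of the table rather than something the review states; the counts are copied from the rows above.

```python
# (valid, invalid) response counts per model, copied from the table.
rows = {
    "phi": (65_222, 72),
    "deepseek": (65_199, 95),
    "llama": (65_094, 200),
    "mistral": (64_974, 320),
    "gpt4all": (64_157, 1_137),
    "agg": (63_529, 1_765),
}


def error_rate(valid: int, invalid: int) -> float:
    """Share of invalid responses, in percent (assumed definition)."""
    return 100 * invalid / (valid + invalid)


totals = {name: valid + invalid for name, (valid, invalid) in rows.items()}
rates = {name: f"{error_rate(v, i):.2f}%" for name, (v, i) in rows.items()}

for name in rows:
    print(f"{name:9s} total={totals[name]:,} error={rates[name]}")
```

Under this definition every row sums to 65,294 packages and the computed rates match the table. The five individual invalid counts add up to 1,824, slightly more than the 1,765 in the aggregate row, which would be consistent with that row counting packages for which at least one model returned an invalid response; the review does not confirm this.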

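LLM ensembles can serve as an efficient first-pass filter for identifying cryptographic software assets.
A majority-vote strategy across five local models provides a robust classification of packages.
Manual validation of 390 packages supports iterative refinement and model selection.
The study provides open-source code and data for reproducibility and further research.
The offline (on-premises) LLM configuration shows practical potential for PQC-related asset discovery.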
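The abstract lists inter-model agreement among the reliability checks. One simple way to quantify it is the pairwise percent agreement between models, sketched below. The function name `pairwise_agreement` and the toy verdicts are hypothetical, and the paper may well use a different statistic; the sketch also assumes every model has a verdict for the same non-empty list of packages.

```python
from itertools import combinations


def pairwise_agreement(verdicts: dict[str, list[bool]]) -> dict[tuple[str, str], float]:
    """Fraction of packages on which each pair of models gives the same verdict."""
    result = {}
    for a, b in combinations(sorted(verdicts), 2):
        pairs = list(zip(verdicts[a], verdicts[b]))
        result[(a, b)] = sum(x == y for x, y in pairs) / len(pairs)
    return result


# Invented verdicts for four packages, for illustration only.
toy = {
    "phi": [True, False, True, False],
    "llama": [True, False, False, False],
    "mistral": [True, True, True, False],
}
agreement = pairwise_agreement(toy)
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable when most packages are not cryptographically relevant.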


This review was generated by AI and reviewed by a human editor.