QUICK REVIEW

[Paper Review] How Vulnerable Are Edge LLMs?

Ao Ding, Hongzong Li|arXiv (Cornell University)|Mar 25, 2026

Adversarial Robustness in Machine Learning | Citations: 0

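One-line summary

Summary: The paper shows that quantization does not prevent query-based knowledge extraction from edge-deployed LLMs, and introduces CLIQ, a clustered instruction querying framework that improves extraction efficiency under a limited query budget. The results are demonstrated on quantized Qwen models.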

ABSTRACT

Large language models (LLMs) are increasingly deployed on edge devices under strict computation and quantization constraints, yet their security implications remain unclear. We study query-based knowledge extraction from quantized edge-deployed LLMs under realistic query budgets and show that, although quantization introduces noise, it does not remove the underlying semantic knowledge, allowing substantial behavioral recovery through carefully designed queries. To systematically analyze this risk, we propose CLIQ (Clustered Instruction Querying), a structured query construction framework that improves semantic coverage while reducing redundancy. Experiments on quantized Qwen models (INT8/INT4) demonstrate that CLIQ consistently outperforms original queries across BERTScore, BLEU, and ROUGE, enabling more efficient extraction under limited budgets. These results indicate that quantization alone does not provide effective protection against query-based extraction, highlighting a previously underexplored security risk in edge-deployed LLMs.

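Motivation and objectives

Evaluate whether quantized edge-deployed LLMs leak behavioral knowledge under realistic query budgets.
Develop a structured querying framework that maximizes semantic coverage and minimizes redundancy.
Demonstrate efficient extraction with a limited number of queries using cluster-based instruction querying (CLIQ).
Evaluate extraction efficiency across quantization levels and model sizes.
Provide insight into the security implications of on-device LLM deployment and possible safeguards.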

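Proposed method

Propose CLIQ (Clustered Instruction Querying), which organizes candidate instruction queries into semantic clusters.
Cluster the queries using sentence embeddings and MiniBatchKMeans to form semantic regions.
Generate cluster-aware representative queries by prompting a strong LLM conditioned on each cluster.
Train a student model on the query-response pairs to quantify information leakage and how well model behavior is reproduced.
Compare CLIQ with Original Queries under a fixed query budget (e.g., 1000 queries), using INT8/INT4-quantized teacher models and a student model.
Use BERTScore, BLEU, and ROUGE as metrics of extraction quality.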
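
The cluster-conditioned prompting step can be illustrated with a small sketch. It assumes cluster labels are already available (for example from scikit-learn's `MiniBatchKMeans(...).fit_predict(embeddings)` on sentence embeddings) and shows one possible way to split a query budget across clusters and build one prompt per cluster. The even budget split, the prompt wording, and the helper names `allocate_budget` and `build_cluster_prompts` are all hypothetical; the review does not describe how the paper handles these details.

```python
from collections import defaultdict


def allocate_budget(cluster_ids, budget):
    """Split a query budget evenly across clusters (an assumed policy).

    Any remainder is handed out one query at a time, lowest cluster id first.
    """
    ordered = sorted(cluster_ids)
    base, extra = divmod(budget, len(ordered))
    return {cid: base + (1 if i < extra else 0) for i, cid in enumerate(ordered)}


def build_cluster_prompts(queries, labels, budget, examples_per_cluster=2):
    """Group queries by cluster label and build one generation prompt per cluster."""
    clusters = defaultdict(list)
    for query, label in zip(queries, labels):
        clusters[label].append(query)

    quota = allocate_budget(clusters.keys(), budget)
    prompts = {}
    for cid in sorted(clusters):
        if quota[cid] == 0:
            continue  # budget smaller than the number of clusters
        examples = clusters[cid][:examples_per_cluster]
        bullet_list = "\n".join(f"- {q}" for q in examples)
        # Hypothetical prompt text; it would be sent to a strong LLM.
        prompts[cid] = (
            f"Here are instructions from one topic area:\n{bullet_list}\n"
            f"Write {quota[cid]} new, non-overlapping instructions covering this area."
        )
    return quota, prompts


# Toy data: labels are hard-coded here in place of a real clustering run.
queries = [
    "Summarize this news article.",
    "Write a short summary of the paragraph.",
    "Translate the sentence into French.",
    "Translate this email into German.",
    "Fix the bug in this Python function.",
]
labels = [0, 0, 1, 1, 2]

quota, prompts = build_cluster_prompts(queries, labels, budget=10)
print(quota)
print(prompts[0])
```

Spreading the budget over clusters is one simple way to trade redundancy for coverage: no semantic region receives all the queries, and each prompt is anchored to examples from a single region.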

Figure 1: Overview of the proposed framework for query-based knowledge extraction from edge-deployed quantized LLMs. Previous approaches (blue) rely on unstructured queries, which often lead to redundant probing and noisy responses, resulting in low-fidelity reconstruction of model behavior. CLIQ (r

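Experimental results

Research questions

RQ1: Do quantized edge LLMs retain semantic knowledge that can be extracted through a limited number of query-based interactions?
RQ2: Does structured query construction improve extraction efficiency over naive querying under edge-deployment constraints?
RQ3: How do different quantization levels (INT8 vs. INT4) affect how learnable the edge model's behavior is through queries?
RQ4: How does cluster-aware querying affect reconstruction quality and sample efficiency?

Key findings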

Method	BERT-F1	BLEU	RLsum
Original Queries	77.97	1.05	13.37
CLIQ (Ours)	84.35	2.77	17.50
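
To make the size of the gap concrete, the short sketch below computes absolute and relative gains from the table values. The scores are copied from the table; the `gains` helper is only illustrative and is not part of the paper.

```python
# Scores copied from the table above.
scores = {
    "Original Queries": {"BERT-F1": 77.97, "BLEU": 1.05, "RLsum": 13.37},
    "CLIQ (Ours)": {"BERT-F1": 84.35, "BLEU": 2.77, "RLsum": 17.50},
}


def gains(baseline, improved):
    """Return {metric: (absolute gain in points, relative gain in percent)}."""
    out = {}
    for metric, base in baseline.items():
        delta = improved[metric] - base
        out[metric] = (round(delta, 2), round(100 * delta / base, 1))
    return out


result = gains(scores["Original Queries"], scores["CLIQ (Ours)"])
for metric, (absolute, relative) in result.items():
    print(f"{metric}: +{absolute:.2f} points ({relative:.1f}% relative)")
```

The relative BLEU gain looks large mainly because the baseline is close to 1, so the absolute and relative figures are best read together.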

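Under the same query budget, CLIQ outperforms Original Queries on all of BERT-F1, BLEU, and ROUGE.
A 1.7B INT8-quantized student distilled with CLIQ can match or even exceed the performance of a larger teacher, indicating efficient knowledge transfer through structured queries.
Quantization degrades teacher performance slightly, but structured querying remains effective for behavior extraction.
Under a fixed budget (e.g., 500 queries), CLIQ achieves higher BERT-F1, BLEU, and ROUGE-L than Original Queries, with faster gains and earlier saturation.
CLIQ's extraction efficiency improves rapidly as the number of queries grows from 100 to 300, with diminishing returns beyond that, indicating high sample efficiency.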
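
For readers unfamiliar with the overlap metrics used above, here is a minimal sketch of sentence-level ROUGE-L, which scores a candidate against a reference by their longest common subsequence (LCS). It assumes lowercased whitespace tokenization and an F1 combination of precision and recall; the paper's scores presumably come from a standard evaluation library, and the summary-level variant may differ from this simplified version.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 with naive whitespace tokenization."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy teacher/student outputs, invented for illustration only.
teacher_output = "the cat sat on the mat"
student_output = "the cat lay on a mat"
score = rouge_l_f1(teacher_output, student_output)
print(f"ROUGE-L F1: {score:.3f}")
```

In an extraction setting the reference would be the teacher's response and the candidate the student's response to the same query, so a higher score means the student reproduces more of the teacher's wording in the same order.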

Figure 2: Threat framework for query-based knowledge extraction from quantized edge-deployed LLMs. Traditional extraction settings (top) assume full-precision teacher models in high-performance server environments, where abundant compute allows large-scale query probing. In contrast, edge-deployed L

This review was generated by AI and checked by a human editor.