QUICK REVIEW

[Paper Review] Agentic Diagnostic Reasoning over Telecom and Datacenter Infrastructure

Nicolas Tacheny|arXiv (Cornell University)|Jan 12, 2026

Software System Performance and Reliability | Citations: 0

One-line summary

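The paper proposes a tool-augmented LLM agent that works over a typed infrastructure ontology exposed through the Model Context Protocol (MCP). The agent performs root cause analysis (RCA) and impact propagation without embedded graph algorithms, which makes the diagnostic reasoning protocol-driven.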

ABSTRACT

Large-scale telecom and datacenter infrastructures rely on multi-layered service and resource models, where failures propagate across physical and logical components and affect multiple customers. Traditional approaches to root cause analysis (RCA) rely on hard-coded graph traversal algorithms or rule-based correlation engines, which are costly to maintain and tightly coupled to the infrastructure model. In this work, we introduce an agentic diagnostic framework where a Large Language Model (LLM) performs step-wise investigation using a constrained tool space exposed through the Model Context Protocol (MCP). Instead of embedding causal logic or traversal algorithms into the application, the agent autonomously navigates the infrastructure model by invoking tools for service lookup, dependency retrieval, structured and unstructured data, and event analysis, and impact discovery. We define an investigation protocol that structures the agent's reasoning and ensures grounding, reproducibility, and safe handling of missing or ambiguous information. This work lays the foundation for autonomous incident resolution and change impact mitigation. Future systems will not only diagnose and remediate infrastructure failures, but also predict the impact of planned changes on services and customers, enabling operators to mitigate risks before executing maintenance operations.

Motivation and objectives

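Introduce a tool-augmented agent framework that performs RCA and impact propagation over a multi-layered infrastructure model.
Replace rigid graph algorithms with an LLM that reasons through constrained tool interactions.
Ensure grounding, reproducibility, and safety through an RCA investigation protocol and the MCP interface.
Demonstrate feasibility in practical scenarios where reasoning emerges from structured tool calls.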

Proposed method

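Define the infrastructure ontology as a typed graph with Service, Resource, Party, Event, and Note nodes and Implements, AllocatedTo, AffectedBy, ServiceOf, and HasNote edges.
Expose the ontology through MCP tools so that reasoning is decoupled from the storage backend (graph database, relational database, and so on).
Propose an RCA investigation protocol that sequences the steps from incident extraction to identification of the root cause and the affected parties, with explicit handling of uncertainty.
Use an LLM to carry out step-wise, tool-grounded reasoning over data obtained only through MCP tools.
Evaluate the tool-grounded reasoning on synthetic scenarios, verifying the results against the true root cause and impact.
Analyze faithfulness through static traces of tool usage and entity mentions, in order to prevent hallucination.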
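The method steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: only the node and edge type names come from the review. The edge directions, the four tool names, the example identifiers, and the scripted `investigate` loop (which stands in for the step choices an LLM would make) are all hypothetical, and the tools are plain functions rather than tools registered with a real MCP server.

```python
from dataclasses import dataclass, field

# Type names are taken from the review; everything else here is an assumption.
NODE_TYPES = {"Service", "Resource", "Party", "Event", "Note"}
EDGE_TYPES = {"Implements", "AllocatedTo", "AffectedBy", "ServiceOf", "HasNote"}


@dataclass
class Ontology:
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)  # (source id, edge type, target id)

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = node_type

    def add_edge(self, src, edge_type, dst):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((src, edge_type, dst))

    def targets(self, src, edge_type):
        return sorted(d for s, t, d in self.edges if s == src and t == edge_type)

    def sources(self, edge_type, dst):
        return sorted(s for s, t, d in self.edges if t == edge_type and d == dst)


def make_tools(onto):
    """Hypothetical tool set. Assumed directions: Resource -Implements-> Service,
    Service -ServiceOf-> Party, Resource -AffectedBy-> Event."""
    return {
        "get_dependencies": lambda service: onto.sources("Implements", service),
        "get_services": lambda resource: onto.targets(resource, "Implements"),
        "get_events": lambda node: onto.targets(node, "AffectedBy"),
        "get_customers": lambda service: onto.targets(service, "ServiceOf"),
    }


def investigate(tools, service):
    """Scripted stand-in for the agent: all data is obtained through tool calls."""
    trace = []

    def call(name, arg):
        result = tools[name](arg)
        trace.append((name, arg, result))
        return result

    root_cause = None
    for resource in call("get_dependencies", service):
        if call("get_events", resource):
            root_cause = resource
            break
    if root_cause is None:
        # Explicit uncertainty: report "unknown" instead of guessing a cause.
        return {"root_cause": None, "impacted_services": [],
                "impacted_parties": [], "trace": trace}
    services = call("get_services", root_cause)
    parties = sorted({p for s in services for p in call("get_customers", s)})
    return {"root_cause": root_cause, "impacted_services": services,
            "impacted_parties": parties, "trace": trace}


onto = Ontology()
for node_id, node_type in [("svc-a", "Service"), ("svc-b", "Service"),
                           ("rtr-1", "Resource"), ("sw-2", "Resource"),
                           ("acme", "Party"), ("globex", "Party"),
                           ("evt-9", "Event")]:
    onto.add_node(node_id, node_type)
onto.add_edge("rtr-1", "Implements", "svc-a")
onto.add_edge("rtr-1", "Implements", "svc-b")
onto.add_edge("sw-2", "Implements", "svc-a")
onto.add_edge("svc-a", "ServiceOf", "acme")
onto.add_edge("svc-b", "ServiceOf", "globex")
onto.add_edge("rtr-1", "AffectedBy", "evt-9")

report = investigate(make_tools(onto), "svc-a")
print(report["root_cause"], report["impacted_services"], report["impacted_parties"])
```

Because `investigate` sees only the tool dictionary, the storage behind `Ontology` could be swapped for a graph or relational database without touching the investigation logic, which is the decoupling the review attributes to the MCP layer.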

Experimental results

Research questions

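RQ1: Can an LLM agent reliably perform RCA and impact propagation over a typed infrastructure ontology when it is constrained to a fixed MCP tool interface?
RQ2: How do model size and accessibility affect investigation accuracy, RCA accuracy, and impact accuracy under a structured protocol?
RQ3: Does the MCP abstraction preserve grounding and safety, preventing hallucination and tool misuse regardless of model capability?
RQ4: What trade-offs exist between closed-form protocol adherence and model performance in operational diagnosis?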

Key findings

Model	Investigation	RCA	Impact	Avg Duration
Claude Haiku 3.5 (Anthropic)	100%	100%	100%	20.9s
Llama 3.1 8B Instant (Groq)	79%	91.1%	86.1%	3.9s
GPT-OSS-120B (Groq)	99%	100%	99%	11.6s

Claude Haiku 3.5 reached 100% accuracy on investigation, RCA, and impact, averaging 20.9 s per investigation.
GPT-OSS-120B reached 99% investigation accuracy, 100% RCA accuracy, and 99% impact accuracy, averaging 11.6 s per investigation.
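Llama 3.1 8B Instant reached 79% investigation accuracy, 91.1% RCA accuracy, and 86.1% impact accuracy at 3.9 s per investigation; 21% of its tool calls in successful runs were erroneous.
Smaller models are more prone to hallucination and protocol-adherence problems, while larger models show high faithfulness and near-perfect diagnostic performance.
The MCP interface showed zero tool misuse across all models, which indicates that it strongly constrains agent behavior.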
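The faithfulness and tool-misuse findings depend on checking each investigation trace after the fact. One possible shape for such a check is sketched below. The review does not describe the paper's actual analysis, so the trace format (a list of `(tool, argument, result)` tuples), the allowed tool names, and the example identifiers are assumptions made for illustration.

```python
# Hypothetical fixed tool interface; the names are illustrative only.
ALLOWED_TOOLS = {"get_dependencies", "get_events", "get_customers"}


def audit(trace, answer_entities, seed_entities=()):
    """Return (misused_calls, ungrounded_entities) for one investigation.

    trace: list of (tool_name, argument, result_list) tuples.
    answer_entities: identifiers mentioned in the agent's final answer.
    seed_entities: identifiers given in the incident description.
    """
    # Tool misuse: any call to a tool outside the fixed interface.
    misused = [name for name, _arg, _result in trace if name not in ALLOWED_TOOLS]

    # Grounding: an entity counts as grounded only if it was in the incident
    # description or was returned by some tool call.
    seen = set(seed_entities)
    for _name, _arg, result in trace:
        seen.update(result)
    ungrounded = sorted(set(answer_entities) - seen)
    return misused, ungrounded


trace = [
    ("get_dependencies", "svc-a", ["rtr-1", "sw-2"]),
    ("get_events", "rtr-1", ["evt-9"]),
]
misused, ungrounded = audit(trace, ["rtr-1", "evt-9", "fw-7"], seed_entities=["svc-a"])
print(misused, ungrounded)
```

In this example the answer mentions `fw-7`, which no tool ever returned, so it is flagged as ungrounded, while the empty `misused` list corresponds to the "zero tool misuse" outcome reported above.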

This review was generated by AI and checked by a human editor.