QUICK REVIEW

[Paper Review] Structured Episodic Event Memory

Zhengxuan Lu, Dongfang Li | arXiv (Cornell University) | Jan 10, 2026

Topic Modeling | Citations: 0

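One-Line Summary

SEEM introduces a hierarchical memory system that couples a graph memory layer with a dynamic episodic memory layer, improving long-horizon reasoning in LLM-based agents. It outperforms the baselines on LoCoMo and LongMemEval.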

ABSTRACT

Current approaches to memory in Large Language Models (LLMs) predominantly rely on static Retrieval-Augmented Generation (RAG), which often results in scattered retrieval and fails to capture the structural dependencies required for complex reasoning. For autonomous agents, these passive and flat architectures lack the cognitive organization necessary to model the dynamic and associative nature of long-term interaction. To address this, we propose Structured Episodic Event Memory (SEEM), a hierarchical framework that synergizes a graph memory layer for relational facts with a dynamic episodic memory layer for narrative progression. Grounded in cognitive frame theory, SEEM transforms interaction streams into structured Episodic Event Frames (EEFs) anchored by precise provenance pointers. Furthermore, we introduce an agentic associative fusion and Reverse Provenance Expansion (RPE) mechanism to reconstruct coherent narrative contexts from fragmented evidence. Experimental results on the LoCoMo and LongMemEval benchmarks demonstrate that SEEM significantly outperforms baselines, enabling agents to maintain superior narrative coherence and logical consistency.

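Motivation and Objectives

Address scattered retrieval and weak long-term memory in LLM-based agents.
Develop a two-layer memory system that combines a Graph Memory Layer (GML) for static relational facts with an Episodic Memory Layer (EML) for dynamic narrative progression.
Tie each memory unit to its exact provenance through pointers so that context for complex reasoning can be reconstructed coherently.
Evaluate SEEM on LoCoMo and LongMemEval against memory-augmented and dense-retrieval baselines.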

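Proposed Method

Convert the interaction stream into two memory layers: the EML, which stores Episodic Event Frames (EEFs), and the GML, which stores relational quadruples.
Extract EEFs from passages with an LLM-based extractor and anchor them with provenance pointers; merge related frames through associative fusion.
Ground relational quadruples in the passages and merge similar nodes to build the relational graph.
Use hybrid retrieval with Relational Propagation and Reverse Provenance Expansion (RPE), which extends the retrieved context through provenance links.
Serialize the expanded passages, EEFs, and relational facts into the final context for conditional generation by the LLM.
Evaluate with lexical and semantic metrics (BLEU-1, F1, and the LLM-judge score J) plus accuracy on LongMemEval, and run ablations and case studies.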
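
As a rough illustration of the steps above, the sketch below models passages, Episodic Event Frames with provenance pointers, and one plausible reading of Reverse Provenance Expansion: start from the retrieved passages, find the frames anchored to them, then pull in every passage those frames point to. The class names (`Passage`, `EventFrame`), the keyword-overlap retriever, and the context format are all assumptions made for illustration; the paper's LLM-based extractor, graph layer, and actual retrieval scoring are not reproduced here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    pid: str
    text: str


@dataclass(frozen=True)
class EventFrame:
    """Hypothetical stand-in for an Episodic Event Frame (EEF)."""
    summary: str
    provenance: tuple  # ids of the passages this frame was extracted from


def tokens(text):
    """Lowercased word set; a toy substitute for dense embeddings."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, passages, k=1):
    """Rank passages by word overlap with the query (assumed retriever)."""
    q = tokens(query)
    return sorted(passages, key=lambda p: len(q & tokens(p.text)), reverse=True)[:k]


def reverse_provenance_expansion(seeds, frames, store):
    """Seed passages -> frames anchored to them -> all passages those frames cite."""
    seed_ids = [p.pid for p in seeds]
    hit_frames = [f for f in frames if set(seed_ids) & set(f.provenance)]
    linked_ids = [pid for f in hit_frames for pid in f.provenance]
    ordered_ids = list(dict.fromkeys(seed_ids + linked_ids))  # dedupe, keep order
    return hit_frames, [store[pid] for pid in ordered_ids]


def build_context(passages, frames):
    """Serialize passages and frames into one prompt context string."""
    lines = [f"[passage {p.pid}] {p.text}" for p in passages]
    lines += [f"[event] {f.summary}" for f in frames]
    return "\n".join(lines)


# Made-up example data, not taken from the paper or its benchmarks.
store = {
    "p1": Passage("p1", "Alice said she adopted a dog named Rex in March."),
    "p2": Passage("p2", "Later Alice mentioned Rex got sick and she took him to the vet."),
    "p3": Passage("p3", "Bob talked about his new job."),
}
frames = [
    EventFrame("Alice adopts Rex, who later needs a vet visit", ("p1", "p2")),
    EventFrame("Bob starts a new job", ("p3",)),
]

seeds = retrieve("When did Alice adopt a dog?", list(store.values()), k=1)
hit_frames, expanded = reverse_provenance_expansion(seeds, frames, store)
context = build_context(expanded, hit_frames)
print(context)
```

In this toy run the query matches only the adoption passage lexically, but the shared frame brings the later vet passage into the context as well, which is the kind of narrative reconstruction the method aims for.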

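Experimental Results

Research Questions

RQ1: Can a hierarchical memory architecture improve coherence and factual consistency in long-term dialogue over flat or purely dense-retrieval approaches?
RQ2: Do structured Episodic Event Frames and associative fusion preserve narrative progression and temporal reasoning better than existing memory systems?
RQ3: How does Reverse Provenance Expansion affect context completeness and reasoning quality?
RQ4: How much does each SEEM component (EEF, RPE, fact provisioning from the GML, Relational Propagation) contribute to overall performance?

Key Findings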

Method	LoCoMo BLEU-1	LoCoMo F1	LoCoMo J	LongMemEval Acc.
KaLM-Embedding-V2.5	44.4	47.9	64.6	55.6
NV-Embed-v2	53.0	57.9	74.7	58.4
Mem0	34.2	43.3	54.1	56.7
A-MEM	45.7	44.6	61.9	55.2
HippoRAG 2	53.8	58.3	76.2	60.6
SEEM (Ours)	56.1	61.1	78.0	65.0

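SEEM achieves the highest scores on both LoCoMo and LongMemEval across the lexical and semantic metrics.
On LoCoMo, SEEM reaches BLEU-1 56.1, F1 61.1, and J 78.0, exceeding HippoRAG 2 by 2.8 points in F1 and 1.8 points in J.
On LongMemEval, SEEM reaches 65.0% accuracy, a 4.4-point improvement over HippoRAG 2.
SEEM also beats dense-retrieval baselines such as NV-Embed-v2 on the semantic score J (78.0 vs. 74.7) and on LongMemEval accuracy (65.0 vs. 58.4), which suggests stronger narrative grounding and consistency.
In the ablations, every core component (EEF, RPE, Relational Propagation, Fact Provisioning) contributes: removing any one of them lowers the metrics.
Temporal and adversarial reasoning benefit markedly from the Episodic Memory Layer (EML) and provenance grounding.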
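
The margins quoted in these findings can be recomputed directly from the results table. The short script below is only a sanity check: the scores are copied from the table, while the helper name `margin_over` and the dictionary layout are illustrative choices.

```python
# Scores copied from the results table, in the order (BLEU-1, F1, J, Acc.).
METRICS = ("BLEU-1", "F1", "J", "Acc.")
scores = {
    "KaLM-Embedding-V2.5": (44.4, 47.9, 64.6, 55.6),
    "NV-Embed-v2": (53.0, 57.9, 74.7, 58.4),
    "Mem0": (34.2, 43.3, 54.1, 56.7),
    "A-MEM": (45.7, 44.6, 61.9, 55.2),
    "HippoRAG 2": (53.8, 58.3, 76.2, 60.6),
    "SEEM (Ours)": (56.1, 61.1, 78.0, 65.0),
}


def margin_over(method, baseline):
    """Per-metric difference (method minus baseline), rounded to one decimal."""
    return {
        m: round(a - b, 1)
        for m, a, b in zip(METRICS, scores[method], scores[baseline])
    }


vs_hipporag = margin_over("SEEM (Ours)", "HippoRAG 2")
vs_nv_embed = margin_over("SEEM (Ours)", "NV-Embed-v2")

# Which method has the top score on each metric?
best_per_metric = {
    m: max(scores, key=lambda name: scores[name][i])
    for i, m in enumerate(METRICS)
}

print(vs_hipporag)
print(vs_nv_embed)
print(best_per_metric)
```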


This review was generated by AI and checked by a human editor.