QUICK REVIEW

[Paper Review] From Lossy to Verified: A Provenance-Aware Tiered Memory for Agents

Qiming Zhu, Shunian Chen | arXiv (Cornell University) | Feb 20, 2026

Scientific Computing and Data Management | Cited by 2

One-Sentence Summary

TL;DR: TierMem introduces a provenance-linked two-tier memory for long-horizon agents. It answers from fast summaries by default, escalates to immutable raw logs only when the summaries are insufficient, and writes verified findings back to the summary tier so that later queries are cheaper.

ABSTRACT

Long-horizon agents often compress interaction histories into write-time summaries. This creates a fundamental write-before-query barrier: compression decisions are made before the system knows what a future query will hinge on. As a result, summaries can cause unverifiable omissions: decisive constraints (e.g., allergies) may be dropped, leaving the agent unable to justify an answer with traceable evidence. Retaining raw logs restores an authoritative source of truth, but grounding on raw logs by default is expensive: many queries are answerable from summaries, yet raw grounding still requires processing far longer contexts, inflating token consumption and latency. We propose TierMem, a provenance-linked framework that casts retrieval as an inference-time evidence allocation problem. TierMem uses a two-tier memory hierarchy to answer with the cheapest sufficient evidence: it queries a fast summary index by default, and a runtime sufficiency router escalates to an immutable raw-log store only when summary evidence is insufficient. TierMem then writes back verified findings as new summary units linked to their raw sources. On LoCoMo, TierMem achieves 0.851 accuracy (vs. 0.873 raw-only) while reducing input tokens by 54.1% and latency by 60.7%.

Research Motivation and Goals

Identify the write-before-query barrier in long-horizon agent memory and its impact on verifiability.
Propose a two-tier memory (summary + immutable raw log) with provenance links to enable selective escalation.
Develop a lightweight router to decide when summaries suffice vs. escalation to raw logs.
Enable online consolidation by writing back verified findings to the summary tier to amortize raw-access costs.

Proposed Method

Two-tier memory: Tier-1 provenance-linked summaries and Tier-2 immutable raw logs with stable page IDs.
Inference-time router π_θ that chooses, for each query, between Answer (respond from Tier-1 summaries) and Escalate (ground the answer in Tier-2 raw logs).
Provenance-guided escalation prioritizes linked Tier-2 pages; bounded multi-hop raw retrieval when needed.
Verified write-back writes grounded findings back to Tier-1 with provenance to maintain traceability.
Router training via supervised signals from an oracle (summary-only vs. raw-grounded) and cost-aware alignment (GRPO).
Evaluation on the LoCoMo and LongMemEval benchmarks to measure accuracy, efficiency, and omission rates.

Experimental Results

Research Questions

RQ1: Does TierMem improve the accuracy–efficiency trade-off relative to summary-only and raw-only baselines on long-horizon memory benchmarks?
RQ2: Can a lightweight router reliably detect evidence insufficiency with negligible overhead?
RQ3: Do provenance pointers improve grounding quality of escalated queries?
RQ4: Does online consolidation amortize raw-access costs over time by pushing verified findings into Tier-1?

Key Findings

On LoCoMo, the TierMem router achieves 0.851 accuracy versus 0.873 for raw-only while reducing input tokens by 54.1% and latency by 60.7%.
Summary-only methods exhibit notable unverifiable omission rates (UOR 14.7%–23.3% on LoCoMo).
On LongMemEval, TierMem mitigates summary loss by routing insufficient cases to raw grounding, sustaining better accuracy than summary-only baselines.
Linked provenance pointers yield higher accuracy for escalated queries (Linked 85.1% vs No-Linked 83.6% in their ablation).
Consolidation via online write-back increases cheap-path coverage over replay epochs, reducing average tokens and latency for later queries.
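
To put the LoCoMo headline numbers in perspective, the short Python calculation below converts them into an accuracy gap and remaining cost shares. It assumes the reported token and latency reductions are measured against the raw-only baseline, which the summary implies but does not state outright; the variable names are illustrative.

```python
# Figures reported for LoCoMo in this review.
raw_acc, tier_acc = 0.873, 0.851
token_cut, latency_cut = 0.541, 0.607

acc_gap = raw_acc - tier_acc        # absolute accuracy given up
acc_retained = tier_acc / raw_acc   # fraction of raw-only accuracy kept
token_share = 1 - token_cut         # share of input tokens still spent
latency_share = 1 - latency_cut     # share of latency still incurred

print(f"accuracy gap:      {acc_gap:.3f}")
print(f"accuracy retained: {acc_retained:.1%}")
print(f"token share:       {token_share:.1%}")
print(f"latency share:     {latency_share:.1%}")
```

Under that assumption, TierMem gives up about 2.2 accuracy points (keeping roughly 97.5% of raw-only accuracy) while spending about 45.9% of the input tokens and 39.3% of the latency.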

This review was generated by AI and checked by human editors.