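[Paper Digest] Spectral Characterization and Mitigation of Sequential Knowledge Editing Collapse
This paper uses the spectral properties of weight matrices to analyze why sequential knowledge editing causes a model's general abilities to collapse. It proposes REVIVE, a plug-and-play method that preserves the dominant spectral subspace during updates to improve performance over long editing sequences.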
Sequential knowledge editing in large language models often causes catastrophic collapse of the model's general abilities, especially for parameter-modifying methods. Existing approaches mitigate this issue through heuristic constraints on parameter updates, yet the mechanisms underlying such degradation remain insufficiently understood. In this work, we present a spectral analysis of sequential knowledge editing and show that a model's general abilities are closely associated with dominant singular directions of pretrained weight matrices. These directions are highly sensitive to perturbations and are progressively disrupted by repeated edits, closely tracking the collapse in both editing efficacy and general performance. Building on this insight, we propose REVIVE, a plug-and-play framework that stabilizes sequential editing by explicitly preserving the dominant singular subspace. REVIVE represents parameter updates in the spectral basis of the original weights and filters components that would interfere with the protected region. Extensive experiments across multiple models and benchmarks show that REVIVE consistently improves editing efficacy while substantially preserving general abilities under long-horizon sequential editing, including extreme settings with up to 20,000 edits.
Research Motivation and Objectives
- Identify how sequential edits affect a model's general abilities through spectral properties of pretrained weight matrices.
- Demonstrate that dominant singular directions are crucial for general abilities and are fragile under perturbations.
- Develop a plug-and-play framework (REVIVE) to preserve dominant spectral structure during edits.
- Evaluate REVIVE across multiple models and long-horizon editing benchmarks to show improved editing efficacy and preserved general abilities.
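Proposed Method
- Express each parameter update in the singular-vector basis of the original weight matrix, decomposing the edit into spectral components (Eq. 4).
- Use an energy threshold τ to determine the dominant singular subspace, then construct a safe update by removing the components that interfere with that region (Eq. 5 and Eq. 6).
- Apply REVIVE as a plug-and-play wrapper that preserves the dominant subspace while still allowing edits in low-energy directions.
- Monitor drift of the dominant subspace during sequential editing with two spectral metrics, Low-rank Subspace Similarity and Singular Vector Similarity.
- Evaluate on LLaMA3, GPT-J, and GPT-2-XL using the COUNTERFACT and ZSRE benchmarks, comparing against MEMIT, PRUNE, RECT, ALPHAEDIT, DELTAEDIT, and NSE.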
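The spectral filtering steps listed above can be illustrated with a minimal NumPy sketch. This digest does not reproduce Eq. 4 to Eq. 6, so several details here are assumptions rather than the authors' formulation: energy is taken to be the cumulative sum of squared singular values, the default `tau=0.9` is arbitrary, and the filter simply zeroes every spectral coefficient whose left or right index falls inside the dominant set. The function names `dominant_rank` and `filter_update` are hypothetical.

```python
import numpy as np


def dominant_rank(singular_values, tau):
    """Smallest k whose top-k squared singular values hold at least
    a fraction tau of the total energy (assumed energy definition)."""
    energy = np.cumsum(singular_values ** 2)
    ratio = energy / energy[-1]
    k = int(np.searchsorted(ratio, tau)) + 1
    return min(k, len(singular_values))


def filter_update(W0, delta, tau=0.9):
    """Project an edit `delta` away from the dominant singular
    subspace of the original weights `W0` (illustrative rule only)."""
    # Full SVD so that U and Vt are square and delta = U @ C @ Vt exactly.
    U, S, Vt = np.linalg.svd(W0, full_matrices=True)
    k = dominant_rank(S, tau)

    # Spectral coefficients of the update in the basis of W0.
    C = U.T @ delta @ Vt.T

    # Assumed filter: drop anything that touches a dominant
    # left or right singular direction.
    C[:k, :] = 0.0
    C[:, :k] = 0.0

    # Map the filtered coefficients back to weight space.
    return U @ C @ Vt, k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W0 = rng.standard_normal((6, 4))
    delta = rng.standard_normal((6, 4))
    safe_delta, k = filter_update(W0, delta, tau=0.5)
    print("protected directions:", k)
    print("update norm before / after:",
          np.linalg.norm(delta), np.linalg.norm(safe_delta))
```

Because the filter only needs the original weights and the proposed update, a wrapper of this shape could in principle sit after any editing method that produces a weight delta, which is one way to read the "plug-and-play" description. A larger `tau` protects more directions and leaves less room for the edit.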
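Experimental Results
Research Questions
- RQ1: Which spectral structures of pretrained weight matrices matter most for general abilities?
- RQ2: How does sequential editing perturb these spectral structures, and how does that relate to performance collapse?
- RQ3: Can preserving the dominant singular subspace during editing stabilize long-horizon sequential editing without sacrificing editing efficacy?
Key Findings
- General abilities are highly concentrated in the top 5% of singular components; these components alone recover about 62.6% of performance.
- The dominant spectral directions are extremely sensitive to perturbation and degrade rapidly when perturbed, which correlates with performance collapse.
- Sequential editing progressively alters the dominant singular subspace; metrics such as Low-rank Subspace Similarity (LS) and Singular Vector Similarity (SS) reveal both macro-level drift and micro-level rotation of the dominant directions.
- REVIVE consistently improves editing efficacy across models and benchmarks, and substantially preserves general abilities on GLUE under long edit sequences (up to 20,000 edits).
- REVIVE reduces abnormal weight-norm growth over long edit sequences, indicating improved stability.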
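The findings on spectral concentration and subspace drift can be probed with a sketch like the one below. The digest does not give the paper's definitions of LS and SS, so both metrics here are stand-ins chosen for illustration: the subspace score is the normalized squared Frobenius norm of the overlap between the two top-k left singular bases, and the per-vector score is the mean absolute cosine between matched singular vectors. All function names are hypothetical, and the random matrices are toy data, not model weights.

```python
import numpy as np


def top_fraction_reconstruction(W, fraction=0.05):
    """Rebuild W from only its largest singular components."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    k = max(1, int(np.ceil(fraction * len(S))))
    return (U[:, :k] * S[:k]) @ Vt[:k], k


def _top_left_vectors(W, k):
    # Top-k left singular vectors, one per column.
    return np.linalg.svd(W, full_matrices=False)[0][:, :k]


def subspace_similarity(W_before, W_after, k):
    """Assumed subspace-level score in [0, 1]; 1 means the top-k
    subspaces coincide (captures overall drift)."""
    Ub = _top_left_vectors(W_before, k)
    Ua = _top_left_vectors(W_after, k)
    return float(np.linalg.norm(Ub.T @ Ua, "fro") ** 2 / k)


def singular_vector_similarity(W_before, W_after, k):
    """Assumed per-vector score in [0, 1]; mean |cosine| between the
    i-th vectors before and after (captures rotation in the subspace)."""
    Ub = _top_left_vectors(W_before, k)
    Ua = _top_left_vectors(W_after, k)
    return float(np.mean(np.abs(np.sum(Ub * Ua, axis=0))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((40, 20))
    W_edited = W.copy()
    for step in range(1, 6):
        # Toy stand-in for one sequential edit: a small random delta.
        W_edited = W_edited + 0.3 * rng.standard_normal(W.shape)
        print(step,
              round(subspace_similarity(W, W_edited, k=2), 3),
              round(singular_vector_similarity(W, W_edited, k=2), 3))
```

Tracking both scores separately is useful because a subspace can stay nearly fixed as a whole while the individual vectors inside it rotate, which matches the digest's distinction between macro-level drift and micro-level rotation.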
This digest was generated by AI and reviewed by human editors.