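[Paper Digest] Multiplicative Orthogonal Sequential Editing for Language Models
MOSE introduces an orthogonal, multiplicative editing paradigm that preserves numerical stability when updating knowledge, improving sequential editing performance while retaining general abilities on downstream tasks.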
Knowledge editing aims to efficiently modify the internal knowledge of large language models (LLMs) without compromising their other capabilities. The prevailing editing paradigm, which adds an update matrix to the original parameter matrix, has been shown by some studies to damage key numerical stability indicators (such as the condition number and norm), thereby reducing editing performance and general abilities, especially in sequential editing scenarios. Although subsequent methods have made some improvements, they remain within the additive framework and have not fundamentally addressed this limitation. To solve this problem, we analyze it from both statistical and mathematical perspectives and conclude that multiplying the original matrix by an orthogonal matrix does not change the numerical stability of the matrix. Inspired by this, and in contrast to the previous additive editing paradigm, we propose a multiplicative editing paradigm termed Multiplicative Orthogonal Sequential Editing (MOSE). Specifically, we first derive the matrix update in multiplicative form; the new knowledge is then incorporated into an orthogonal matrix, which is multiplied by the original parameter matrix. In this way, the numerical stability of the edited matrix is unchanged, thereby maintaining editing performance and general abilities. We compared MOSE with several current knowledge editing methods, systematically evaluating their impact on both editing performance and general abilities across three different LLMs. Experimental results show that MOSE effectively limits deviations in the edited parameter matrix and maintains its numerical stability. Compared to current methods, MOSE achieves a 12.08% improvement in sequential editing performance, while retaining 95.73% of general abilities across downstream tasks. The code is available at https://github.com/famoustourist/MOSE.
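The invariance claim in the abstract follows from a short standard argument. The block below is a sketch of that argument, not the paper's own derivation. The symbols $W$ (original parameter matrix), $Q$ (orthogonal matrix with $Q^\top Q = I$), $\sigma_i$ (singular values), and $\kappa_2$ (2-norm condition number) are notation chosen here, and left multiplication is assumed for concreteness.

```latex
\begin{aligned}
\|QW\|_F^2 &= \operatorname{tr}\!\left(W^\top Q^\top Q W\right)
            = \operatorname{tr}\!\left(W^\top W\right) = \|W\|_F^2, \\
(QW)^\top (QW) &= W^\top W
  \;\Longrightarrow\; \sigma_i(QW) = \sigma_i(W) \ \text{for all } i, \\
\kappa_2(QW) &= \frac{\sigma_{\max}(QW)}{\sigma_{\min}(QW)}
             = \frac{\sigma_{\max}(W)}{\sigma_{\min}(W)} = \kappa_2(W).
\end{aligned}
```

An additive update $W + \Delta$ generally changes the singular values, so it carries no comparable guarantee.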
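Research Motivation and Objectives
- Address the instability of additive knowledge editing under successive updates.
- Propose a multiplicative editing framework based on orthogonal transformations that preserves the norm and condition number.
- Show that MOSE maintains editing performance while retaining general abilities on downstream tasks.
- Evaluate MOSE against existing methods across multiple LLMs and editing datasets.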
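Proposed Method
- Replace the additive update by left-multiplying the original parameter matrix with an orthogonal update matrix.
- Formulate the update as a constrained least-squares (Orthogonal Procrustes) problem to find the optimal orthogonal transformation.
- Use a regularized objective that balances preserving existing knowledge against fitting new knowledge.
- Select the layers to edit using layer activation as the criterion, and extend the edit to adjacent layers to improve performance.
- Provide an analytical proof that left multiplication by an orthogonal matrix preserves the Frobenius norm and the condition number of the matrix.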
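To make the Orthogonal Procrustes step concrete, here is a minimal NumPy sketch of the classical closed-form solution via SVD. It is not the authors' implementation. The helper name `procrustes_edit`, the key/value matrices `K_new` and `V_new`, the preserved keys `K_old`, and the weight `lam` are all hypothetical, and the form of the regularization term is an assumption, since this digest does not spell out the paper's exact objective.

```python
import numpy as np


def procrustes_edit(W, K_new, V_new, K_old=None, lam=1.0):
    """Hypothetical helper: orthogonal Q minimizing
    ||Q W K_new - V_new||_F^2 + lam * ||Q W K_old - W K_old||_F^2.

    The second (regularization) term is an assumed form, not taken from the paper.
    """
    A_new = W @ K_new                  # current outputs for the new keys
    M = V_new @ A_new.T                # cross term between targets and outputs
    if K_old is not None:
        A_old = W @ K_old              # outputs that should stay unchanged
        M = M + lam * (A_old @ A_old.T)
    U, _, Vt = np.linalg.svd(M)        # classical Procrustes solution: Q = U V^T
    return U @ Vt


# Toy data (assumed shapes): the targets are reachable by an orthogonal edit.
rng = np.random.default_rng(0)
d, n = 5, 8
W = np.eye(d) + 0.1 * rng.standard_normal((d, d))   # toy "parameter matrix"
K_new = rng.standard_normal((d, n))                 # toy keys
R, _ = np.linalg.qr(rng.standard_normal((d, d)))    # ground-truth orthogonal map
V_new = R @ W @ K_new                               # toy target values

Q = procrustes_edit(W, K_new, V_new)
W_edited = Q @ W                                    # multiplicative (left) update
residual = np.linalg.norm(W_edited @ K_new - V_new)
print("Q is orthogonal:", np.allclose(Q.T @ Q, np.eye(d)))
print("fit residual:", residual)
```

Because both terms of the assumed objective are Procrustes terms in the same `Q`, their cross terms simply add, so one SVD still gives the minimizer. In this toy setup the targets were generated by an orthogonal map, so the recovered `Q` should match `R` up to floating-point error.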
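Experimental Results
Research Questions
- RQ1: Do orthogonal, multiplicative updates preserve numerical stability during sequential editing?
- RQ2: Under sequential and batch editing, does MOSE outperform additive methods in editing performance and general abilities?
- RQ3: How should the layers to edit be selected to maximize MOSE's effectiveness for knowledge updates?
Main Findings
| Method | Model | CounterFact Reliability | CounterFact Generalization | CounterFact Locality | ConceptEdit-Inter Reliability | ConceptEdit-Inter Generalization | ConceptEdit-Inter Locality |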
|---|---|---|---|---|---|---|---|
| ROME | LLama3-8B | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| MEMIT | LLama3-8B | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| RECT | LLama3-8B | 0.5266 | 0.3075 | 0.2382 | 0.3234 | 0.1993 | 0.1397 |
| EMMET | LLama3-8B | 0.6287 | 0.4695 | 0.3114 | 0.3866 | 0.2178 | 0.1563 |
| PRUNE | LLama3-8B | 0.7738 | 0.6899 | 0.5190 | 0.5682 | 0.4097 | 0.3083 |
| AlphaEdit | LLama3-8B | 0.8222 | 0.7835 | 0.7091 | 0.6981 | 0.5928 | 0.4977 |
| MOSE | LLama3-8B | 0.9887 | 0.9863 | 0.8972 | 0.7859 | 0.7275 | 0.6856 |
| ROME | Qwen2.5-7B | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| MEMIT | Qwen2.5-7B | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| RECT | Qwen2.5-7B | 0.6203 | 0.4745 | 0.3582 | 0.3737 | 0.2306 | 0.1738 |
| EMMET | Qwen2.5-7B | 0.6702 | 0.5589 | 0.4771 | 0.4593 | 0.2641 | 0.1903 |
| PRUNE | Qwen2.5-7B | 0.8115 | 0.7860 | 0.6823 | 0.6708 | 0.5009 | 0.4120 |
| AlphaEdit | Qwen2.5-7B | 0.9519 | 0.9241 | 0.8418 | 0.7346 | 0.6453 | 0.6116 |
| MOSE | Qwen2.5-7B | 0.9981 | 0.9902 | 0.9098 | 0.8012 | 0.7547 | 0.7069 |
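- MOSE preserves the numerical stability (norm and condition number) of the edited parameter matrix throughout consecutive edits.
- MOSE improves sequential editing performance by 12.08% over the additive baseline methods.
- MOSE retains 95.73% of the model's general abilities on downstream tasks.
- In batch-sequential editing, MOSE consistently outperforms the baselines and scales better.
- Layer-aware MOSE (editing the selected layers and their neighbors) achieves the best results in both sequential and batch settings.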
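The numerical-stability finding can be illustrated on toy data. The sketch below is not an experiment from the paper: the matrices are random, the additive edits are arbitrary rank-one updates with an assumed scale of 0.5, and the orthogonal factors are random rather than learned from knowledge. It only contrasts how the Frobenius norm and condition number behave under the two update shapes over many sequential edits.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_edits = 16, 50                      # toy dimensions (assumed)
W0 = rng.standard_normal((d, d))         # toy "original parameter matrix"
W_add, W_mul = W0.copy(), W0.copy()

for _ in range(n_edits):
    # Additive edit: arbitrary rank-one update (toy stand-in, not a real editor).
    u, v = rng.standard_normal(d), rng.standard_normal(d)
    W_add = W_add + 0.5 * np.outer(u, v)
    # Multiplicative edit: left-multiply by a random orthogonal matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    W_mul = Q @ W_mul


def drift(W):
    """Relative change in Frobenius norm and 2-norm condition number vs. W0."""
    norm_change = abs(np.linalg.norm(W) - np.linalg.norm(W0)) / np.linalg.norm(W0)
    cond_change = abs(np.linalg.cond(W) - np.linalg.cond(W0)) / np.linalg.cond(W0)
    return norm_change, cond_change


add_norm, add_cond = drift(W_add)
mul_norm, mul_cond = drift(W_mul)
print(f"additive:       norm drift {add_norm:.3e}, cond drift {add_cond:.3e}")
print(f"multiplicative: norm drift {mul_norm:.3e}, cond drift {mul_cond:.3e}")
```

In this setup the multiplicative drift should sit at floating-point noise level, while the additive norm accumulates with each edit. How large the additive drift is depends entirely on the toy update scale, so it says nothing about any specific editing method in the table above.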
This digest was generated by AI and reviewed by a human editor.