QUICK REVIEW

[Paper Review] When Shared Knowledge Hurts: Spectral Over-Accumulation in Model Merging

Yayuan Li, Ze Peng|arXiv (Cornell University)|Feb 5, 2026

Domain Adaptation and Few-Shot Learning | Cited by: 0

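One-Line Summary

The paper identifies spectral over-counting as a failure mode of model merging that arises when tasks share spectral directions, and proposes Singular Value Calibration (SVC). SVC is a data-free, training-free post-processing method that recalibrates the merged model by adjusting its singular values in spectral space.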

ABSTRACT

Model merging combines multiple fine-tuned models into a single model by adding their weight updates, providing a lightweight alternative to retraining. Existing methods primarily target resolving conflicts between task updates, leaving the failure mode of over-counting shared knowledge unaddressed. We show that when tasks share aligned spectral directions (i.e., overlapping singular vectors), a simple linear combination repeatedly accumulates these directions, inflating the singular values and biasing the merged model toward shared subspaces. To mitigate this issue, we propose Singular Value Calibration (SVC), a training-free and data-free post-processing method that quantifies subspace overlap and rescales inflated singular values to restore a balanced spectrum. Across vision and language benchmarks, SVC consistently improves strong merging baselines and achieves state-of-the-art performance. Furthermore, by modifying only the singular values, SVC improves the performance of Task Arithmetic by 13.0%. Code is available at: https://github.com/lyymuwu/SVC.
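
The inflation described in the abstract can be reproduced on a toy example. The sketch below is only an illustration under simplifying assumptions, not the paper's experimental setup: the dimensions `d` and `K` are hypothetical, every task update is built from one shared direction plus one private direction, and all directions are exactly orthonormal.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 32, 4  # hypothetical layer width and number of tasks

# Orthonormal left/right directions: column 0 is shared by every task,
# columns 1..K are task-specific.
U, _ = np.linalg.qr(rng.standard_normal((d, K + 1)))
V, _ = np.linalg.qr(rng.standard_normal((d, K + 1)))

# Each toy task update has unit strength along the shared direction
# and unit strength along its own private direction.
task_updates = [
    np.outer(U[:, 0], V[:, 0]) + np.outer(U[:, i], V[:, i])
    for i in range(1, K + 1)
]

merged = np.sum(task_updates, axis=0)  # plain weight-space addition

top_single = np.linalg.svd(task_updates[0], compute_uv=False)[0]
top_merged = np.linalg.svd(merged, compute_uv=False)[0]
inflation = top_merged / top_single
print(f"top singular value: one task {top_single:.3f}, merged {top_merged:.3f}")
```

In this construction the shared direction is added K times, so the top singular value of the merged update is K times that of any single task, while the private directions keep their original strength.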

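Motivation and Objectives

Motivate and analyze why merging fine-tuned task updates can degrade performance even when the updates appear to be aligned.
Characterize how aligned spectral directions cause shared knowledge to be over-counted and the top singular values to inflate.
Propose a data-free, training-free method that calibrates singular values to restore spectral balance after merging.
Show that SVC yields state-of-the-art gains across vision and language benchmarks.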

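Proposed Method

Represent each task as a task matrix DeltaW_i, the weight update relative to the pretrained backbone W_pre.
Obtain DeltaW_merge with a base merging method.
Compute the SVD of DeltaW_merge to obtain the shared column-space basis U and the singular values sigma.
For each subspace r, project DeltaW_i onto the left singular vector u^r to obtain a_r^i, and compute the projection coefficient s_i^r.
Aggregate s_i^r across tasks to form the calibration factor gamma^r and calibrate the corresponding singular value: tilde_sigma^r = gamma^r sigma^r, where gamma^r = K / sum_i max(alpha, s_i^r).
Reconstruct the calibrated merged update DeltaW_tilde_merge = sum_r tilde_sigma^r u^r (v^r)^T and output W_merge = W_pre + DeltaW_tilde_merge.
The method is data-free and training-free; it relies only on projections onto the merged spectral basis and a per-subspace calibration parameter alpha (default 1/K).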
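
The steps above can be sketched in NumPy for a single weight matrix. This is a minimal sketch, not the authors' implementation (see the repository linked in the abstract for that). The review does not spell out how s_i^r is computed, so the code assumes s_i^r = <a_merge^r, a_i^r> / ||a_i^r||^2, where a_i^r = (u^r)^T DeltaW_i and a_merge^r = (u^r)^T DeltaW_merge. The function name `svc_calibrate` and the `eps` guard are likewise hypothetical.

```python
import numpy as np

def svc_calibrate(task_updates, merged_update, alpha=None, eps=1e-12):
    """Rescale the singular values of a merged update (illustrative sketch)."""
    K = len(task_updates)
    alpha = 1.0 / K if alpha is None else alpha  # default from the review

    # SVD of the merged update: columns of U are the u^r, rows of Vt the v^r.
    U, sigma, Vt = np.linalg.svd(merged_update, full_matrices=False)

    # Row r holds the merged response along u^r, i.e. (u^r)^T DeltaW_merge.
    a_merge = U.T @ merged_update

    coeff_sum = np.zeros_like(sigma)
    for dw in task_updates:
        a_i = U.T @ dw  # row r is (u^r)^T DeltaW_i
        # ASSUMPTION: projection coefficient of the merged response onto the
        # task response; eps only guards against division by zero.
        s_i = np.sum(a_merge * a_i, axis=1) / (np.sum(a_i * a_i, axis=1) + eps)
        coeff_sum += np.maximum(alpha, s_i)

    gamma = K / coeff_sum                   # gamma^r = K / sum_i max(alpha, s_i^r)
    calibrated = (U * (gamma * sigma)) @ Vt  # sum_r gamma^r sigma^r u^r (v^r)^T
    return calibrated, gamma

# Toy check: K fully overlapping tasks merged by plain addition.
rng = np.random.default_rng(0)
K = 3
shared = rng.standard_normal((6, 5))
tasks = [shared.copy() for _ in range(K)]
merged = np.sum(tasks, axis=0)
calibrated, gamma = svc_calibrate(tasks, merged)
print(np.round(gamma, 3))
```

In this fully overlapping toy case every s_i^r equals K under the assumed definition, so each gamma^r comes out close to 1/K and the calibrated update matches a single task's update. The final model would then be W_pre plus the calibrated update, applied per weight matrix.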

Figure 1 : Shared knowledge accumulation in model merging. When merging task matrices ( $\Delta\mathbf{W}_{i}$ ) from multiple tasks, shared knowledge that aligns across tasks can be over-counted, resulting in singular-value inflation in the merged model’s spectrum. This inflation is concentrated in

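Experimental Results

Research Questions

RQ1: What causes performance to degrade during merging even when the updates are spectrally aligned?
RQ2: How does cross-task alignment between spectral subspaces contribute to singular-value inflation in the merged model?
RQ3: Can data-free, post-hoc calibration restore spectral balance and improve post-merge performance?
RQ4: Can calibrating only the singular values of the merged spectral basis achieve state-of-the-art results across vision and language tasks?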

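Key Findings

Spectral over-counting concentrates in the top spectral subspaces, inflating the top singular values and biasing the merged model toward shared directions.
Projection analysis shows that the merged response along each task's direction is over-amplified (s_i^r > 1) when other tasks contribute positively in the same subspace.
SVC uses the projection coefficients to quantify subspace overlap and rescales the inflated singular values to rebalance the spectrum.
Across vision benchmarks, SVC improves Task Arithmetic by 13.0% in that setting and also brings large gains to other merging baselines.
Across NLP benchmarks, SVC achieves state-of-the-art performance over multiple models and tasks, with improvements in both LLM and encoder-based settings.
Because SVC adjusts only the singular values and preserves the singular directions, it offers a lightweight, data-free post-processing solution.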
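
To make the role of s_i^r and alpha concrete, the short sketch below evaluates the calibration factor gamma^r = K / sum_i max(alpha, s_i^r) for a few hand-picked coefficient vectors. The coefficient values are invented for illustration and `calibration_factor` is a hypothetical helper name; only the formula and the default alpha = 1/K come from the review.

```python
import numpy as np

def calibration_factor(s, alpha=None):
    """gamma = K / sum_i max(alpha, s_i) for one subspace (illustrative)."""
    s = np.asarray(s, dtype=float)
    K = s.shape[0]
    alpha = 1.0 / K if alpha is None else alpha
    return K / np.sum(np.maximum(alpha, s))

# All coefficients equal to 1: no over-counting, singular value unchanged.
balanced = calibration_factor([1.0, 1.0, 1.0, 1.0])   # gamma = 1.0

# Every task sees a 4x amplified response: shrink the singular value 4x.
inflated = calibration_factor([4.0, 4.0, 4.0, 4.0])   # gamma = 0.25

# Negative or tiny coefficients are clamped to alpha = 1/4, which caps gamma at K.
clamped = calibration_factor([-2.0, 0.0, 0.1, 0.2])   # gamma = 4.0

print(balanced, inflated, clamped)
```

The clamp keeps the denominator at or above K * alpha, so with the default alpha the factor can never exceed K, however small or negative the coefficients are.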

Figure 2 : Discrepancy between original and calibrated singular values. For weight-space addition, we compare the original singular values $\sigma$ from $\mathrm{SVD}(\Delta\mathbf{W}_{\mathrm{merge}})$ with the calibrated values $\sigma^{\star}$ , where $\sigma^{\star}$ is obtained by first computi

This review was generated by AI and checked by a human editor.