QUICK REVIEW

[Paper Review] Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning

Yanpeng Sun, Qiang Chen|arXiv (Cornell University)|Jun 13, 2022

Domain Adaptation and Few-Shot Learning|Cited by 26

One-line summary

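SVF is a few-shot segmentation approach that fine-tunes only the singular values of the backbone (obtained via SVD); it improves generalization and achieves state-of-the-art results on Pascal-5i and COCO-20i.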

ABSTRACT

Freezing the pre-trained backbone has become a standard paradigm to avoid overfitting in few-shot segmentation. In this paper, we rethink the paradigm and explore a new regime: fine-tuning a small part of parameters in the backbone. We present a solution to overcome the overfitting problem, leading to better model generalization on learning novel classes. Our method decomposes backbone parameters into three successive matrices via the Singular Value Decomposition (SVD), then only fine-tunes the singular values and keeps others frozen. The above design allows the model to adjust feature representations on novel classes while maintaining semantic clues within the pre-trained backbone. We evaluate our Singular Value Fine-tuning (SVF) approach on various few-shot segmentation methods with different backbones. We achieve state-of-the-art results on both Pascal-5i and COCO-20i across 1-shot and 5-shot settings. Hopefully, this simple baseline will encourage researchers to rethink the role of backbone fine-tuning in few-shot settings. The source code and models will be available at https://github.com/syp2ysy/SVF.

Motivation and objectives

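Rethink the standard frozen-backbone paradigm in few-shot segmentation in order to improve generalization to novel classes.
Propose a lightweight fine-tuning method that adjusts semantic representations without destroying the cues learned during pre-training.
Show that fine-tuning the singular values yields better performance across multiple backbones and few-shot segmentation (FSS) methods.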

Proposed method

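Decompose the pre-trained backbone weights into three matrices via Singular Value Decomposition (SVD).
Freeze the U and V components and fine-tune only the singular values S (SVF).
Interpret S as a re-weighting of semantic cues, so that feature representations can adapt while the semantic cues themselves are preserved.
Express S as the product of a frozen part and a trainable part (S = S_frozen · S_trainable), which puts the fine-tuning mechanism on a firm footing.
Apply SVF to convolutional layers, which effectively recasts each layer as three successive operations: (i) a 3x3 convolution into a reduced subspace, (ii) scaling by S, and (iii) a 1x1 convolution that projects back. This creates a parameter-efficient fine-tuning space.
Freeze the BN parameters to avoid degradation, and compare SVF against full fine-tuning and layer-wise/convolution-wise fine-tuning on the PFENet and BAM baselines.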
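
The steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer shape, the flattening of the kernel into a (c_out, c_in·k·k) matrix, and the helper names `svf_weight` and `three_stage` are all assumptions made for illustration. It shows that setting S_trainable to ones reproduces the pre-trained weight exactly, and that applying the layer as "project with V, scale by S, project back with U" matches the original weight on a single flattened input patch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv weight: 8 output channels, 4 input channels, 3x3 kernel.
c_out, c_in, k = 8, 4, 3
weight = rng.standard_normal((c_out, c_in, k, k))

# Flatten each filter to a row, then decompose: W = U @ diag(S) @ Vt.
w_mat = weight.reshape(c_out, c_in * k * k)
u, s_frozen, vt = np.linalg.svd(w_mat, full_matrices=False)

# S = S_frozen * S_trainable. Starting the trainable factor at 1 leaves
# the pre-trained weight unchanged before any fine-tuning step.
s_trainable = np.ones_like(s_frozen)


def svf_weight(s_train):
    """Rebuild the conv weight from frozen U, Vt and rescaled singular values."""
    s = s_frozen * s_train
    return ((u * s) @ vt).reshape(c_out, c_in, k, k)


def three_stage(patch, s_train):
    """Apply the layer to one flattened input patch in three stages."""
    z = vt @ patch                 # (i) k x k conv into the rank-r subspace
    z = (s_frozen * s_train) * z   # (ii) per-channel scaling by S
    return u @ z                   # (iii) 1x1 conv projecting back to c_out


patch = rng.standard_normal(c_in * k * k)
identity_ok = np.allclose(svf_weight(s_trainable), weight)
stages_ok = np.allclose(three_stage(patch, s_trainable), w_mat @ patch)

# Changing only one singular-value scale already changes the effective weight.
s_new = s_trainable.copy()
s_new[0] = 1.5
changed = not np.allclose(svf_weight(s_new), weight)
```

In an actual training loop only `s_trainable` would receive gradients, while `u`, `s_frozen`, and `vt` stay fixed; an autodiff framework would be needed for that part, which this sketch leaves out.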

Experimental results

Research questions

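RQ1: Does fine-tuning only a very small fraction of the backbone parameters with SVF outperform freezing the backbone in few-shot segmentation?
RQ2: How does SVF affect generalization on the 1-shot and 5-shot tasks of Pascal-5i and COCO-20i?
RQ3: Which backbone layers and which subspaces (U, S, V) contribute most to the performance gains from SVF?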

Key findings

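SVF matches or improves on state-of-the-art performance on Pascal-5i and COCO-20i in the 1-shot and 5-shot settings, across multiple baselines (PFENet, BAM) and backbones (VGG-16, ResNet-50).
Fine-tuning only the singular values S improves generalization and avoids the overfitting seen with full-backbone or partial-layer fine-tuning.
Fine-tuning the S subspace (especially in layers 3 and 4) yields the largest gains, whereas tuning only U or only V can degrade performance.
BN layers should be kept frozen when using SVF, to maintain stability and performance.
SVF consistently outperforms other parameter-efficient fine-tuning methods such as adapters and bias tuning.
Visual analysis shows that SVF shifts the model's focus toward foreground cues and reduces its reliance on noisy background regions.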
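
To give a rough sense of scale for "few parameters", the sketch below counts trainable weights for a single convolution under full fine-tuning versus singular-value-only fine-tuning. The layer shape is hypothetical, and the count assumes the kernel is flattened to a (c_out, c_in·k·k) matrix before the SVD, so there is one singular value per rank. Biases and BN parameters are ignored. None of these numbers come from the paper.

```python
def conv_param_counts(c_out, c_in, k):
    """Return (full, svf) trainable-weight counts for one conv layer.

    full: every kernel weight is trainable.
    svf:  only the singular values of the flattened (c_out, c_in*k*k)
          weight matrix are trainable, i.e. min(c_out, c_in*k*k) scalars.
    """
    full = c_out * c_in * k * k
    svf = min(c_out, c_in * k * k)
    return full, svf


# Hypothetical layer shape, chosen only for illustration.
full, svf = conv_param_counts(c_out=256, c_in=256, k=3)
ratio = svf / full  # fraction of the layer's weights that SVF would train
```

For this example shape, the singular values amount to 256 of 589,824 weights, or one in 2,304.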


This review was generated by AI and checked by a human editor.