QUICK REVIEW

[Paper Review] PureCC: Pure Learning for Text-to-Image Concept Customization

Zhiwu Liao, Xiaole Xian|arXiv (Cornell University)|Mar 8, 2026

Domain Adaptation and Few-Shot Learning | Citations: 0

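One-Line Summary

PureCC introduces a decoupled learning objective and a dual-branch pipeline for learning personalized concepts in text-to-image generation while preserving the original model's behavior and capabilities. A frozen representation extractor, combined with an adaptive guidance scale, enables pure concept customization.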

ABSTRACT

Existing concept customization methods have achieved remarkable outcomes in high-fidelity and multi-concept customization. However, they often neglect the influence on the original model's behavior and capabilities when learning new personalized concepts. To address this issue, we propose PureCC. PureCC introduces a novel decoupled learning objective for concept customization, which combines the implicit guidance of the target concept with the original conditional prediction. This separated form enables PureCC to substantially focus on the original model during training. Moreover, based on this objective, PureCC designs a dual-branch training pipeline that includes a frozen extractor providing purified target concept representations as implicit guidance and a trainable flow model producing the original conditional prediction, jointly achieving pure learning for personalized concepts. Furthermore, PureCC introduces a novel adaptive guidance scale $λ^\star$ to dynamically adjust the guidance strength of the target concept, balancing customization fidelity and model preservation. Extensive experiments show that PureCC achieves state-of-the-art performance in preserving the original behavior and capabilities while enabling high-fidelity concept customization. The code is available at https://github.com/lzc-sg/PureCC.

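Motivation and Objectives

Customize concepts without degrading the original model's behavior and capabilities.
Decouple the target-concept guidance from the original model's prediction during fine-tuning.
Develop a dual-branch training pipeline with a frozen extractor and a trainable predictor.
Introduce layer-wise tunable embeddings to better represent the target concept.
Propose an adaptive guidance scale that balances fidelity and preservation.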

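Proposed Method

Use a representation extractor based on a pretrained flow-based model fine-tuned with LoRA on the customization set.
Introduce layer-wise tunable concept embeddings that replace [V] in the prompt embedding at each layer.
Formulate the decoupled learning objective v_t^PureCC = v_t^original + lambda * v_t^target.
Define v_t^target as the representation bias from the frozen extractor (the difference between the target-text condition and the null condition).
Train a dual-branch pipeline in which the frozen representation extractor provides implicit guidance and the trainable flow-based model predicts the original conditional output.
Compute the adaptive guidance scale lambda* by projecting the trainable representation onto the target guidance representation, balancing fidelity and preservation.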
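
The decoupled objective and the projection-based scale described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the velocity predictions are stand-in arrays, and the exact form of lambda* is an assumption (here, the least-squares coefficient of the trainable branch's representation along the target guidance direction). The review states only that lambda* comes from a projection.

```python
import numpy as np


def representation_bias(v_text, v_null):
    """Target guidance v_t^target: the frozen extractor's output under the
    target text minus its output under the null condition."""
    return v_text - v_null


def projection_scale(v_trainable, v_target, eps=1e-8):
    """ASSUMPTION: lambda* is taken as the scalar projection coefficient of
    the trainable representation onto the target guidance direction,
    <v_trainable, v_target> / <v_target, v_target>."""
    num = float(np.sum(v_trainable * v_target))
    den = float(np.sum(v_target * v_target)) + eps  # eps avoids division by zero
    return num / den


def purecc_target(v_original, v_target, lam):
    """Decoupled objective: v_t^PureCC = v_t^original + lambda * v_t^target."""
    return v_original + lam * v_target


# Toy usage with hypothetical 2-D "velocity" vectors.
v_target = representation_bias(np.array([2.0, 0.0]), np.array([1.0, 0.0]))
lam = projection_scale(np.array([3.0, 4.0]), v_target)
v_purecc = purecc_target(np.array([0.0, 1.0]), v_target, lam)
```

Keeping the guidance term separate from the original conditional prediction lets the scale be recomputed at every training step, which is the property the adaptive scale relies on.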

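Experimental Results

Research Questions

RQ1: How can a concept be customized with minimal impact on the original model's behavior?
RQ2: Do the decoupled objective and the dual-branch architecture enable pure learning of the target concept?
RQ3: How should the guidance strength be adapted during training to balance personalization and preservation?
RQ4: Do layer-wise concept embeddings improve the representation of the target concept and the subsequent fine-tuning?
RQ5: How does PureCC affect instance and style concept customization in terms of fidelity and model preservation?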

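Key Findings

PureCC enables high-fidelity concept customization while achieving state-of-the-art preservation of the original model's behavior.
The dual-branch design, with a frozen extractor and a trainable predictor, preserves the original capabilities and provides effective implicit guidance.
The adaptive guidance scale lambda* balances concept fidelity and model preservation and outperforms fixed-scale approaches.
A representation extractor with layer-wise tunable embeddings yields a richer representation of the target concept.
PureCC performs well in both single-concept and multi-concept customization, including style-instance mixing.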
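
To make the layer-wise tunable embeddings concrete, here is a rough NumPy sketch of replacing the [V] placeholder with a different concept vector at each layer. The function name, array shapes, and the idea of materializing one prompt-embedding copy per layer are illustrative assumptions, not details confirmed by the paper or its code.

```python
import numpy as np


def inject_layerwise_concept(prompt_emb, concept_embs, placeholder_idx):
    """Build one prompt embedding per layer with the [V] row replaced.

    prompt_emb:      (num_tokens, dim) shared prompt embedding.
    concept_embs:    (num_layers, dim) hypothetical tunable concept
                     embeddings, one vector per layer.
    placeholder_idx: token position of [V] in the prompt.
    """
    num_layers, dim = concept_embs.shape
    if prompt_emb.shape[1] != dim:
        raise ValueError("embedding dimensions do not match")
    # np.repeat returns a new array, so the shared prompt is left untouched.
    per_layer = np.repeat(prompt_emb[None, :, :], num_layers, axis=0)
    # Overwrite the placeholder row of layer l with concept_embs[l].
    per_layer[:, placeholder_idx, :] = concept_embs
    return per_layer


# Toy usage: a 3-token prompt, 2 layers, 2-D embeddings, [V] at position 1.
prompt = np.zeros((3, 2))
concepts = np.array([[1.0, 1.0], [2.0, 2.0]])
layered = inject_layerwise_concept(prompt, concepts, placeholder_idx=1)
```

In a real model only `concept_embs` would be trainable; the other prompt tokens stay identical across layers.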

This review was generated by AI and checked by a human editor.