QUICK REVIEW

[Paper Review] DiffIR: Efficient Diffusion Model for Image Restoration

Bin Xia, Yulun Zhang|arXiv (Cornell University)|Mar 16, 2023

Image and Signal Denoising Methods|Cited by 16

One-line summary

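DiffIR applies diffusion models to image restoration using a compact IR prior representation and a two-stage training scheme, achieving state-of-the-art results with far fewer iterations and much less computation than prior DM-based IR methods.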

ABSTRACT

Diffusion model (DM) has achieved SOTA performance by modeling the image synthesis process into a sequential application of a denoising network. However, different from image synthesis, image restoration (IR) has a strong constraint to generate results in accordance with ground-truth. Thus, for IR, traditional DMs running massive iterations on a large model to estimate whole images or feature maps is inefficient. To address this issue, we propose an efficient DM for IR (DiffIR), which consists of a compact IR prior extraction network (CPEN), dynamic IR transformer (DIRformer), and denoising network. Specifically, DiffIR has two training stages: pretraining and training DM. In pretraining, we input ground-truth images into CPEN$_{S1}$ to capture a compact IR prior representation (IPR) to guide DIRformer. In the second stage, we train the DM to directly estimate the same IPR as pretrained CPEN$_{S1}$ only using LQ images. We observe that since the IPR is only a compact vector, DiffIR can use fewer iterations than traditional DM to obtain accurate estimations and generate more stable and realistic results. Since the iterations are few, our DiffIR can adopt a joint optimization of CPEN$_{S2}$, DIRformer, and denoising network, which can further reduce the estimation error influence. We conduct extensive experiments on several IR tasks and achieve SOTA performance while consuming less computational costs. Code is available at https://github.com/Zj-BinXia/DiffIR.

Motivation and objectives

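Motivate an efficient use of diffusion models for image restoration, where most of the input pixels are already given.
Develop a compact IR prior representation (IPR) that guides restoration.
Propose a two-stage training scheme for IR that leverages CPEN and diffusion.
Enable joint optimization of CPEN$_{S2}$, DIRformer, and the denoising network to reduce estimation error.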

Proposed method

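Introduce CPEN to extract a compact IR prior representation from ground-truth images.
Propose the Dynamic IRformer (DIRformer), built from DMTA and DGFN, which uses the IPR for restoration.
In Stage 1, jointly optimize CPEN$_{S1}$ and DIRformer with a reconstruction loss.
In Stage 2, train a diffusion model to estimate the IPR from low-quality images, using the compact latent vector and joint optimization.
Within the diffusion framework, use CPEN$_{S2}$ and the denoising network to iteratively refine the IPR, then restore the image.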
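
The Stage 2 estimation step can be illustrated with a small sketch. It runs a few reverse diffusion steps over a compact vector instead of a full image, using the standard DDPM posterior-mean update with no variance noise added. Everything here is an assumption made for illustration: the function names, the 4-step linear schedule, and in particular `make_oracle`, a stand-in for a trained denoising network that "cheats" by knowing the target vector. It is not the authors' implementation.

```python
import math
import random


def make_schedule(num_steps, beta_start=0.1, beta_end=0.99):
    """Return (alphas, alpha_bars) for a linear beta schedule.

    The step count and beta range are illustrative assumptions.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    if num_steps == 1:
        betas = [beta_end]
    else:
        step = (beta_end - beta_start) / (num_steps - 1)
        betas = [beta_start + step * i for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a  # cumulative product of alphas up to this step
        alpha_bars.append(running)
    return alphas, alpha_bars


def reverse_estimate(z_T, predict_noise, alphas, alpha_bars):
    """Run the reverse chain on a compact vector, without variance noise.

    Each step applies the DDPM posterior mean:
        z_{t-1} = (z_t - (1 - alpha_t) / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
    """
    z = list(z_T)
    for t in reversed(range(len(alphas))):
        eps = predict_noise(z, t)
        coef = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
        z = [(zi - coef * ei) / math.sqrt(alphas[t]) for zi, ei in zip(z, eps)]
    return z


def make_oracle(z0, alpha_bars):
    """Hypothetical perfect denoiser: returns the noise implied by the known z0.

    A real denoising network would be conditioned on features of the
    low-quality image instead of having access to z0.
    """
    def predict_noise(z, t):
        a = alpha_bars[t]
        return [(zi - math.sqrt(a) * z0i) / math.sqrt(1.0 - a)
                for zi, z0i in zip(z, z0)]
    return predict_noise


if __name__ == "__main__":
    rng = random.Random(0)
    target_ipr = [0.5, -1.0, 2.0, 0.25]           # stand-in for the Stage 1 IPR
    alphas, alpha_bars = make_schedule(num_steps=4)
    z_T = [rng.gauss(0.0, 1.0) for _ in target_ipr]  # start from Gaussian noise
    estimate = reverse_estimate(z_T, make_oracle(target_ipr, alpha_bars),
                                alphas, alpha_bars)
    print(max(abs(e - z) for e, z in zip(estimate, target_ipr)))
```

Because the vector is short and the chain has only a handful of steps, the whole loop is cheap to run end to end, which is what makes joint optimization with the restoration network practical.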

Figure 1: The Mult-Adds are measured on 256 $\times$ 256 inputs. Our DiffIR achieves SOTA performance on IR tasks. Notably, LDM [50] and RePaint [40] are DM-based methods, and DiffIR is 1000 $\times$ more efficient than RePaint while achieving better performance.

Experimental results

Research questions

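RQ1: Can a diffusion model operate effectively on a compact IR prior vector, rather than the full image, for IR tasks?
RQ2: Does two-stage training (guidance from ground truth, then guidance from low-quality input) improve restoration quality and stability?
RQ3: Does joint optimization of CPEN$_{S2}$, DIRformer, and the denoising network reduce error propagation and artifacts?
RQ4: How does DiffIR perform on inpainting, super-resolution, and motion deblurring compared with state-of-the-art DM-based IR methods?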

Key findings

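DiffIR achieves state-of-the-art performance on multiple IR tasks while needing considerably fewer iterations and less computation.
The compact IPR produced by CPEN enables effective restoration with a lightweight DIRformer.
Joint optimization of CPEN$_{S2}$, DIRformer, and the denoising network mitigates the effect of estimation error on restoration quality.
In the experiments, DiffIR is far more efficient than RePaint and LDM, and it outperforms several DM-based baselines on inpainting, SR, and deblurring.
Ablation studies show the benefit of the DiffIR Stage 2 design, of the joint training scheme, and of estimating the IPR without the variance noise of the reverse DM process.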
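
To make the joint-optimization finding concrete, here is a minimal sketch of a combined Stage 2 objective. It assumes both terms are mean absolute (L1) errors and that they are simply added; the review does not state the exact loss, and the function names and the `diff_weight` parameter are hypothetical.

```python
def l1_mean(a, b):
    """Mean absolute error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def stage2_joint_loss(restored, ground_truth, ipr_estimated, ipr_target,
                      diff_weight=1.0):
    """Combine a restoration term with an IPR-estimation term.

    restored / ground_truth: flattened pixel values of the output and target.
    ipr_estimated: IPR produced by the diffusion chain from the LQ input.
    ipr_target: IPR produced by the pretrained Stage 1 extractor.
    diff_weight: hypothetical balancing factor (assumed 1.0 here).
    """
    rec = l1_mean(restored, ground_truth)
    diff = l1_mean(ipr_estimated, ipr_target)
    return rec + diff_weight * diff, rec, diff


if __name__ == "__main__":
    total, rec, diff = stage2_joint_loss(
        restored=[1.0, 2.0, 3.0],
        ground_truth=[1.0, 2.0, 5.0],
        ipr_estimated=[0.0, 0.0],
        ipr_target=[0.5, -0.5],
    )
    print(total, rec, diff)
```

Training against the sum means an imperfect IPR estimate is penalized both directly and through its effect on the restored image, so the restoration network can learn to tolerate small estimation errors.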

Figure 2: The overview of the proposed DiffIR, which consists of DIRformer, CPEN, and denoising network. DiffIR has two training stages: (a) In the first stage, CPEN$_{S1}$ takes the ground-truth image as input and outputs an IPR $\mathbf{Z}$ to guide DIRformer to restore images. We optimize the CPEN$_{S1}$

This review was generated by AI and checked by a human editor.