QUICK REVIEW

[Paper Review] Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Guanhua Zhang, Jiabao Ji|arXiv (Cornell University)|Apr 6, 2023

Generative Adversarial Networks and Image Synthesis | Cited by 17

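One-line summary

CoPaint-TT is a Bayesian inpainting method that coherently updates both the revealed and the unrevealed regions during diffusion-based inpainting, reducing mismatches with the reference image and improving coherence and quality.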

ABSTRACT

Image inpainting refers to the task of generating a complete, natural image based on a partially revealed reference image. Recently, many research interests have been focused on addressing this problem using fixed diffusion models. These approaches typically directly replace the revealed region of the intermediate or final generated images with that of the reference image or its variants. However, since the unrevealed regions are not directly modified to match the context, it results in incoherence between revealed and unrevealed regions. To address the incoherence problem, a small number of methods introduce a rigorous Bayesian framework, but they tend to introduce mismatches between the generated and the reference images due to the approximation errors in computing the posterior distributions. In this paper, we propose COPAINT, which can coherently inpaint the whole image without introducing mismatches. COPAINT also uses the Bayesian framework to jointly modify both revealed and unrevealed regions, but approximates the posterior distribution in a way that allows the errors to gradually drop to zero throughout the denoising steps, thus strongly penalizing any mismatches with the reference image. Our experiments verify that COPAINT can outperform the existing diffusion-based methods under both objective and subjective metrics. The codes are available at https://github.com/UCSB-NLP-Chang/CoPaint/.

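Motivation and objectives

Motivate coherent image inpainting with diffusion models, so that the revealed and unrevealed regions do not end up inconsistent with each other.
Propose a Bayesian framework that updates all image regions jointly during the diffusion process without introducing mismatches.
Develop computationally feasible algorithms (CoPaint and CoPaint-TT) that minimize the inpainting error across the denoising steps.
Demonstrate improved coherence and quality over existing diffusion-based inpainting methods on CelebA-HQ and ImageNet.
Provide practical variants and analyses (including time travel) for balancing quality and efficiency.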

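Proposed method

Adopts a fixed, pretrained diffusion model and reformulates inpainting as posterior sampling under the constraint that the revealed region matches the reference.
Derives an approximate posterior that enforces the inpainting constraint through a Gaussian likelihood centered on the one-step generation, which makes the optimization tractable.
Introduces the one-step generation f_theta^(t)(X_t) as an approximation of the final generated image, reducing the computational cost.
Describes the corrected denoising algorithm (CoPaint), which optimizes X_T to satisfy the inpainting constraint, regularized by the prior.
Adds further designs, including multi-step approximation and time travel (CoPaint-TT), that progressively reduce the approximation error during denoising and improve accuracy.
Provides a greedy sampling procedure from the approximate posterior so that the final X_0 satisfies the inpainting constraint and the error vanishes at the end.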
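
The one-step generation and the Gaussian constraint on the revealed region can be illustrated with a minimal NumPy sketch. It assumes the standard DDIM estimate of the clean image, (X_t - sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_bar_t), as the form of f_theta^(t)(X_t). The names `one_step_estimate`, `masked_constraint_loss`, `alpha_bar_t`, and `xi_t` are hypothetical, and the noise prediction is passed in directly instead of coming from a trained network. This is not the authors' implementation.

```python
import numpy as np

def one_step_estimate(x_t, eps_pred, alpha_bar_t):
    """Estimate the clean image from a noisy sample x_t in one step.

    Assumption: the standard DDIM prediction of x_0 given a noise estimate.
    """
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def masked_constraint_loss(x0_est, reference, revealed_mask, xi_t):
    """Negative log of a Gaussian likelihood (up to a constant) on revealed pixels.

    A smaller xi_t penalizes mismatches with the reference more strongly.
    """
    diff = revealed_mask * (x0_est - reference)
    return float(np.sum(diff ** 2) / (2.0 * xi_t ** 2))

# Toy data: a random "image", known noise, and a mask revealing the left half.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(8, 8))
eps = rng.normal(size=(8, 8))
alpha_bar_t = 0.5
x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
mask = np.zeros((8, 8))
mask[:, :4] = 1.0

# With a perfect noise prediction the one-step estimate recovers x0 exactly.
perfect = one_step_estimate(x_t, eps, alpha_bar_t)
loss_perfect = masked_constraint_loss(perfect, x0, mask, xi_t=0.1)

# With a poor noise prediction (all zeros) the revealed region mismatches.
poor = one_step_estimate(x_t, np.zeros_like(eps), alpha_bar_t)
loss_poor = masked_constraint_loss(poor, x0, mask, xi_t=0.1)
```

In a full method, the gradient of this loss (plus a prior term) with respect to the noisy sample would drive the update. That step needs an automatic differentiation framework and a trained denoiser, so it is omitted here.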

Figure 2: The trajectory of the gap between $\bm{f}_{\theta}^{(t)}(\tilde{\bm{X}}_{t})$ and $\tilde{\bm{X}}_{0}$ along the unconditional diffusion denoising process. We report the pixel-wise averaged Euclidean distance between the two.
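
The distance reported in Figure 2 can be read in more than one way. The sketch below shows one plausible interpretation: the Euclidean norm across channels at each pixel, averaged over all pixels. The function name `pixelwise_avg_euclidean` and the (H, W, C) layout are assumptions, not details taken from the paper.

```python
import numpy as np

def pixelwise_avg_euclidean(a, b):
    """Mean over pixels of the per-pixel Euclidean distance across channels.

    Assumption: a and b are arrays of shape (H, W, C).
    """
    return float(np.mean(np.linalg.norm(a - b, axis=-1)))

# Every pixel differs by the vector (3, 4, 0), whose Euclidean norm is 5.
a = np.zeros((2, 2, 3))
b = np.zeros((2, 2, 3))
b[..., 0] = 3.0
b[..., 1] = 4.0
gap = pixelwise_avg_euclidean(a, b)
```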

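Experimental results

Research questions

RQ1: Can a Bayesian framework coherently modify both the revealed and unrevealed regions during diffusion-based inpainting without introducing mismatches?
RQ2: How can posterior sampling be made tractable when enforcing the inpainting constraint, and how does the one-step generation help control the final output?
RQ3: Do coherence-focused methods such as CoPaint and CoPaint-TT outperform existing diffusion-based inpainting baselines on standard datasets?
RQ4: How do the additional designs (multi-step approximation, time travel) affect inpainting quality and efficiency?
RQ5: Can the computational cost be reduced relative to prior methods while maintaining or improving inpainting quality?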

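Key findings

CoPaint and its variant CoPaint-TT improve inpainting quality and coherence over several diffusion-based baselines on CelebA-HQ and ImageNet.
Compared with RePaint, CoPaint-TT markedly reduces the average LPIPS on the evaluated datasets (a relative reduction of about 19%), and a reduced computational budget is reported on ImageNet.
The one-step generation approach enables a tractable approximation that progressively tightens the inpainting constraint as denoising proceeds; in the ideal setting, the approximation error can reach zero at the final step.
Adding time travel and multi-step approximation further reduces the approximation error in the early steps and can improve self-consistency and sample quality.
Across the experiments, the CoPaint variants are competitive with the baselines in subjective human evaluation, and CoPaint-TT obtains favorable results in coherence-focused evaluation.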
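
For readers unfamiliar with how a relative reduction such as "about 19%" is computed, here is a short sketch. The LPIPS values in it are hypothetical placeholders chosen only to make the arithmetic visible. They are not numbers from the paper.

```python
def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

# Hypothetical placeholder scores (lower LPIPS is better); not from the paper.
baseline_lpips = 0.100
improved_lpips = 0.081
reduction = relative_reduction(baseline_lpips, improved_lpips)
print(f"relative reduction: {reduction:.0%}")
```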

Figure 3: Time-performance trade-off on CelebA-HQ (left) and ImageNet (right). The x-axis indicates the average time ($\downarrow$) to process one image, and the y-axis is the average LPIPS ($\downarrow$).

This review was generated by AI and checked by a human editor.