QUICK REVIEW

[Paper Review] Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Guanhua Zhang, Jiabao Ji|arXiv (Cornell University)|Apr 6, 2023

Generative Adversarial Networks and Image Synthesis | Cited by 17

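One-Sentence Summary

CoPaint, together with its time-travel variant CoPaint-TT, is a Bayesian-guided inpainting method that updates the revealed and unrevealed regions consistently during diffusion-based inpainting, reducing mismatches with the reference image and improving coherence and quality.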

ABSTRACT

Image inpainting refers to the task of generating a complete, natural image based on a partially revealed reference image. Recently, many research interests have been focused on addressing this problem using fixed diffusion models. These approaches typically directly replace the revealed region of the intermediate or final generated images with that of the reference image or its variants. However, since the unrevealed regions are not directly modified to match the context, it results in incoherence between revealed and unrevealed regions. To address the incoherence problem, a small number of methods introduce a rigorous Bayesian framework, but they tend to introduce mismatches between the generated and the reference images due to the approximation errors in computing the posterior distributions. In this paper, we propose COPAINT, which can coherently inpaint the whole image without introducing mismatches. COPAINT also uses the Bayesian framework to jointly modify both revealed and unrevealed regions, but approximates the posterior distribution in a way that allows the errors to gradually drop to zero throughout the denoising steps, thus strongly penalizing any mismatches with the reference image. Our experiments verify that COPAINT can outperform the existing diffusion-based methods under both objective and subjective metrics. The codes are available at https://github.com/UCSB-NLP-Chang/CoPaint/.

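Motivation and Objectives

Advance coherent image inpainting with diffusion models and avoid inconsistency between the revealed and unrevealed regions.
Propose a Bayesian framework that jointly updates all image regions during the diffusion process without introducing mismatches.
Develop computationally tractable algorithms (CoPaint and CoPaint-TT) that minimize inpainting error over the denoising steps.
Demonstrate improved coherence and quality over existing diffusion-based inpainting methods on CelebA-HQ and ImageNet.
Provide practical variants and analysis (including time travel) that balance quality against efficiency.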

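Proposed Method

Uses a fixed, pretrained diffusion model and formulates inpainting as posterior sampling under the constraint that the revealed region matches the reference image.
Derives an approximate posterior in which the inpainting constraint is enforced through a Gaussian likelihood centered on the one-step generation, which makes the optimization tractable.
Introduces the one-step generation f_theta^(t)(X_t) to approximate the final generation and reduce computation.
Describes a successive-correction denoising algorithm (CoPaint) that optimizes X_T to satisfy the inpainting constraint while regularizing it with the prior.
Improves accuracy with additional designs such as multi-step approximation and time travel (CoPaint-TT), so that the approximation error decreases gradually over the denoising process.
Provides a greedy sampling procedure that draws the final X_0 from the approximate posterior, so that the inpainting constraint is met without error at the last step.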
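The one-step estimate and the constrained update described in the list above can be illustrated with a toy NumPy sketch. This is not the authors' implementation: `toy_eps` is a hypothetical linear stand-in for the pretrained noise predictor, the regularizer is assumed to be a Gaussian centered on the starting X_t, and `xi`, `sigma`, `lr`, and `refine_x_t` are illustrative names and values. Because the toy predictor acts pixel by pixel, only revealed pixels move here; with a real network the gradient would also flow to unrevealed pixels, which is the behavior the method relies on.

```python
import numpy as np

TOY_C = 0.1  # slope of the hypothetical linear noise predictor

def one_step_estimate(x_t, eps_pred, alpha_bar_t):
    """Estimate the clean image from noisy x_t in a single DDIM-style step."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def toy_eps(x_t):
    """Hypothetical stand-in for the pretrained noise predictor (assumption)."""
    return TOY_C * x_t

def inpaint_loss(x_t, anchor, ref, mask, alpha_bar_t, xi=0.1, sigma=1.0):
    """Masked Gaussian-likelihood term plus a Gaussian regularizer around anchor."""
    x0_hat = one_step_estimate(x_t, toy_eps(x_t), alpha_bar_t)
    data_term = np.sum((mask * (ref - x0_hat)) ** 2) / (2.0 * xi ** 2)
    prior_term = np.sum((x_t - anchor) ** 2) / (2.0 * sigma ** 2)
    return data_term + prior_term

def inpaint_grad(x_t, anchor, ref, mask, alpha_bar_t, xi=0.1, sigma=1.0):
    """Analytic gradient of inpaint_loss; valid only for the linear toy_eps."""
    # For the linear toy predictor the one-step estimate is simply k * x_t.
    k = (1.0 - TOY_C * np.sqrt(1.0 - alpha_bar_t)) / np.sqrt(alpha_bar_t)
    residual = mask * (ref - k * x_t)
    return -k * mask * residual / xi ** 2 + (x_t - anchor) / sigma ** 2

def refine_x_t(x_t, ref, mask, alpha_bar_t, steps=20, lr=0.005):
    """Gradient descent on x_t so its one-step estimate matches ref where mask == 1."""
    anchor = x_t.copy()
    x = x_t.copy()
    for _ in range(steps):
        x = x - lr * inpaint_grad(x, anchor, ref, mask, alpha_bar_t)
    return x

# Usage: the left half of an 8x8 "image" is revealed (mask == 1).
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 8))
mask = np.zeros((8, 8))
mask[:, :4] = 1.0
x_t = rng.normal(size=(8, 8))
x_new = refine_x_t(x_t, ref, mask, alpha_bar_t=0.5)
```

In a real setting the analytic gradient would be replaced by automatic differentiation through the network, and the step size would need tuning; the value used here is chosen only so that the toy quadratic objective converges.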

Figure 2: The trajectory of the gap between $\bm{f}_{\theta}^{(t)}(\tilde{\bm{X}}_{t})$ and $\tilde{\bm{X}}_{0}$ along the unconditional diffusion denoising process. We report the pixel-wise averaged Euclidean distance between the two.

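Experimental Results

Research Questions

RQ1: Can a Bayesian framework be used in diffusion-based inpainting to modify the revealed and unrevealed regions consistently, without introducing mismatches?
RQ2: How can posterior sampling be made tractable while enforcing the inpainting constraint, and how does the one-step generation help control the final output?
RQ3: Do coherence-focused methods such as CoPaint and CoPaint-TT outperform existing baselines on standard datasets?
RQ4: How do the additional designs (multi-step approximation, time travel) affect inpainting quality and efficiency?
RQ5: Can inpainting quality be preserved or improved with less computation than prior methods?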

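Key Findings

CoPaint and its variant CoPaint-TT achieve better inpainting quality and coherence than several baselines on CelebA-HQ and ImageNet.
Compared with RePaint, CoPaint-TT shows a substantial reduction in average LPIPS on the evaluated datasets (about 19% relative) and reports a lower computational budget on ImageNet.
The one-step generation yields a tractable approximation that tightens the inpainting constraint gradually over the denoising process; under ideal conditions the approximation error reaches zero at the last step.
Adding time travel and multi-step approximation further reduces the approximation error in early steps and improves self-consistency and sample quality.
Across the experiments, the CoPaint variants are competitive with the baselines in subjective human evaluation, and CoPaint-TT obtains favorable results in coherence-oriented evaluation.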
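As a rough illustration of the time travel idea mentioned above, the sketch below builds a timestep schedule in which each block of denoising steps is replayed after jumping back to a noisier timestep. The function name `time_travel_schedule` and the parameters `travel_len` and `repeats` are hypothetical; the exact schedule used by CoPaint-TT may differ, so this should be read as one plausible instance of the general technique rather than the paper's procedure.

```python
def time_travel_schedule(num_steps, travel_len, repeats):
    """Return a list of (from_t, to_t) moves.

    A move with to_t < from_t is a denoising step; a move with
    to_t > from_t is a re-noising jump back to a noisier timestep.
    """
    moves = []
    start = num_steps
    while start > 0:
        end = max(start - travel_len, 0)
        for r in range(repeats):
            # Denoise one step at a time from start down to end.
            moves.extend((t, t - 1) for t in range(start, end, -1))
            # Jump back and replay the block, except after the final pass.
            if r < repeats - 1:
                moves.append((end, start))
        start = end
    return moves

# Usage: 4 timesteps, blocks of 2 steps, each block run twice.
schedule = time_travel_schedule(num_steps=4, travel_len=2, repeats=2)
```

With `repeats=1` the schedule reduces to the plain descending sequence of denoising steps, which makes the extra cost of time travel explicit: the number of denoising moves scales with `repeats`.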

Figure 3: Time-performance trade-off on CelebA-HQ (left) and ImageNet (right). The x-axis indicates the average time ($\downarrow$) to process one image, and the y-axis is the average LPIPS ($\downarrow$).

This review was generated by AI and checked by a human editor.