[Paper Summary] RGI: Robust GAN-inversion for mask-free image inpainting and unsupervised pixel-wise anomaly detection
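RGI is a robust GAN-inversion framework that restores the clean image and identifies the corrupted region at the same time, without knowing the mask in advance, and comes with theoretical guarantees. A relaxed variant (R-RGI) additionally fine-tunes the generator to narrow the gap between the GAN-learned manifold and the true image manifold.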
Generative adversarial networks (GANs), trained on a large-scale image dataset, can be a good approximator of the natural image manifold. GAN-inversion, using a pre-trained generator as a deep generative prior, is a promising tool for image restoration under corruptions. However, the performance of GAN-inversion can be limited by a lack of robustness to unknown gross corruptions, i.e., the restored image might easily deviate from the ground truth. In this paper, we propose a Robust GAN-inversion (RGI) method with a provable robustness guarantee to achieve image restoration under unknown gross corruptions, where a small fraction of pixels are completely corrupted. Under mild assumptions, we show that the restored image and the identified corrupted region mask converge asymptotically to the ground truth. Moreover, we extend RGI to Relaxed-RGI (R-RGI) for generator fine-tuning to mitigate the gap between the GAN learned manifold and the true image manifold while avoiding trivial overfitting to the corrupted input image, which further improves the image restoration and corrupted region mask identification performance. The proposed RGI/R-RGI method unifies two important applications with state-of-the-art (SOTA) performance: (i) mask-free semantic inpainting, where the corruptions are unknown missing regions, the restored background can be used to restore the missing content; (ii) unsupervised pixel-wise anomaly detection, where the corruptions are unknown anomalous regions, the retrieved mask can be used as the anomalous region's segmentation mask.
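Research Motivation and Objectives
- Highlight the robustness gap of standard GAN-inversion under unknown gross corruptions.
- Propose RGI, which restores the clean image and identifies the corrupted region without a prior mask.
- Provide asymptotic convergence guarantees for both the restored image and the mask.
- Extend RGI to R-RGI, which fine-tunes the generator to reduce the GAN approximation gap.
- Demonstrate state-of-the-art performance on mask-free semantic inpainting and pixel-wise anomaly detection.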
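Proposed Method
- Jointly optimize the latent variable z and a sparse mask M with the objective L_rec((1−M)⊙x, (1−M)⊙G(z)) + λ||M||_1.
- Prove asymptotic convergence of the latent code: ẑ(λ) → z* as λ ↓ 0 (Theorem 1).
- Prove asymptotic convergence of the mask: M̂(λ) → M* as λ ↓ 0 (Theorem 2).
- Introduce Relaxed-RGI (R-RGI), which also optimizes the generator parameters θ to reduce the gap between the GAN manifold and the true image manifold (Equation 4).
- Discuss connections to robust statistics and robust machine learning (M-estimators, Winsorizing), and relate RGI to prior GAN-inversion methods.
- Demonstrate mask-free semantic inpainting and pixel-wise anomaly detection within a single unified framework.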
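To make the joint optimization over z and M more concrete, here is a minimal toy sketch. It is not the paper's implementation. A hypothetical linear map `generate` stands in for a pre-trained GAN generator, squared error stands in for L_rec, and the mask is relaxed to values in [0, 1]; all three are assumptions made for illustration. Under those assumptions the per-pixel mask update has a closed form, so the sketch alternates exact updates of z and M rather than running the gradient-based optimizer a real GAN would need.

```python
# Toy illustration only: a linear "generator" stands in for a pre-trained GAN,
# and squared error stands in for L_rec (both are assumptions, not from the paper).

W = [1.0, -0.5, 2.0, 0.7, 1.5, -1.2]  # hypothetical generator weights


def generate(z):
    """Toy generator G(z): maps a scalar latent to a 6-pixel 'image'."""
    return [w * z for w in W]


def update_mask(x, z, lam):
    """Per-pixel minimizer of ((1 - m) * r)^2 + lam * m over m in [0, 1]."""
    mask = []
    for x_i, g_i in zip(x, generate(z)):
        r2 = (x_i - g_i) ** 2
        # Setting the derivative to zero gives m = 1 - lam / (2 r^2), clipped at 0.
        mask.append(0.0 if r2 <= lam / 2 else 1.0 - lam / (2 * r2))
    return mask


def update_latent(x, mask):
    """Weighted least squares for z with pixel weights (1 - m)^2."""
    num = sum((1 - m) ** 2 * w * x_i for m, w, x_i in zip(mask, W, x))
    den = sum((1 - m) ** 2 * w * w for m, w in zip(mask, W))
    return num / den


def robust_inversion(x, lam=0.5, n_iters=20):
    """Alternate between the latent update and the mask update."""
    z, mask = 0.0, [0.0] * len(x)
    for _ in range(n_iters):
        z = update_latent(x, mask)
        mask = update_mask(x, z, lam)
    return z, mask


# Clean image generated from z* = 1.0, then pixel 2 is grossly corrupted.
x = generate(1.0)
x[2] = 10.0
z_hat, mask_hat = robust_inversion(x)
print(round(z_hat, 3), [round(m, 2) for m in mask_hat])
```

In this toy run the mask concentrates on the corrupted pixel and the latent estimate lands close to z*, which is the qualitative behaviour the method aims for. The example does not validate the theorems, and the choice `lam=0.5` is arbitrary.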
Experimental Results
Research Questions
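- RQ1: Can RGI restore the clean image and identify the corrupted region without a preset mask?
- RQ2: Under mild assumptions and a suitable λ, do the restored image and the mask converge to the ground truth?
- RQ3: Does the R-RGI relaxation further improve restoration quality by narrowing the GAN approximation gap?
- RQ4: Can the method unify mask-free semantic inpainting and pixel-wise anomaly detection with state-of-the-art performance?
- RQ5: What theoretical guarantees relate the optimized mask to the true corrupted region under unknown corruptions?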
Key Findings
| Dataset | Case | Metric | Yeh et al. w/o mask | Yeh et al. w/ mask | RGI | Pan et al. w/ mask | R-RGI |
|---|---|---|---|---|---|---|---|
| CelebA | Case (i) | PSNR ↑ | 11.50 | 20.82 | 19.70 | 21.74 | 20.05 |
| CelebA | Case (i) | SSIM ↑ | 0.358 | 0.492 | 0.451 | 0.570 | 0.509 |
| CelebA | Case (ii) | PSNR ↑ | 19.64 | 22.63 | 21.52 | 27.63 | 23.73 |
| CelebA | Case (ii) | SSIM ↑ | 0.440 | 0.536 | 0.490 | 0.766 | 0.655 |
| Cars | Case (i) | PSNR ↑ | 16.57 | 17.50 | 16.89 | 20.98 | 19.31 |
| Cars | Case (i) | SSIM ↑ | 0.359 | 0.377 | 0.363 | 0.636 | 0.618 |
| Cars | Case (ii) | PSNR ↑ | 17.36 | 17.71 | 17.52 | 21.61 | 21.18 |
| Cars | Case (ii) | SSIM ↑ | 0.361 | 0.382 | 0.363 | 0.650 | 0.588 |
| LSUN bedroom | Case (i) | PSNR ↑ | 16.15 | 19.27 | 17.67 | 21.36 | 18.72 |
| LSUN bedroom | Case (i) | SSIM ↑ | 0.405 | 0.428 | 0.416 | 0.587 | 0.567 |
| LSUN bedroom | Case (ii) | PSNR ↑ | 19.26 | 19.66 | 19.72 | 22.30 | 22.29 |
| LSUN bedroom | Case (ii) | SSIM ↑ | 0.419 | 0.433 | 0.420 | 0.599 | 0.557 |
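- RGI is robust to unknown gross corruptions, and the restored image converges asymptotically to the true background as λ → 0.
- In the limit of vanishing λ, the identified mask converges to the true corrupted-region mask, so exact mask recovery is achievable under mild conditions.
- R-RGI further improves restoration by fine-tuning the generator, which narrows the gap between the learned manifold and the true image manifold.
- For mask-free semantic inpainting, RGI performs comparably to the mask-aware baseline (Yeh et al. w/ mask) without needing a preset mask, and R-RGI approaches the mask-aware fine-tuning method (Pan et al. w/ mask).
- For unsupervised pixel-wise anomaly detection, RGI and especially R-RGI achieve strong Dice scores, with AUROC that is competitive with or ahead of SOTA baselines.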
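For readers unfamiliar with the metrics mentioned here, the sketch below shows how PSNR (reported in the table) and the Dice score (used for mask quality) are commonly computed. It assumes images are flattened lists with pixel values in [0, 1] and masks are 0/1 lists. The helper names `psnr` and `dice` are illustrative, and the paper's exact evaluation protocol is not given in this summary.

```python
import math


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def dice(pred_mask, true_mask):
    """Dice coefficient between two binary masks given as 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * intersection / total


reference = [0.0, 0.5, 1.0, 0.5]
restored = [0.1, 0.5, 0.9, 0.5]
print(psnr(reference, restored))

pred = [1, 1, 0, 0]
true = [1, 0, 0, 0]
print(dice(pred, true))
```

Higher is better for both metrics: PSNR grows as the mean squared error shrinks, and Dice reaches 1 when the predicted and true masks coincide.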
This summary was generated by AI and reviewed by a human editor.