QUICK REVIEW

[Paper Review] RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes

Po-Wei Wu, Yujing Lin|arXiv (Cornell University)|Aug 20, 2019

Generative Adversarial Networks and Image Synthesis | References: 29 | Cited by: 48

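One-Sentence Summary

RelGAN introduces relative attributes for multi-domain image-to-image translation, enabling continuous, targeted edits without specifying the full attribute set, and it outperforms earlier target-attribute-based methods in realism and interpolation quality.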

ABSTRACT

Multi-domain image-to-image translation has gained increasing attention recently. Previous methods take an image and some target attributes as inputs and generate an output image with the desired attributes. However, such methods have two limitations. First, these methods assume binary-valued attributes and thus cannot yield satisfactory results for fine-grained control. Second, these methods require specifying the entire set of target attributes, even if most of the attributes would not be changed. To address these limitations, we propose RelGAN, a new method for multi-domain image-to-image translation. The key idea is to use relative attributes, which describes the desired change on selected attributes. Our method is capable of modifying images by changing particular attributes of interest in a continuous manner while preserving the other attributes. Experimental results demonstrate both the quantitative and qualitative effectiveness of our method on the tasks of facial attribute transfer and interpolation.

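Motivation and Objectives

Address the limitations of multi-domain translation based on binary target attributes by using relative attribute changes.
Enable continuous, fine-grained control over attribute edits while preserving non-target attributes.
Improve the quality of interpolation between the original and edited images through a dedicated discriminator and loss term.
Demonstrate effectiveness on facial attribute transfer, reconstruction, and interpolation across multiple high-quality datasets.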

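Proposed Method

Represent a domain by an n-dimensional attribute vector a, and define the relative attributes v = \hat{a} - a, where \hat{a} is the target attribute vector, to specify the desired change.
Use a single generator G conditioned on (x, v), together with three discriminators: Real (unconditional realism), Match (whether the triplet (x, v, x') is a matching translation), and Interp (predicts the degree of interpolation).
Train with an adversarial loss for realism (Real), a conditional matching loss (Match) that uses real and wrong triplets, and an interpolation loss (Interp) that constrains attribute changes to be smooth.
Apply reconstruction regularization: a cycle-reconstruction L1 loss between x and G(G(x, v), -v), and a self-reconstruction loss when v = 0, to preserve identity and background details.
Introduce an interpolation discriminator that predicts the interpolation degree α of G(x, αv), encouraging smooth, realistic transitions.
Add an orthogonal regularization term and stabilize training with LSGAN-GP; use switchable normalization in the generator.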
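
The relative-attribute and reconstruction terms above can be illustrated with a minimal sketch. It is not the authors' implementation: toy_generator is a hypothetical stand-in for the trained network G (it only shifts a flat list of "pixel" values), and all function names are assumptions chosen for illustration. The sketch shows how v is formed from a and its target, how αv gives an intermediate edit, and how the cycle and self reconstruction L1 terms are evaluated.

```python
from typing import Callable, List

Vector = List[float]
Generator = Callable[[Vector, Vector], Vector]


def relative_attributes(a: Vector, a_hat: Vector) -> Vector:
    """v = a_hat - a; a zero entry means 'leave this attribute unchanged'."""
    return [target - source for source, target in zip(a, a_hat)]


def scale(v: Vector, alpha: float) -> Vector:
    """alpha * v, used for an interpolated edit G(x, alpha * v)."""
    return [alpha * c for c in v]


def toy_generator(x: Vector, v: Vector) -> Vector:
    """Hypothetical stand-in for G(x, v): shifts every value by sum(v)."""
    shift = sum(v)
    return [p + shift for p in x]


def l1(x: Vector, y: Vector) -> float:
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(p - q) for p, q in zip(x, y)) / len(x)


def cycle_loss(G: Generator, x: Vector, v: Vector) -> float:
    """L1 distance between x and G(G(x, v), -v)."""
    return l1(x, G(G(x, v), [-c for c in v]))


def self_loss(G: Generator, x: Vector, n_attributes: int) -> float:
    """L1 distance between x and G(x, 0)."""
    return l1(x, G(x, [0.0] * n_attributes))


a = [0.0, 1.0, 0.0]        # current attributes of the input image
a_hat = [1.0, 1.0, 0.0]    # target: switch on the first attribute only
v = relative_attributes(a, a_hat)
x = [0.5, 0.25, 0.75]      # toy "image"
halfway = toy_generator(x, scale(v, 0.5))
```

Because v is zero for the second and third attributes, only the first attribute is requested to change, and scaling v by α between 0 and 1 requests a partial edit. The toy generator is exactly invertible, so both reconstruction terms come out as zero here; a trained network would only approximate that.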

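Experimental Results

Research Questions

RQ1: Compared with binary target attributes, do relative attributes provide finer-grained, continuous attribute control in multi-domain image translation?
RQ2: How does the model selectively modify only the attributes of interest while preserving the unchanged attributes and overall identity?
RQ3: Does adding the interpolation discriminator improve the quality and smoothness of attribute interpolation?
RQ4: What empirical gains does RelGAN achieve in facial attribute transfer, reconstruction, and interpolation across diverse datasets?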

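Key Findings

RelGAN achieves a lower Fréchet Inception Distance (FID) than StarGAN and AttGAN in the CelebA, CelebA-HQ, and FFHQ settings, indicating higher visual quality.
RelGAN attains the highest classification accuracy on generated images across multiple attributes, indicating higher fidelity of the translated attributes.
RelGAN preserves unchanged attributes more effectively than previous methods and shows smoother, more realistic interpolation between the original and edited images.
Ablation studies show that the full loss (Real + Match + Cycle/Self + Interp, with orthogonal regularization) yields the best reconstruction and interpolation results.
The user study favors RelGAN in most attribute transfer and reconstruction tasks, and it is preferred overall across multiple tasks.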
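
For readers unfamiliar with why a lower FID indicates higher visual quality, here is a simplified sketch of the Fréchet distance between two sets of feature vectors. It assumes diagonal covariances, in which case the distance reduces to the squared difference of means plus the squared difference of standard deviations per dimension. The standard FID uses Inception network features and full covariance matrices, so this is an illustration of the idea and not the evaluation code used in the paper; the function names are hypothetical.

```python
from typing import List, Tuple

Features = List[List[float]]


def mean_and_std(features: Features) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and (population) standard deviation."""
    n = len(features)
    d = len(features[0])
    means = [sum(f[i] for f in features) / n for i in range(d)]
    stds = [
        (sum((f[i] - means[i]) ** 2 for f in features) / n) ** 0.5
        for i in range(d)
    ]
    return means, stds


def diagonal_frechet_distance(real: Features, fake: Features) -> float:
    """Fréchet distance between two Gaussians, assuming diagonal covariances."""
    mu_r, sd_r = mean_and_std(real)
    mu_g, sd_g = mean_and_std(fake)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    spread_term = sum((a - b) ** 2 for a, b in zip(sd_r, sd_g))
    return mean_term + spread_term


real = [[0.0, 0.0], [2.0, 4.0]]
shifted = [[1.0, 0.0], [3.0, 4.0]]  # same spread, mean moved by 1 in dim 0
same = diagonal_frechet_distance(real, real)
moved = diagonal_frechet_distance(real, shifted)
```

Identical feature statistics give a distance of zero, and the distance grows as the generated features drift away from the real ones in mean or spread.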


This review was generated by AI and checked by a human editor.