QUICK REVIEW

[Paper Review] RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces

Ryota Natsume, Tatsuya Yatagawa | arXiv (Cornell University) | Apr 10, 2018

Face recognition and analysis | References: 15 | Cited by: 43

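One-sentence summary

RSGAN combines region-separative VAEs (one for the face, one for the hair) with a GAN so that faces can be swapped and attributes edited in latent space, giving robust face swapping and flexible editing without per-pair fine-tuning.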

ABSTRACT

In this paper, we present an integrated system for automatically generating and editing face images through face swapping, attribute-based editing, and random face parts synthesis. The proposed system is based on a deep neural network that variationally learns the face and hair regions with large-scale face image datasets. Different from conventional variational methods, the proposed network represents the latent spaces individually for faces and hairs. We refer to the proposed network as region-separative generative adversarial network (RSGAN). The proposed network independently handles face and hair appearances in the latent spaces, and then, face swapping is achieved by replacing the latent-space representations of the faces, and reconstruct the entire face image with them. This approach in the latent space robustly performs face swapping even for images which the previous methods result in failure due to inappropriate fitting of the 3D morphable models. In addition, the proposed system can further edit face-swapped images with the same network by manipulating visual attributes or by composing them with randomly generated face or hair parts.

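Research motivation and objectives

Motivate a unified system for automatic face swapping and appearance editing.
Propose a region-separative GAN that learns separate latent spaces for the face region and the hair region.
Achieve face swapping by exchanging latent representations and reconstructing the full image.
Support attribute-based editing and random part synthesis within the same network.
Demonstrate robustness across varying poses, lighting, and expressions while avoiding per-pair fine-tuning.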

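Proposed method

Two VAEs (the separator networks) encode face and hair appearance into separate latent spaces (z_f, z_h).
A GAN-based composer network reconstructs the full image from the pair of latent codes.
Training uses three reconstruction losses, one each for the face, the hair, and the whole image, together with a background mask so that foreground detail is emphasized.
A KL divergence loss regularizes the latent spaces; adversarial losses from a global and a local discriminator push the output toward realism.
A classifier network estimates visual attributes from the input image to enable attribute-based editing.
For face swapping, the latent codes from the two inputs are combined as x′ = G(z_xf, z_cf, z_xh, z_ch).
Optional gradient-domain stitching can refine hair/background consistency (RSGAN-GD).
The dataset is built by segmenting the face and hair regions of CelebA images and extracting patches for training.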
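
The swap formula and two of the loss terms listed above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `compose` is a stand-in that only concatenates its inputs (the real composer is a trained network), the codes are hand-written lists, and the reading that the f-subscripted codes come from one image and the h-subscripted codes from the other is inferred from the notation. The KL term is the standard closed form for a diagonal Gaussian against N(0, I), and `masked_l1` is one plausible form of a mask-weighted reconstruction loss, not necessarily the one the authors use.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) for one latent code."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def masked_l1(target, output, mask):
    """Mean absolute error over the pixels where mask is 1 (flat lists)."""
    total = sum(m * abs(t - o) for t, o, m in zip(target, output, mask))
    return total / max(sum(mask), 1)

def compose(z_xf, z_cf, z_xh, z_ch):
    """Toy stand-in for the composer G: it only concatenates its inputs."""
    return z_xf + z_cf + z_xh + z_ch

def swap(codes_a, codes_b):
    """Take the face-subscripted codes from A and the hair-subscripted codes from B."""
    return compose(codes_a["z_xf"], codes_a["z_cf"],
                   codes_b["z_xh"], codes_b["z_ch"])

# Hypothetical, hand-written codes for two images.
codes_a = {"z_xf": [0.1, 0.2], "z_cf": [1.0], "z_xh": [0.3, 0.4], "z_ch": [0.0]}
codes_b = {"z_xf": [0.9, 0.8], "z_cf": [0.0], "z_xh": [0.7, 0.6], "z_ch": [1.0]}

swapped = swap(codes_a, codes_b)
print(swapped)  # [0.1, 0.2, 1.0, 0.7, 0.6, 1.0]

# A code that already matches the prior incurs no KL penalty.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # 0.0

# The third pixel is masked out, so its large error is ignored.
print(masked_l1([1.0, 2.0, 3.0], [1.5, 2.0, 0.0], [1, 1, 0]))  # 0.25
```

Because the swap happens on codes rather than pixels, no per-pair alignment or fine-tuning step appears anywhere in the data flow, which is the property the summary highlights.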

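Experimental results

Research questions

RQ1: Does separating the face and hair latent representations improve the robustness and quality of face swapping under variations in pose, lighting, and expression?
RQ2: Can the same latent-space framework support attribute-based editing and random part synthesis without additional per-pair fine-tuning?
RQ3: Compared with previous methods, how does region-separative modeling affect identity preservation and swap consistency?
RQ4: What is the effect of using a variational latent space rather than a non-variational encoder on these tasks?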

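Key findings

RSGAN produces natural-looking face swapping results across diverse poses and lighting conditions.
Visual attributes can be edited by manipulating the corresponding latent codes, giving targeted changes to the face or the hair without affecting the other region.
Randomly sampling the face or hair latent space generates new appearances for that region while preserving the other.
RSGAN shows competitive swap consistency on the reported metrics and outperforms several baseline generative models, although specialized 3DMM-based methods may retain higher identity fidelity in some cases.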
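
The random-sampling finding above can be sketched as follows. This is a minimal illustration under stated assumptions: the prior is taken to be a standard normal (the usual target of a VAE's KL regularizer), the latent size and the face code are made up, and decoding each pair into an image would need a trained composer, which is omitted.

```python
import random

def sample_prior(dim, rng):
    """Draw one code from N(0, I), the assumed prior of the latent space."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def resample_hair(z_face, hair_dim, n, seed=0):
    """Keep the face code fixed and pair it with n freshly sampled hair codes."""
    rng = random.Random(seed)
    return [(list(z_face), sample_prior(hair_dim, rng)) for _ in range(n)]

z_face = [0.1, -0.3, 0.5]  # hypothetical encoded face code
variants = resample_hair(z_face, hair_dim=3, n=4)

for z_f, z_h in variants:
    # A trained composer would turn each (z_f, z_h) pair into an image.
    print(z_f, [round(v, 2) for v in z_h])
```

Swapping the roles of the two codes (fixing the hair code and sampling the face code) gives the complementary case of new faces under the same hairstyle.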

This review was generated by AI and reviewed by human editors.