QUICK REVIEW

[Paper Review] FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping

Lingzhi Li, Jianmin Bao|arXiv (Cornell University)|Dec 31, 2019
Face recognition and analysis|References: 40|Cited by: 210
ONE-SENTENCE SUMMARY

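FaceShifter introduces a two-stage face swapping framework: AEI-Net performs high-fidelity synthesis by adaptively embedding target attributes and source identity information, and HEAR-Net then applies self-supervised occlusion refinement, yielding higher fidelity and better identity preservation.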

ABSTRACT

In this work, we propose a novel two-stage framework, called FaceShifter, for high fidelity and occlusion aware face swapping. Unlike many existing face swapping works that leverage only limited information from the target image when synthesizing the swapped face, our framework, in its first stage, generates the swapped face in high-fidelity by exploiting and integrating the target attributes thoroughly and adaptively. We propose a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization (AAD) layers to adaptively integrate the identity and the attributes for face synthesis. To address the challenging facial occlusions, we append a second stage consisting of a novel Heuristic Error Acknowledging Refinement Network (HEAR-Net). It is trained to recover anomaly regions in a self-supervised way without any manual annotations. Extensive experiments on wild faces demonstrate that our face swapping results are not only considerably more perceptually appealing, but also better identity preserving in comparison to other state-of-the-art methods.

MOTIVATION AND OBJECTIVES

  • Aim to improve fidelity and realism in face swapping while preserving source identity.
  • Infuse target image attributes (pose, expression, lighting, background) adaptively during synthesis.
  • Handle occlusions without manual annotations through self-supervised refinement.
  • Produce subject-agnostic swapping that works on new face pairs without per-subject training.

PROPOSED METHOD

  • Adaptive Embedding Integration Network (AEI-Net) with a multi-level Attributes Encoder and an Adaptive Attentional Denormalization (AAD) Generator to integrate identity and target attributes.
  • Identity encoder extracts source identity; multi-level attributes encoder preserves spatial attribute information.
  • AAD layers perform adaptive denormalization with an attention mask to fuse identity and attributes across feature levels.
  • Two-stage pipeline: Stage 1 generates a high-fidelity swapped face; Stage 2 (HEAR-Net) refines occlusions using heuristic error guidance without manual annotations.
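
The fusion step in an AAD layer and the heuristic error signal fed to HEAR-Net can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the blend takes the form (1 - M) * A + M * I, where A and I are the attribute- and identity-modulated activations and M is a sigmoid attention mask, and it assumes the heuristic error is the difference between the target image and the stage-1 output obtained when the target is used as both source and target. The function names (`aad_blend`, `heuristic_error`) are hypothetical, and the modulation parameters and mask logits are passed in directly, whereas a real layer would predict them with learned convolutions and fully connected layers.

```python
import numpy as np


def instance_norm(h, eps=1e-5):
    """Normalize each channel of a (C, H, W) feature map over its spatial dims."""
    mean = h.mean(axis=(1, 2), keepdims=True)
    var = h.var(axis=(1, 2), keepdims=True)
    return (h - mean) / np.sqrt(var + eps)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def aad_blend(h, gamma_att, beta_att, gamma_id, beta_id, mask_logits):
    """Fuse attribute and identity information with an attention mask.

    h:                   (C, H, W) input activation
    gamma_att, beta_att: (C, H, W) per-pixel modulation from target attributes
    gamma_id, beta_id:   (C, 1, 1) per-channel modulation from source identity
    mask_logits:         (1, H, W) pre-sigmoid attention mask (assumed given here;
                         a trained layer would predict it from the activation)
    """
    h_norm = instance_norm(h)
    attr = gamma_att * h_norm + beta_att   # attribute-modulated activation A
    ident = gamma_id * h_norm + beta_id    # identity-modulated activation I
    m = sigmoid(mask_logits)               # attention mask M in (0, 1)
    return (1.0 - m) * attr + m * ident


def heuristic_error(target, self_reconstruction):
    """Residual between the target image and its stage-1 self-reconstruction.

    Assumption: occluders the generator fails to reproduce show up as large
    values in this residual, which the refinement stage can use as a cue.
    """
    return target - self_reconstruction
```

Where the mask is close to 1 the output follows the identity branch, and where it is close to 0 it follows the attribute branch, so a layer of this shape can decide per pixel which source of information dominates.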

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

  • RQ1: Can adaptive, multi-level attribute integration improve fidelity and lighting coherence in face swapping?
  • RQ2: Does a self-supervised refinement stage effectively recover target occlusions and rare artifacts without extra labels?
  • RQ3: Is the two-stage FaceShifter framework robust across wild faces and varied occlusions?

KEY FINDINGS

  • FaceShifter achieves higher identity preservation and better target attribute fidelity than prior methods on the FaceForensics++ dataset.
  • Quantitatively, FaceShifter attains a higher ID retrieval score (97.38) than the baselines, with low pose (2.96) and expression (2.06) errors.
  • User studies indicate substantial realism and identity/attribute alignment advantages for FaceShifter over existing methods.
  • HEAR-Net effectively recovers occlusions and color shifts, improving results for challenging occlusions and large pose variations.
  • The AEI-Net with multi-level attributes and adaptive fusion outperforms single-level or non-adaptive baselines.
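
The summary does not spell out how the reported metrics are computed, so the following is only a sketch under stated assumptions: ID retrieval is taken to be nearest-neighbor retrieval by cosine similarity between face embeddings, and pose and expression errors are taken to be mean L2 distances between estimated vectors for the swapped and target faces. The function names are hypothetical, and the embeddings and pose or expression vectors are assumed to come from external pretrained models that are not shown here.

```python
import numpy as np


def id_retrieval_accuracy(swapped_emb, source_emb, true_source_idx):
    """Fraction of swapped faces whose most similar source embedding
    (by cosine similarity) belongs to the true source identity.

    swapped_emb:     (N, D) embeddings of swapped faces
    source_emb:      (M, D) embeddings of candidate source faces
    true_source_idx: length-N indices into source_emb
    """
    a = np.asarray(swapped_emb, dtype=float)
    b = np.asarray(source_emb, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    sims = a @ b.T                      # (N, M) cosine similarities
    predicted = sims.argmax(axis=1)     # nearest source for each swapped face
    return float((predicted == np.asarray(true_source_idx)).mean())


def mean_l2_error(pred, ref):
    """Mean Euclidean distance between matching rows, e.g. pose or
    expression vectors of swapped faces versus their target faces."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.linalg.norm(diff, axis=1).mean())
```

Under these assumptions, a higher retrieval accuracy indicates better identity preservation, while lower L2 errors indicate that the swapped face stays closer to the target's pose and expression.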


This review was generated by AI and reviewed by human editors.