QUICK REVIEW

[Paper Review] DRIT++: Diverse Image-to-Image Translation via Disentangled Representations

Hsin-Ying Lee, Hung-Yu Tseng|arXiv (Cornell University)|May 2, 2019

Generative Adversarial Networks and Image Synthesis | References: 64 | Cited by: 185

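One-Sentence Summary

DRIT++ learns multimodal, unpaired image-to-image translation by disentangling content (domain-invariant) and attribute (domain-specific) representations, which yields diverse, realistic outputs and enables cross-domain translation.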

ABSTRACT

Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: 1) lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative evaluations, we measure realism with user study and Fréchet inception distance, and measure diversity with the perceptual distance metric, Jensen-Shannon divergence, and number of statistically-different bins.

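Research Motivation and Goals

Address the lack of paired training data in image-to-image translation.
Produce multimodal, diverse outputs from a single input without supervision.
Disentangle representations into domain-invariant content and domain-specific attributes.
Extend the approach to multi-domain image-to-image translation.
Improve diversity without sacrificing realism through regularization and cross-cycle constraints.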

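Proposed Method

Use encoders to embed images into a shared content space and domain-specific attribute spaces.
Enforce a shared content space across domains with a content discriminator and weight sharing.
Apply a cross-cycle consistency loss by swapping attribute representations between domains and then reconstructing the inputs.
Add mode-seeking regularization to increase output diversity.
Extend the framework to multi-domain translation using a single generator and a domain classifier.
Optionally use learned attribute vectors for example-guided attribute transfer.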
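
The cross-cycle consistency step can be illustrated with a toy sketch. Everything here is an assumption made for illustration: images are short lists of floats, the "encoders" simply split a vector in half, and one shared `generate` function stands in for the learned generators. The names `encode_content`, `encode_attribute`, `generate`, `swap_attributes`, and `cross_cycle_loss` are hypothetical and do not come from the paper's code. The point is only the data flow: swap attributes, swap them back, and compare with the inputs.

```python
from typing import List, Tuple

Vector = List[float]


def encode_content(image: Vector) -> Vector:
    """Toy content encoder (assumption): the first half of the vector."""
    return image[: len(image) // 2]


def encode_attribute(image: Vector) -> Vector:
    """Toy attribute encoder (assumption): the second half of the vector."""
    return image[len(image) // 2:]


def generate(content: Vector, attribute: Vector) -> Vector:
    """Toy generator (assumption): concatenate content and attribute."""
    return content + attribute


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def swap_attributes(x: Vector, y: Vector) -> Tuple[Vector, Vector]:
    """First translation: exchange the attribute codes of x and y."""
    u = generate(encode_content(y), encode_attribute(x))  # y's content, x's attribute
    v = generate(encode_content(x), encode_attribute(y))  # x's content, y's attribute
    return u, v


def cross_cycle_loss(x: Vector, y: Vector) -> float:
    """Swap attributes twice and measure how well x and y are recovered."""
    u, v = swap_attributes(x, y)
    # Second translation: swapping again should give back the original inputs.
    x_rec = generate(encode_content(v), encode_attribute(u))
    y_rec = generate(encode_content(u), encode_attribute(v))
    return l1_distance(x, x_rec) + l1_distance(y, y_rec)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [5.0, 6.0, 7.0, 8.0]
    print(swap_attributes(x, y))
    print(cross_cycle_loss(x, y))
```

Because the toy encoders separate content and attribute perfectly, the reconstruction is exact and the loss is zero. With learned networks the loss is generally nonzero, and minimizing it is what pushes the encoders toward a clean content/attribute split.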

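Experimental Results

Research Questions

RQ1: Can diverse, realistic I2I translation be learned from unpaired data through disentangled representations?
RQ2: Does separating content from attributes enable multimodal outputs and attribute transfer, both across domains and within a domain?
RQ3: Can the method be extended to multi-domain I2I translation with a single generator?
RQ4: How do the content discriminator and mode-seeking regularization affect realism and diversity?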

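Key Findings

DRIT++ produces diverse and realistic translations on multiple unpaired I2I tasks.
Cross-cycle consistency over disentangled representations enables reliable reconstruction from non-corresponding image pairs.
Mode-seeking regularization significantly improves diversity and mitigates mode collapse.
The content discriminator reduces leakage of domain-specific information into the shared content space, which aligns the representations of the domains.
Multi-domain translation with a single generator yields diverse results across several domains (real images and artistic styles, different weather conditions).
Quantitative metrics (FID, LPIPS, JSD, NDB) show DRIT++ outperforming several baselines on the tested tasks.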
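
To make the mode-seeking finding more concrete, the sketch below shows one commonly described form of such a regularizer: the ratio of the distance between two outputs to the distance between the two attribute vectors that produced them, with the loss taken as the inverse of that ratio. This is an assumption for illustration, not a transcription of the paper's loss. The distance function, the `eps` constant, and the names `mean_abs_diff` and `mode_seeking_loss` are all hypothetical choices.

```python
from typing import List

Vector = List[float]


def mean_abs_diff(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def mode_seeking_loss(out1: Vector, out2: Vector,
                      z1: Vector, z2: Vector,
                      eps: float = 1e-5) -> float:
    """Inverse of (output distance / latent distance).

    A small value means that different attribute vectors lead to clearly
    different outputs; a large value signals collapse onto one mode.
    eps (an assumed constant) guards against division by zero.
    """
    ratio = mean_abs_diff(out1, out2) / (mean_abs_diff(z1, z2) + eps)
    return 1.0 / (ratio + eps)


if __name__ == "__main__":
    z1, z2 = [0.0, 0.0], [0.5, 0.5]
    # Collapsed generator: both attribute vectors give the same output.
    collapsed = mode_seeking_loss([1.0, 1.0], [1.0, 1.0], z1, z2)
    # Diverse generator: the outputs differ when the attribute vectors differ.
    diverse = mode_seeking_loss([1.0, 1.0], [2.0, 2.0], z1, z2)
    print(collapsed > diverse)
```

Minimizing this quantity rewards a generator whose outputs move apart as the sampled attribute vectors move apart, which is the behavior the diversity metrics above are meant to capture.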


This review was generated by AI and reviewed by a human editor.