QUICK REVIEW

[Paper Review] Harmonic Unpaired Image-to-image Translation

Rui Zhang, Tomas Pfister|arXiv (Cornell University)|Feb 26, 2019

Generative Adversarial Networks and Image Synthesis | References: 44 | Cited by: 30

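One-Sentence Summary

This paper proposes HarmonicGAN, a novel unpaired image-to-image translation framework that applies harmonic-function regularization over a sample graph to enforce smooth, consistent mappings, markedly reducing artifacts and improving translation quality. Without additional supervision, it outperforms CycleGAN and state-of-the-art methods on medical imaging, object transfiguration, and semantic labeling tasks; on the medical task it halves the MSE and is preferred by radiologists in 95% of cases.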

ABSTRACT

The recent direction of unpaired image-to-image translation is on one hand very exciting as it alleviates the big burden in obtaining label-intensive pixel-to-pixel supervision, but it is on the other hand not fully satisfactory due to the presence of artifacts and degenerated transformations. In this paper, we take a manifold view of the problem by introducing a smoothness term over the sample graph to attain harmonic functions to enforce consistent mappings during the translation. We develop HarmonicGAN to learn bi-directional translations between the source and the target domains. With the help of similarity-consistency, the inherent self-consistency property of samples can be maintained. Distance metrics defined on two types of features including histogram and CNN are exploited. Under an identical problem setting as CycleGAN, without additional manual inputs and only at a small training-time cost, HarmonicGAN demonstrates a significant qualitative and quantitative improvement over the state of the art, as well as improved interpretability. We show experimental results in a number of applications including medical imaging, object transfiguration, and semantic labeling. We outperform the competing methods in all tasks, and for a medical imaging task in particular our method turns CycleGAN from a failure to a success, halving the mean-squared error, and generating images that radiologists prefer over competing methods in 95% of cases.

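Motivation and Objectives

Address the limitations of existing unpaired image-to-image translation methods, in particular artifacts and degenerate transformations.
Propose a manifold-based approach that uses graph-based harmonic functions to enforce smooth, consistent mappings between domains.
Preserve the self-consistency of samples through similarity-consistency regularization, without manual supervision.
Improve qualitative and quantitative performance under the same unpaired setting as CycleGAN, at only a small training-time cost.
Achieve state-of-the-art performance across diverse applications, including a challenging medical imaging task where prior methods fail.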

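Proposed Method

Build a sample graph in which nodes represent images and edges represent feature-based similarity, using histogram and CNN features.
Define a smoothness term over the graph to attain harmonic functions, so that mappings stay consistent and artifacts are suppressed.
Integrate the harmonic regularization into a GAN framework that learns bi-directional translations between the source and target domains.
Use similarity-consistency to preserve the inherent self-consistency of data points during translation.
Compute the graph's edge weights from distance metrics on histogram and deep CNN features, improving feature fidelity.
Train the model end-to-end, combining the adversarial loss with the harmonic regularization while keeping the extra training overhead small.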
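
The graph-smoothness idea in the steps above can be illustrated with a minimal NumPy sketch. It uses the classic graph energy E = 1/2 * sum_ij w_ij * ||f_i - f_j||^2 with Gaussian edge weights on histogram features. This is an assumption for illustration, not the paper's exact loss: the function names, the `bins` and `sigma` parameters, the Gaussian weighting, and the squared Euclidean distance are all hypothetical choices, and the CNN-feature variant is omitted.

```python
import numpy as np

def histogram_features(patches, bins=8):
    """Normalized intensity histogram per sample (values assumed in [0, 1])."""
    feats = []
    for p in patches:
        counts, _ = np.histogram(p, bins=bins, range=(0.0, 1.0))
        feats.append(counts / max(counts.sum(), 1))
    return np.asarray(feats, dtype=float)

def pairwise_sq_dists(a):
    """Squared Euclidean distance between every pair of rows of a 2-D array."""
    diff = a[:, None, :] - a[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def edge_weights(features, sigma=0.1):
    """Gaussian affinity (an assumed choice): similar samples get weights near 1."""
    w = np.exp(-pairwise_sq_dists(features) / sigma ** 2)
    np.fill_diagonal(w, 0.0)  # no self-edges
    return w

def smoothness_energy(weights, outputs):
    """0.5 * sum_ij w_ij * ||f_i - f_j||^2; small when similar samples map alike."""
    return 0.5 * np.sum(weights * pairwise_sq_dists(outputs))

# Two identical dark samples and one bright sample (toy data).
patches = [np.zeros((4, 4)), np.zeros((4, 4)), np.ones((4, 4))]
w = edge_weights(histogram_features(patches))

consistent = np.array([[1.0], [1.0], [5.0]])    # similar inputs, similar outputs
inconsistent = np.array([[1.0], [5.0], [5.0]])  # similar inputs, different outputs
print(smoothness_energy(w, consistent), smoothness_energy(w, inconsistent))
```

In a training loop, a term of this kind would be added to the adversarial loss as a regularizer, penalizing translations that send similar inputs to dissimilar outputs.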

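Experimental Results

Research Questions

RQ1: Does enforcing harmonic functions over the sample graph improve the consistency and quality of unpaired image-to-image translation?
RQ2: How effective is the proposed harmonic regularization at reducing artifacts and degenerate transformations compared with existing methods?
RQ3: To what extent does the similarity-consistency property strengthen self-consistent translation without manual supervision?
RQ4: Can the method achieve state-of-the-art performance in challenging domains where CycleGAN fails, such as medical imaging?
RQ5: Does the method improve interpretability and generalization across different image translation tasks?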

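Key Findings

HarmonicGAN markedly reduces artifacts and improves translation quality, outperforming CycleGAN and other state-of-the-art methods on all evaluated tasks.
In medical imaging, HarmonicGAN reduces the mean-squared error by 50% relative to CycleGAN, turning a previously failing approach into a successful one.
Radiologists prefer HarmonicGAN's images in 95% of cases, indicating strong clinical relevance.
The method achieves better quantitative results on all benchmarks without additional manual inputs or a significant increase in training cost.
Integrating harmonic regularization makes the translation more interpretable and consistent, with particular gains in complex domains such as semantic labeling.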

This review was generated by AI and reviewed by human editors.