QUICK REVIEW

[Paper Review] Closed-Form Factorization of Latent Semantics in GANs

Yujun Shen, Bolei Zhou | arXiv (Cornell University) | Jul 13, 2020

Generative Adversarial Networks and Image Synthesis | References: 27 | Cited by: 63

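One-Sentence Summary

Introduces SeFa, a closed-form, unsupervised method that discovers latent semantic directions in GANs by decomposing the weights of the first transformation layer, enabling diverse image edits without any training or data sampling.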

ABSTRACT

A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images. In order to identify such latent dimensions for image editing, previous methods typically annotate a collection of synthesized samples and train linear classifiers in the latent space. However, they require a clear definition of the target attribute as well as the corresponding manual annotations, limiting their applications in practice. In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner. In particular, we take a closer look into the generation mechanism of GANs and further propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights. With a lightning-fast implementation, our approach is capable of not only finding semantically meaningful dimensions comparably to the state-of-the-art supervised methods, but also resulting in far more versatile concepts across multiple GAN models trained on a wide range of datasets.

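Research Motivation and Objectives

Reveal the latent semantic directions learned by GANs without supervision or data sampling.
Analyze the first projection step of the GAN generator to identify the latent factors with the largest influence.
Show that the discovered semantics generalize across multiple GAN architectures and datasets.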

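Proposed Method

Model the GAN generator as a sequence of layer-wise projections and focus on the first affine transformation, G1(z) = Az + b.
Pose the unsupervised search as maximizing ||An||_2^2 over unit vectors n (n^T n = 1), so that a semantic direction n is one that causes a large change after the first projection.
Extend this to k directions by solving for the top-k eigenvectors of A^T A.
Conclude that the optimal directions are the eigenvectors of A^T A with the largest eigenvalues (SeFa).
Apply SeFa to several GAN architectures (PGGAN, StyleGAN, StyleGAN2, BigGAN) by using the weights of the target layers, or of the associated layers in the StyleGAN family.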
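The steps above can be sketched in a few lines of NumPy. Adding a Lagrange multiplier for the unit-norm constraint turns the maximization of ||An||_2^2 into the eigenvalue problem A^T A n = λn, so the directions are simply the leading eigenvectors of A^T A. The snippet below is a minimal illustration, not the authors' implementation: the function names `sefa_directions` and `edit_latent` are hypothetical, the random matrix only stands in for a pre-trained weight matrix A, and the edit rule z' = z + αn is the usual linear latent edit assumed here for illustration.

```python
import numpy as np


def sefa_directions(A, k):
    """Return the k unit directions n that maximize ||A n||_2^2.

    A is an (m, d) weight matrix; the result has shape (k, d), with rows
    sorted by decreasing eigenvalue of A^T A. (Hypothetical helper name.)
    """
    gram = A.T @ A                            # (d, d), symmetric positive semi-definite
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest eigenvalues
    return eigvecs[:, order].T, eigvals[order]


def edit_latent(z, direction, alpha):
    """Move a latent code along a direction: z' = z + alpha * n (assumed edit rule)."""
    return z + alpha * direction


# Stand-in for pre-trained weights: a random 64 x 16 matrix (assumption for the demo).
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 16))

directions, eigvals = sefa_directions(A, k=3)

# Edit a random latent code along the strongest direction.
z = rng.standard_normal(16)
z_edited = edit_latent(z, directions[0], alpha=2.0)
```

With a real generator, `A` would be replaced by the weight matrix of the layer being analyzed, and `z_edited` would be fed back through the generator to render the edited image. No images are sampled and nothing is trained at any point, which is what makes the method closed-form.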

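Experimental Results

Research Questions

RQ1: Can latent semantic directions be discovered without labeled data or attribute predictors?
RQ2: What is the minimal mechanism, based only on model weights, that exposes meaningful latent semantics in a GAN?
RQ3: Do the discovered directions generalize across different GAN architectures and datasets?
RQ4: How do unsupervised SeFa directions compare with supervised methods in editing quality and diversity?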

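Key Findings

By decomposing the first linear transformation in the generator, SeFa identifies diverse, human-interpretable latent directions.
The discovered directions form a hierarchical, layer-dependent structure, consistent with earlier observations on StyleGAN-based models.
On several attributes, SeFa achieves editing ability comparable to supervised methods while using no data or labels at all.
SeFa exposes a broader set of semantics than some supervised methods, making it possible to manipulate attributes that binary predictors do not easily cover.
Qualitative analysis and a user study indicate that, in some cases, edits along SeFa directions preserve identity and other attributes better than certain sampling-based baselines.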


This review was generated by AI and reviewed by human editors.