QUICK REVIEW

[Paper Review] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

David Bau, Jun-Yan Zhu|arXiv (Cornell University)|Nov 26, 2018

Generative Adversarial Networks and Image Synthesis|References: 42|Cited by: 182

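One-Sentence Summary

This paper presents a framework for diagnosing, comparing, and improving GANs by identifying interpretable units, measuring their causal effect through interventions, and exploring their contextual relationships.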

ABSTRACT

Generative Adversarial Networks (GANs) have recently achieved impressive results for many real-world applications, and many GAN variants have emerged with improvements in sample quality and training stability. However, they have not been well visualized or understood. How does a GAN represent our visual world internally? What causes the artifacts in GAN results? How do architectural choices affect GAN learning? Answering such questions could enable us to develop new insights and better models. In this work, we present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level. We first identify a group of interpretable units that are closely related to object concepts using a segmentation-based network dissection method. Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output. We examine the contextual relationship between these units and their surroundings by inserting the discovered object concepts into new images. We show several practical applications enabled by our framework, from comparing internal representations across different layers, models, and datasets, to improving GANs by locating and removing artifact-causing units, to interactively manipulating objects in a scene. We provide open source interpretation tools to help researchers and practitioners better understand their GAN models.

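Motivation and Objectives

Identify units in the GAN generator that correspond to object concepts via segmentation-based dissection.
Quantify the causal effect of the identified units on object presence through interventions (ablation and insertion).
Study the contextual relationship between object concepts and the surrounding scene to understand insertion effects.
Demonstrate applications that compare representations across layers, models, and datasets.
Provide open-source tools to support GAN interpretation and debugging.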

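Proposed Method

Dissection: upsample and threshold each generator unit's feature map, then use IoU to measure its agreement with a semantic segmentation map.
Compute the IoU of every unit against a set of semantic concepts to label the interpretable units.
Intervention: ablate (set to zero) or insert (set to a constant) a chosen set of units, and use segmentation masks to compute the average causal effect (ACE) on object presence.
Optimize a continuous intervention vector alpha to identify the smallest set of units with the largest ACE for a target concept.
Apply insertion/ablation at selected pixel locations P in the generator's feature map to assess causality.
Compare across layers, GAN variants, and datasets to reveal how representations evolve and how artifacts can be mitigated.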
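
The dissection step can be illustrated with a minimal pure-Python sketch. It assumes nearest-neighbour upsampling and a single fixed threshold per unit, neither of which this summary specifies; the helper names (`upsample_nearest`, `unit_iou`, `label_unit`) and the toy data are hypothetical and are not taken from the authors' released tools.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D grid by an integer factor."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]


def unit_iou(feature_maps, concept_masks, threshold):
    """IoU between one unit's thresholded activations and a concept mask.

    Intersection and union are summed over all images before dividing,
    so the score describes the unit over the whole batch.
    """
    intersection = union = 0
    for fmap, mask in zip(feature_maps, concept_masks):
        factor = len(mask) // len(fmap)  # assumes square, evenly divisible sizes
        upsampled = upsample_nearest(fmap, factor)
        for act_row, mask_row in zip(upsampled, mask):
            for act, in_concept in zip(act_row, mask_row):
                active = act > threshold
                intersection += active and bool(in_concept)
                union += active or bool(in_concept)
    return intersection / union if union else 0.0


def label_unit(feature_maps, masks_by_concept, threshold):
    """Label a unit with the concept that gives it the highest IoU."""
    scores = {concept: unit_iou(feature_maps, masks, threshold)
              for concept, masks in masks_by_concept.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: one 2x2 feature map for a single unit, and 4x4 segmentation masks.
fmap = [[0.9, 0.1],
        [0.2, 0.8]]
tree_mask = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 1],
             [0, 0, 0, 1]]
sky_mask = [[0, 0, 1, 1],
            [0, 0, 1, 1],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]

label, score = label_unit([fmap], {"tree": [tree_mask], "sky": [sky_mask]}, 0.5)
print(label, score)  # tree 0.75
```

In a real setting the feature maps would come from a generator layer and the masks from a segmentation network applied to the generated images; a tensor library would replace the nested loops.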

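Experimental Results

Research Questions

RQ1: Do GANs develop internal units in their feature maps that explicitly represent object concepts (e.g., trees, tables)?
RQ2: How large is the causal effect of specific groups of units on the presence or absence of objects in generated images?
RQ3: How does the context of surrounding objects affect the success rate of inserting or removing object concepts through unit interventions?
RQ4: Can interventions diagnose and guide improvements by locating artifact-causing units and enabling targeted ablation?
RQ5: How do architectural choices and dataset variations affect the emergence of interpretable units?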

Key Findings

Condition	FID	Human preference
Original images	43.16	-
Artifact units removed (ours)	27.14	72.4%
Random units removed	43.17	49.9%

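Some units correspond to explicit object concepts; for example, one layer4 unit localizes trees in LSUN outdoor scenes with an IoU of up to 0.34.
Middle layers (4–7) increasingly encode semantic objects and parts, later layers (10+) match low-level features, and early layers remain entangled.
Minibatch stddev and pixelwise normalization increase the number of units matching semantic classes by 19% to more than 40%.
Removing artifact-causing units substantially improves image quality: FID drops from 43.16 to 27.14, and human raters prefer the artifact-removed images 72.4% of the time (versus 49.9% for random ablation).
A small, carefully chosen set of units (e.g., 20 units) is enough to remove common objects such as people, curtains, and windows from conference room scenes, and context affects how easy the removal is.
Insertion experiments show that doors can be added at plausible locations, but context often vetoes the insertion (e.g., a door cannot be placed in the sky or on trees).
The framework facilitates comparison across models and datasets, and offers a path to debugging and improving GANs by locating interpretable units.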
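
The ablation and insertion findings above rest on the average causal effect (ACE) of a set of units. The sketch below computes it as the mean concept coverage when the units are forced to a constant minus the mean coverage when they are zeroed. This is a toy under stated assumptions: `toy_generate_features` and `toy_concept_fraction` are hypothetical stand-ins for a real generator layer and a real "render, then segment" pipeline, and the intervention is applied at every location rather than at a selected region P.

```python
import random


def toy_generate_features(z, num_units=4, num_pixels=16):
    """Hypothetical stand-in for a generator layer: one activation row per unit."""
    rng = random.Random(z)
    return [[rng.random() for _ in range(num_pixels)] for _ in range(num_units)]


def toy_concept_fraction(features, weights, cutoff=1.0):
    """Hypothetical stand-in for rendering and segmenting the image.

    A pixel counts as the concept when its weighted activation sum
    exceeds the cutoff; the return value is the fraction of such pixels.
    """
    num_pixels = len(features[0])
    hits = 0
    for p in range(num_pixels):
        score = sum(w * unit[p] for w, unit in zip(weights, features))
        hits += score > cutoff
    return hits / num_pixels


def intervene(features, units, value):
    """Force the chosen units to a constant at every location (P = all pixels)."""
    return [[value] * len(row) if u in units else list(row)
            for u, row in enumerate(features)]


def average_causal_effect(units, weights, seeds, insert_value=1.0):
    """Mean concept coverage with units inserted minus coverage with units ablated."""
    total = 0.0
    for z in seeds:
        feats = toy_generate_features(z, num_units=len(weights))
        inserted = toy_concept_fraction(intervene(feats, units, insert_value), weights)
        ablated = toy_concept_fraction(intervene(feats, units, 0.0), weights)
        total += inserted - ablated
    return total / len(seeds)


# In this toy, only units 0 and 1 drive the concept.
weights = [1.5, 1.5, 0.0, 0.0]
ace_causal = average_causal_effect({0, 1}, weights, seeds=range(50))
ace_irrelevant = average_causal_effect({2, 3}, weights, seeds=range(50))
print(ace_causal, ace_irrelevant)  # 1.0 0.0
```

Units with a large ACE are the ones whose ablation removes an object and whose insertion adds it. Units with an ACE near zero, like units 2 and 3 here, have no measurable effect on the concept.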

This review was generated by AI and reviewed by a human editor.