QUICK REVIEW

[Paper Review] StyleBank: An Explicit Representation for Neural Image Style Transfer

Dongdong Chen, Lu Yuan | arXiv (Cornell University) | Mar 27, 2017

Generative Adversarial Networks and Image Synthesis | References: 36 | Cited by: 74

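ONE-SENTENCE SUMMARY

StyleBank introduces an explicit style representation built from multiple convolution filter banks, enabling scalable, incremental, and region-specific neural style transfer through a shared auto-encoder.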

ABSTRACT

We propose StyleBank, which is composed of multiple convolution filter banks and each filter bank explicitly represents one style, for neural image style transfer. To transfer an image to a specific style, the corresponding filter bank is operated on top of the intermediate feature embedding produced by a single auto-encoder. The StyleBank and the auto-encoder are jointly learnt, where the learning is conducted in such a way that the auto-encoder does not encode any style information thanks to the flexibility introduced by the explicit filter bank representation. It also enables us to conduct incremental learning to add a new image style by learning a new filter bank while holding the auto-encoder fixed. The explicit style representation along with the flexible network design enables us to fuse styles at not only the image level, but also the region level. Our method is the first style transfer network that links back to traditional texton mapping methods, and hence provides new understanding on neural style transfer. Our method is easy to train, runs in real-time, and produces results that are qualitatively better than or at least comparable to existing methods.

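MOTIVATION AND OBJECTIVES

Decouple content and style in neural style transfer so that a single model can handle multiple styles.
Introduce an explicit style representation by learning style-specific filter banks (StyleBank).
Enable incremental learning, so new styles can be added without retraining the auto-encoder.
Support region-specific transfer and style fusion for flexible stylization.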

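PROPOSED METHOD

Use a shared image auto-encoder (encoder E and decoder D) to map content into a feature space.
Introduce StyleBank K, a set of filter banks in which each bank represents one style; the chosen bank is convolved with the intermediate feature F to produce stylized features.
Train two branches, the auto-encoder branch (I -> E -> D) and the stylizing branch (I -> E -> K -> D), each with its own loss.
The losses are an identity loss L_I for the auto-encoder and a perceptual loss L_K composed of a content loss L_c, a style loss L_s, and a total variation loss L_tv, computed with a pre-trained VGG-16.
Alternate training between the two branches to balance content fidelity against stylization.
Support incremental learning by fixing E and D and training only a new style filter bank K_i; support both linear and region-based style fusion.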
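
The two branches described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the encoder and decoder are each reduced to a single random convolution layer, the channel counts and 3x3 kernel size are arbitrary, the style names are hypothetical, and L_I is assumed to be a mean squared error. The perceptual loss L_K is omitted because it needs a pre-trained VGG-16.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_same(x, w):
    """Naive 'same' 2-D convolution (cross-correlation, zero padding).

    x: feature map of shape (C_in, H, W)
    w: filters of shape (C_out, C_in, k, k) with odd k
    returns: array of shape (C_out, H, W)
    """
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

C_IMG, C_FEAT, K = 3, 8, 3  # illustrative sizes, not taken from the paper

# Toy stand-ins for the shared auto-encoder (hypothetical one-layer E and D).
W_enc = rng.normal(scale=0.1, size=(C_FEAT, C_IMG, K, K))
W_dec = rng.normal(scale=0.1, size=(C_IMG, C_FEAT, K, K))

def encode(image):
    return np.maximum(conv2d_same(image, W_enc), 0.0)  # conv + ReLU

def decode(features):
    return conv2d_same(features, W_dec)

# StyleBank: one filter bank K_i per style, mapping features to features.
style_bank = {
    "style_a": rng.normal(scale=0.1, size=(C_FEAT, C_FEAT, K, K)),
    "style_b": rng.normal(scale=0.1, size=(C_FEAT, C_FEAT, K, K)),
}

def autoencoder_branch(image):
    """I -> E -> D: trained to reconstruct the input."""
    return decode(encode(image))

def stylizing_branch(image, style_id):
    """I -> E -> K_i -> D: convolve F with the selected filter bank."""
    return decode(conv2d_same(encode(image), style_bank[style_id]))

def identity_loss(image, reconstruction):
    """L_I, assumed here to be a mean squared error."""
    return float(np.mean((reconstruction - image) ** 2))

image = rng.random((C_IMG, 16, 16))
recon = autoencoder_branch(image)
stylized = stylizing_branch(image, "style_a")
print(recon.shape, stylized.shape, identity_loss(image, recon))
```

Because the style lives entirely in `style_bank`, switching styles only changes which bank is convolved with the features; `encode` and `decode` are shared by every style.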

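EXPERIMENTAL RESULTS

Research Questions

RQ1: How can style be encoded explicitly so that content and style are decoupled in neural style transfer?
RQ2: Can a single network learn multiple styles simultaneously and support adding new styles incrementally?
RQ3: Can an explicit style representation enable region-specific style transfer?
RQ4: How well do linear and region-based style fusion work in StyleBank, and by what mechanism?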

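Key Findings

StyleBank represents each style with a convolution filter bank; different channels of the bank correspond to style elements (texton-like primitives).
The auto-encoder learns a style-agnostic content representation, so a single network achieves decoupled multi-style learning.
Incremental training adds a new style by updating only the new filter bank, which is much faster than retraining the whole network (about 8 minutes per new style in the Titan X setup).
Region-specific style transfer and linear style fusion follow naturally from the explicit StyleBank representation and feature-space decomposition.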
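
The two fusion modes mentioned in the findings can be illustrated with a small NumPy sketch. It rests on assumptions that should be read as such: linear fusion is modelled as convolving the features with a weighted sum of filter banks (weights normalised to sum to 1), and region-specific fusion as masking the features per region, convolving each masked copy with its own bank, and summing. The random features, bank sizes, and left/right masks are hypothetical placeholders for a real encoder output and real region masks.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_same(x, w):
    """Naive 'same' cross-correlation: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx],
                             xp[:, dy:dy + h, dx:dx + wd])
    return out

C, H, W, K = 4, 8, 8, 3                      # illustrative sizes only
features = rng.random((C, H, W))             # stand-in for F = E(I)
bank_a = rng.normal(size=(C, C, K, K))       # hypothetical filter bank K_a
bank_b = rng.normal(size=(C, C, K, K))       # hypothetical filter bank K_b

def linear_fusion(features, banks, weights):
    """Blend whole styles: convolve F with the weighted sum of the banks."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()        # normalise so the weights sum to 1
    fused_bank = sum(wt * bank for wt, bank in zip(weights, banks))
    return conv2d_same(features, fused_bank)

def region_fusion(features, banks, masks):
    """Per-region styles: sum over i of conv(M_i * F, K_i)."""
    out = np.zeros_like(features)
    for bank, mask in zip(banks, masks):
        out += conv2d_same(features * mask, bank)  # mask broadcasts over channels
    return out

# Left half of the feature map gets style a, right half gets style b.
mask_left = np.zeros((H, W))
mask_left[:, : W // 2] = 1.0
mask_right = 1.0 - mask_left

blended = linear_fusion(features, [bank_a, bank_b], [0.5, 0.5])
regional = region_fusion(features, [bank_a, bank_b], [mask_left, mask_right])
print(blended.shape, regional.shape)
```

Since convolution is linear, blending the banks gives the same stylized features as blending the two single-style outputs, which is one way to see why an explicit filter-bank representation makes fusion straightforward. The stylized features would then be passed through the shared decoder D.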


This review was generated by AI and reviewed by a human editor.