QUICK REVIEW

[Paper Review] StyleBank: An Explicit Representation for Neural Image Style Transfer

Dongdong Chen, Lu Yuan|arXiv (Cornell University)|Mar 27, 2017

Generative Adversarial Networks and Image Synthesis | References: 36 | Citations: 74

One-Sentence Summary

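StyleBank introduces an explicit style representation built from multiple convolution filter banks, enabling scalable, incremental, and region-specific neural style transfer through a shared auto-encoder.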

ABSTRACT

We propose StyleBank, which is composed of multiple convolution filter banks and each filter bank explicitly represents one style, for neural image style transfer. To transfer an image to a specific style, the corresponding filter bank is operated on top of the intermediate feature embedding produced by a single auto-encoder. The StyleBank and the auto-encoder are jointly learnt, where the learning is conducted in such a way that the auto-encoder does not encode any style information thanks to the flexibility introduced by the explicit filter bank representation. It also enables us to conduct incremental learning to add a new image style by learning a new filter bank while holding the auto-encoder fixed. The explicit style representation along with the flexible network design enables us to fuse styles at not only the image level, but also the region level. Our method is the first style transfer network that links back to traditional texton mapping methods, and hence provides new understanding on neural style transfer. Our method is easy to train, runs in real-time, and produces results that are qualitatively better or at least comparable to existing methods.

Motivation and Objectives

Decouple content and style in neural style transfer to enable multiple styles within a single model.
Introduce an explicit style representation by learning style-specific filter banks (StyleBank).
Enable incremental learning to add new styles without retraining the auto-encoder.
Allow region-specific and style-fusion transfers for flexible stylization.

Proposed Method

Use a shared image auto-encoder (encoder E and decoder D) to map content to a feature space.
Introduce StyleBank K consisting of multiple filter banks, each representing a style, applied to intermediate features F via convolution to obtain stylized features.
Train with two branches: auto-encoder branch (I -> E -> D) and stylizing branch (I -> E -> K -> D) using separate losses.
Losses include identity loss L_I for the auto-encoder and perceptual loss L_K composed of content loss L_c, style loss L_s, and total variation loss L_tv, computed with a pre-trained VGG-16.
Adopt a two-branch alternating training strategy to balance learning between content fidelity and stylization.
Support incremental learning by fixing E and D and training new style filter banks K_i; enable linear and region-based style fusion.
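
The data flow of the two branches can be illustrated with a toy sketch in plain Python. Everything below is an assumption made for illustration, not the authors' implementation: `encode` and `decode` are fixed stand-ins for the learned E and D, each filter bank is reduced to a 1x1 convolution (a channel-mixing matrix), and `training_schedule` with its `stylizing_steps_per_round` parameter is a hypothetical way to express the alternation, since the summary does not state the ratio between the two branches.

```python
# Toy sketch of the two StyleBank branches (illustrative stand-ins only).

def encode(image):
    """Hypothetical encoder E: map each pixel value p to a 2-channel feature."""
    return [[(p, 1.0 - p) for p in row] for row in image]

def decode(features):
    """Hypothetical decoder D: invert encode() by reading channel 0."""
    return [[pixel[0] for pixel in row] for row in features]

def apply_bank(bank, features):
    """Apply a 1x1 filter bank K_i: per-pixel matrix-vector product over channels."""
    return [[tuple(sum(w * c for w, c in zip(out_row, pixel)) for out_row in bank)
             for pixel in row] for row in features]

def autoencoder_branch(image):
    """I -> E -> D: should reproduce the input."""
    return decode(encode(image))

def stylizing_branch(image, bank):
    """I -> E -> K -> D: the filter bank acts on the intermediate features."""
    return decode(apply_bank(bank, encode(image)))

def identity_loss(image, reconstruction):
    """L_I as a mean squared error between input and reconstruction."""
    count = sum(len(row) for row in image)
    total = sum((a - b) ** 2
                for row_a, row_b in zip(image, reconstruction)
                for a, b in zip(row_a, row_b))
    return total / count

def training_schedule(total_steps, stylizing_steps_per_round):
    """Yield the branch to update: several stylizing steps, then one auto-encoder step."""
    for step in range(total_steps):
        if step % (stylizing_steps_per_round + 1) == stylizing_steps_per_round:
            yield "auto-encoder"
        else:
            yield "stylizing"

# One bank per style; adding a style means adding an entry, E and D stay untouched.
style_bank = {"swap_channels": [[0.0, 1.0], [1.0, 0.0]]}

image = [[0.0, 0.25], [0.5, 1.0]]
print(autoencoder_branch(image))
print(stylizing_branch(image, style_bank["swap_channels"]))
print(list(training_schedule(6, 2)))
```

In this toy setting the auto-encoder branch returns the input unchanged, while the stylizing branch changes the output only through the selected bank, which is the separation the method relies on.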

Experimental Results

Research Questions

RQ1
RQ2: How can styles be encoded explicitly to decouple content and style in neural style transfer?
RQ3: Can a single network learn multiple styles simultaneously and support incremental addition of new styles?
RQ4: Can region-specific style transfer be achieved by leveraging an explicit style representation?
RQ5: What are the effects and mechanisms of linear and region-based style fusion in StyleBank?

Key Findings

StyleBank represents each style with a convolution filter bank; different channels in a bank correspond to style elements (texton-like bases).
The auto-encoder learns content representation independent of style, enabling decoupled, multi-style learning within one network.
Incremental training adds a new style by updating only its new filter bank, which is significantly faster than retraining the whole network (about 8 minutes per new style on a Titan X setup).
Region-specific style transfer and linear fusion of styles are naturally supported by the explicit StyleBank representation and feature-space decomposition.
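
The two fusion modes can be sketched in the same toy setting. This is a minimal illustration under stated assumptions, not the paper's code: banks are again reduced to 1x1 channel-mixing matrices, the fusion weights are assumed to sum to 1, the region masks are assumed to be binary and non-overlapping, and the function names `linear_fusion` and `region_fusion` are hypothetical.

```python
# Toy sketch of style fusion in feature space (illustrative assumptions only).

def apply_bank(bank, features):
    """Apply a 1x1 filter bank: per-pixel matrix-vector product over channels."""
    return [[tuple(sum(w * c for w, c in zip(out_row, pixel)) for out_row in bank)
             for pixel in row] for row in features]

def weighted_sum(feature_maps, weights):
    """Pixel-wise, channel-wise weighted sum of equally shaped feature maps."""
    first = feature_maps[0]
    return [[tuple(sum(wt * fm[y][x][c] for fm, wt in zip(feature_maps, weights))
                   for c in range(len(first[y][x])))
             for x in range(len(first[y]))]
            for y in range(len(first))]

def linear_fusion(banks, weights, features):
    """Blend styles over the whole image: sum_i w_i * (K_i applied to F)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("fusion weights are assumed to sum to 1")
    return weighted_sum([apply_bank(bank, features) for bank in banks], weights)

def region_fusion(banks, masks, features):
    """Give each region its own style: sum_i (K_i applied to M_i * F)."""
    stylized = []
    for bank, mask in zip(banks, masks):
        masked = [[tuple(m * c for c in pixel) for pixel, m in zip(row, mask_row)]
                  for row, mask_row in zip(features, mask)]
        stylized.append(apply_bank(bank, masked))
    return weighted_sum(stylized, [1.0] * len(stylized))

# A 1x2 feature map with two channels, and two hypothetical style banks.
features = [[(1.0, 0.0), (0.0, 1.0)]]
swap = [[0.0, 1.0], [1.0, 0.0]]
double = [[2.0, 0.0], [0.0, 2.0]]

print(linear_fusion([swap, double], [0.5, 0.5], features))
print(region_fusion([swap, double], [[[1, 0]], [[0, 1]]], features))
```

Because each style lives in its own bank, both modes reduce to combining per-bank outputs in feature space before decoding; the shared encoder and decoder need no change.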

This review was generated by AI and checked by a human editor.