QUICK REVIEW

[Paper Review] Style Augmentation: Data Augmentation via Style Randomization

Philip T. Jackson, Amir Atapour-Abarghouei | arXiv (Cornell University) | Sep 14, 2018

Topic Modeling | References: 36 | Citations: 91

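One-sentence summary

Introduces style augmentation, which randomizes texture, color, and contrast using a fast random style transfer pipeline; it improves robustness to domain shift and complements traditional data augmentation.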

ABSTRACT

We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content. This is accomplished by adapting an arbitrary style transfer network to perform style randomization, by sampling input style embeddings from a multivariate normal distribution instead of inferring them from a style image. In addition to standard classification experiments, we investigate the effect of style augmentation (and data augmentation generally) on domain transfer tasks. We find that data augmentation significantly improves robustness to domain shift, and can be used as a simple, domain agnostic alternative to domain adaptation. Comparing style augmentation against a mix of seven traditional augmentation techniques, we find that it can be readily combined with them to improve network performance. We validate the efficacy of our technique with domain transfer experiments in classification and monocular depth estimation, illustrating consistent improvements in generalization.

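Motivation and objectives

Motivated by domain bias and overfitting in CNNs, introduce a fast, domain-agnostic data augmentation technique.
Propose style augmentation, which randomizes texture, color, and contrast while preserving shape and semantic content.
Evaluate the effect of style augmentation on image classification, cross-domain classification, and monocular depth estimation.
Release an open-source PyTorch implementation for practitioners.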

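Proposed method

Use a real-time neural artistic stylization network (Ghiasi et al., 2017) to transform input images.
Replace the style predictor network with sampling of a 100-dimensional style embedding z from a multivariate normal distribution fitted to Painter By Numbers embeddings.
Apply conditional instance normalization, conditioned on z, to modulate feature maps during style transfer.
Introduce an interpolation between the random style embedding and the input image's own style embedding, controlled by α, to tune augmentation strength.
Evaluate style augmentation alone and in combination with traditional augmentation across multiple tasks and architectures.
Provide a hyperparameter study for choosing the augmentation ratio and strength.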
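
The sampling, interpolation, and normalization steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' PyTorch implementation: the style bank is random stand-in data rather than real Painter By Numbers embeddings, the blend is assumed to take the form z = α·(random embedding) + (1 − α)·(input embedding), and the linear maps from z to per-channel scale and shift (`w_gamma`, `w_beta`) are hypothetical names introduced only for this example.

```python
import numpy as np

EMBED_DIM = 100  # dimensionality of the style embedding z, as stated above


def fit_style_distribution(embeddings):
    """Fit a mean and covariance to a (num_styles, EMBED_DIM) array of embeddings."""
    mean = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    return mean, cov


def sample_style_embedding(mean, cov, input_embedding, alpha, rng):
    """Blend a randomly sampled style with the input image's own style embedding.

    Assumed convention: alpha = 0 keeps the input's style unchanged,
    alpha = 1 uses a fully random style.
    """
    random_embedding = rng.multivariate_normal(mean, cov)
    return alpha * random_embedding + (1.0 - alpha) * input_embedding


def conditional_instance_norm(features, gamma, beta, eps=1e-5):
    """Normalize each channel of a (C, H, W) feature map, then scale and shift.

    gamma and beta have shape (C,) and are assumed to be predicted from z.
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return gamma[:, None, None] * normalized + beta[:, None, None]


rng = np.random.default_rng(0)

# Stand-in data: random vectors in place of real style embeddings.
style_bank = rng.normal(size=(500, EMBED_DIM))
mean, cov = fit_style_distribution(style_bank)

# Placeholder for the style predictor's output on the input image.
input_embedding = rng.normal(size=EMBED_DIM)
z = sample_style_embedding(mean, cov, input_embedding, alpha=0.5, rng=rng)

# Hypothetical linear maps from z to per-channel scale and shift.
channels = 8
w_gamma = rng.normal(size=(channels, EMBED_DIM)) * 0.01
w_beta = rng.normal(size=(channels, EMBED_DIM)) * 0.01
features = rng.normal(size=(channels, 16, 16))
styled = conditional_instance_norm(features, 1.0 + w_gamma @ z, w_beta @ z)
```

Under this convention, lowering `alpha` keeps the stylized output closer to the input's original appearance, which is one way to read "augmentation strength" in the list above.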

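Experimental results

Research questions

RQ1: Does style augmentation improve generalization to unseen domains without any target-domain data?
RQ2: Can randomized style transfer serve as a practical, domain-agnostic data augmentation strategy across tasks (classification, domain transfer, depth estimation)?
RQ3: How does style augmentation interact with traditional augmentation to improve performance?
RQ4: What are the practical training costs and implementation considerations of style augmentation?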

Key findings

Accuracy by augmentation approach (None = no augmentation, Trad = traditional only, Style = style only, Both = traditional plus style):

Task	Model	None	Trad	Style	Both
AW→D	InceptionV3	0.789	0.890	0.882	0.952
AW→D	ResNet18	0.399	0.704	0.495	0.873
AW→D	ResNet50	0.488	0.778	0.614	0.922
AW→D	VGG16	0.558	0.830	0.551	0.870
DW→A	InceptionV3	0.183	0.160	0.254	0.286
DW→A	ResNet18	0.113	0.128	0.147	0.229
DW→A	ResNet50	0.130	0.156	0.170	0.244
DW→A	VGG16	0.086	0.149	0.111	0.243
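
The table above can be summarized with a short Python check. The numbers are copied from the table as accuracies under None, Trad, Style, and Both; the variable names are illustrative only.

```python
# Accuracy per (task, model), in the order: None, Trad, Style, Both.
results = {
    ("AW→D", "InceptionV3"): (0.789, 0.890, 0.882, 0.952),
    ("AW→D", "ResNet18"): (0.399, 0.704, 0.495, 0.873),
    ("AW→D", "ResNet50"): (0.488, 0.778, 0.614, 0.922),
    ("AW→D", "VGG16"): (0.558, 0.830, 0.551, 0.870),
    ("DW→A", "InceptionV3"): (0.183, 0.160, 0.254, 0.286),
    ("DW→A", "ResNet18"): (0.113, 0.128, 0.147, 0.229),
    ("DW→A", "ResNet50"): (0.130, 0.156, 0.170, 0.244),
    ("DW→A", "VGG16"): (0.086, 0.149, 0.111, 0.243),
}

# Is the combined approach the best in every row?
both_is_best = all(max(row) == row[3] for row in results.values())

# Rows where style augmentation alone beats traditional augmentation alone.
style_beats_trad = [
    key for key, (_, trad, style, _) in results.items() if style > trad
]

# Average absolute gain of the combined approach over no augmentation.
mean_gain = sum(both - none for none, _, _, both in results.values()) / len(results)

print(both_is_best)
print(style_beats_trad)
print(round(mean_gain, 3))
```

On these numbers, Both is the best column in all eight rows, while Style alone exceeds Trad alone in three rows, all on DW→A (InceptionV3, ResNet18, ResNet50).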

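Style augmentation substantially improves accuracy on domain transfer benchmarks across multiple architectures.
On STL-10, style augmentation alone improves convergence and final accuracy, and combining it with seven traditional augmentations improves accuracy by 8.5%.
On Office cross-domain classification, style augmentation often outperforms or complements traditional augmentation, and combining the two gives the highest final accuracy.
In monocular depth estimation, models trained with style augmentation generalize better to real-world data than models trained with traditional augmentation alone.
Style augmentation generally lowers performance on ImageNet, consistent with texture-bias findings that removing texture cues on large-scale datasets reduces accuracy.
The method can be added with relatively little training-time overhead (for example, roughly a 6% increase on STL-10 at the optimal augmentation ratio).
The method is complementary to existing augmentation techniques and serves as a simple, domain-agnostic strategy for mitigating domain bias.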

This review was generated by AI and checked by a human editor.