QUICK REVIEW

[Paper Review] A Comprehensive Survey of Image Augmentation Techniques for Deep Learning

Mingle Xu, Sook Yoon | arXiv (Cornell University) | May 3, 2022

Domain Adaptation and Few-Shot Learning | Cited by 28

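One-Line Summary

A broad taxonomic survey of image augmentation methods for deep learning that covers model-free, model-based, and optimizing policy-based approaches, analyzes the underlying challenges and the vicinity distribution, and guides method selection and future research.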

ABSTRACT

Deep learning has been achieving decent performance in computer vision requiring a large volume of images, however, collecting images is expensive and difficult in many scenarios. To alleviate this issue, many image augmentation algorithms have been proposed as effective and efficient strategies. Understanding current algorithms is essential to find suitable methods or develop novel techniques for given tasks. In this paper, we perform a comprehensive survey on image augmentation for deep learning with a novel informative taxonomy. To get the basic idea why we need image augmentation, we introduce the challenges in computer vision tasks and vicinity distribution. Then, the algorithms are split into three categories; model-free, model-based, and optimizing policy-based. The model-free category employs image processing methods while the model-based method leverages trainable image generation models. In contrast, the optimizing policy-based approach aims to find the optimal operations or their combinations. Furthermore, we discuss the current trend of common applications with two more active topics, leveraging different ways to understand image augmentation, such as group and kernel theory, and deploying image augmentation for unsupervised learning. Based on the analysis, we believe that our survey gives a better understanding helpful to choose suitable methods or design novel algorithms for practical applications.

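Motivation and Objectives

Explain why image augmentation is needed for deep learning on vision tasks.
Present a new, informative taxonomy of image augmentation methods.
Analyze the challenges in computer vision and the concept of the vicinity distribution to justify the need for augmentation.
Survey model-free, model-based, and optimizing policy-based augmentation approaches.
Discuss future directions and related topics, such as understanding augmentation, new strategies, and feature augmentation.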

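Proposed Method

Classify augmentation methods into three main categories: model-free, model-based, and optimizing policy-based.
Subdivide model-free augmentation into single-image and multiple-image (instance-level and non-instance-level) augmentation, with detailed example methods.
Divide model-based augmentation into unconditional, label-conditional, and image-conditional (label-preserving and label-changing) augmentation, with examples drawn from representative GAN-based and style/translation methods.
Describe optimizing policy-based augmentation, which uses reinforcement learning and adversarial learning to search automatically for the best augmentation strategy.
Introduce the concept of the vicinity distribution, which explains how augmentation expands the data manifold and improves generalization.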
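
The model-free sub-categories above can be made concrete with a small sketch. It is illustrative only and not taken from the survey: images are modeled as nested lists of grayscale values, and the helper names (`horizontal_flip`, `random_erase`, `mixup`), the toy images, and the Beta(0.4, 0.4) mixing weight are all assumptions chosen for the example. The first two helpers are single-image augmentations; the third is a mixup-style multiple-image blend that uses no instance masks.

```python
import random

Image = list[list[float]]  # grayscale image as rows of pixel values (illustrative choice)


def horizontal_flip(img: Image) -> Image:
    """Single-image augmentation: mirror each row left to right."""
    return [row[::-1] for row in img]


def random_erase(img: Image, size: int, rng: random.Random, fill: float = 0.0) -> Image:
    """Single-image augmentation: overwrite one random size x size patch with `fill`."""
    height, width = len(img), len(img[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    out = [row[:] for row in img]  # copy so the input image is left untouched
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = fill
    return out


def mixup(img_a: Image, label_a: list[float], img_b: Image, label_b: list[float], lam: float):
    """Multiple-image augmentation: blend two images and their one-hot labels by `lam`."""
    mixed_img = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    mixed_label = [lam * la + (1.0 - lam) * lb for la, lb in zip(label_a, label_b)]
    return mixed_img, mixed_label


rng = random.Random(0)  # fixed seed so the sketch is repeatable
cat = [[0.2, 0.4, 0.6, 0.8] for _ in range(4)]  # toy 4x4 "image" of class 0
dog = [[1.0, 1.0, 1.0, 1.0] for _ in range(4)]  # toy 4x4 "image" of class 1

flipped = horizontal_flip(cat)
erased = random_erase(cat, size=2, rng=rng)
lam = rng.betavariate(0.4, 0.4)  # mixing weight; Beta(0.4, 0.4) is an assumed setting
mixed_img, mixed_label = mixup(cat, [1.0, 0.0], dog, [0.0, 1.0], lam)
```

The single-image helpers keep the original label, whereas the blend returns a soft label, which is one practical difference between the two sub-categories. A real pipeline would more likely use array libraries and the transforms shipped with a deep learning framework than nested lists.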

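Experimental Results

Research Questions

RQ1: What categories of image augmentation methods exist for deep learning on vision tasks?
RQ2: How do model-free, model-based, and policy-based augmentation differ in purpose and method?
RQ3: What are the main computer vision challenges that augmentation aims to address, and how does the vicinity distribution relate to augmentation?
RQ4: What guidelines apply when selecting or designing an augmentation method for a specific task?
RQ5: What are the future directions and related topics in image augmentation research?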

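Key Findings

The survey presents a comprehensive taxonomy covering a broad range of augmentation algorithms beyond conventional methods.
It distinguishes model-free, model-based, and optimizing policy-based approaches, and details sub-categories such as single-image, multiple-image, instance-level, and non-instance-level augmentation.
It incorporates recent methods, including instance-level mixing, and offers insight into label-conditional and label-changing image generation for augmentation.
It explains the role of the vicinity distribution in understanding the effectiveness of augmentation and its effect on generalization.
It discusses practical considerations for choosing an augmentation strategy and outlines future directions in understanding augmentation and in feature augmentation.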
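
The role of the vicinity distribution noted above can be written out with the standard vicinal risk minimization formulation. This is a sketch in common notation, not necessarily the survey's own: the symbols $\nu$, $\ell$, $f$, $n$, and $m$ are assumed here for illustration.

```latex
% Vicinity distribution built around the n training pairs (x_i, y_i)
P_{\nu}(\tilde{x}, \tilde{y}) = \frac{1}{n} \sum_{i=1}^{n} \nu(\tilde{x}, \tilde{y} \mid x_i, y_i)

% Vicinal risk: average loss over m augmented samples drawn from P_nu
R_{\nu}(f) = \frac{1}{m} \sum_{j=1}^{m} \ell\big(f(\tilde{x}_j), \tilde{y}_j\big),
\qquad (\tilde{x}_j, \tilde{y}_j) \sim P_{\nu}
```

Under this reading, choosing an augmentation amounts to choosing $\nu$: a label-preserving transform places probability on altered copies of $x_i$ that keep the label $y_i$, so the model is trained on a neighborhood of each sample rather than on the sample alone.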

This review was generated by AI and checked by a human editor.