QUICK REVIEW

[Paper Review] Image Data Augmentation Approaches: A Comprehensive Survey and Future directions

Teerath Kumar, Alessandra Mileo|arXiv (Cornell University)|Jan 7, 2023

Advanced Neural Network Applications | Citations: 15

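One-line summary

This survey presents a comprehensive taxonomy of image data augmentation techniques, evaluates their effect on image classification, object detection, and semantic segmentation, and compiles the available code for the techniques to support reproducibility.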

ABSTRACT

Deep learning (DL) algorithms have shown significant performance in various computer vision tasks. However, having limited labelled data leads to a network overfitting problem, where network performance is worse on unseen data than on training data. Consequently, it limits performance improvement. To cope with this problem, various techniques have been proposed, such as dropout, normalization and advanced data augmentation. Among these, data augmentation, which aims to enlarge the dataset size by including sample diversity, has been a hot topic in recent times. In this article, we focus on advanced data augmentation techniques. We provide a background of data augmentation, a novel and comprehensive taxonomy of reviewed data augmentation techniques, and the strengths and weaknesses (wherever possible) of each technique. We also provide comprehensive results of the data augmentation effect on three popular computer vision tasks: image classification, object detection and semantic segmentation. For results reproducibility, we compiled available codes of all data augmentation techniques. Finally, we discuss the challenges and difficulties, and possible future directions for the research community. We believe this survey provides several benefits: i) readers will understand the data augmentation working mechanism to fix overfitting problems; ii) results will save the searching time of the researcher for comparison purposes; iii) codes of the mentioned data augmentation techniques are available at https://github.com/kmr2017/Advanced-Data-augmentation-codes; iv) future work will spark interest in the research community.

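Motivation and objectives

Explain why data augmentation mitigates overfitting in CV models.
Propose a comprehensive taxonomy that distinguishes basic from advanced augmentation techniques.
Survey recent augmentation methods and their effect on CV tasks.
Provide reproducible code for the augmentation techniques evaluated.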

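Proposed method

Proposes a two-branch taxonomy: Basic and Advanced image data augmentation.
Organizes and explains geometric, non-geometric, and erasing augmentations, with examples.
Classifies advanced augmentations into image mixing, semi-supervised approaches, and other innovations.
Organizes and provides code so that the augmentation techniques can be reproduced.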
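
To make the three Basic categories concrete, here is a minimal NumPy sketch with one toy example per category. The function names, the patch size, and the random toy image are illustrative assumptions, not code from the survey or its repository.

```python
import numpy as np

def horizontal_flip(image):
    """Geometric augmentation: mirror an H x W x C image left to right."""
    return image[:, ::-1, :].copy()

def adjust_brightness(image, factor):
    """Non-geometric augmentation: scale intensities and clip to [0, 1]."""
    return np.clip(image * factor, 0.0, 1.0)

def random_erase(image, rng, patch_size=8, fill_value=0.0):
    """Erasing augmentation: overwrite one random square patch with fill_value."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - patch_size + 1)
    left = rng.integers(0, width - patch_size + 1)
    erased = image.copy()
    erased[top:top + patch_size, left:left + patch_size, :] = fill_value
    return erased

# Toy data (assumption): a random 32 x 32 RGB image with values in [0, 1).
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))

flipped = horizontal_flip(image)
brighter = adjust_brightness(image, 1.5)
erased = random_erase(image, rng)
```

Each function returns a new array and leaves the input untouched, so the transforms can be chained or applied independently during training.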

Figure 1: Overfitting problem: On the left side, overfitting is explained in terms of accuracy, after the inflation point (red dotted line), the training accuracy is increasing but validation accuracy is decreasing. On the right side, alternatively in terms of loss, training loss is decreasing but v

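Experimental results

Research questions

RQ1: What are the current state-of-the-art image data augmentation techniques?
RQ2: How do different augmentation methods affect image classification, object detection, and semantic segmentation?
RQ3: What are the strengths and limitations of each augmentation technique?
RQ4: Can a unified taxonomy facilitate reproducibility and comparison across CV tasks?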

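Key findings

A comprehensive taxonomy of augmentation techniques is proposed and illustrated.
Advanced augmentations include image mixing, saliency-aware methods, and multi-image strategies.
The augmentation techniques are evaluated on image classification, object detection, and semantic segmentation.
Code for the augmentations reviewed is compiled and made available for reproducibility.
The survey discusses challenges and future directions for data augmentation.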
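
Image mixing can be illustrated with two widely known methods, Mixup and CutMix. This review does not name specific methods, so choosing these two is an assumption. The NumPy sketch below uses illustrative function names, a fixed patch location, and random toy images; it is not the survey's code, and it does not cover the saliency-aware variants.

```python
import numpy as np

def mixup(image_a, label_a, image_b, label_b, lam):
    """Blend two images and their one-hot labels with weight lam in [0, 1]."""
    mixed_image = lam * image_a + (1.0 - lam) * image_b
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_image, mixed_label

def cutmix(image_a, label_a, image_b, label_b, top, left, patch_h, patch_w):
    """Paste a patch of image_b into image_a; weight labels by area share.

    Assumes the patch lies fully inside the image bounds.
    """
    height, width = image_a.shape[:2]
    mixed_image = image_a.copy()
    mixed_image[top:top + patch_h, left:left + patch_w] = \
        image_b[top:top + patch_h, left:left + patch_w]
    lam = 1.0 - (patch_h * patch_w) / (height * width)  # area kept from image_a
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_image, mixed_label

# Toy data (assumption): two random 32 x 32 RGB images and two-class labels.
rng = np.random.default_rng(0)
image_a, image_b = rng.random((32, 32, 3)), rng.random((32, 32, 3))
label_a, label_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])

lam = rng.beta(1.0, 1.0)  # mixing weight drawn from Beta(alpha, alpha), alpha = 1
mixup_image, mixup_label = mixup(image_a, label_a, image_b, label_b, lam)
cutmix_image, cutmix_label = cutmix(image_a, label_a, image_b, label_b, 8, 8, 16, 16)
```

In both cases the label is mixed in the same proportion as the pixels, which is what separates image mixing from erasing: the removed region is replaced by content from another labelled sample instead of a constant fill.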

Figure 3: Overview of the geometric data augmentations.

This review was generated by AI and checked by a human editor.