QUICK REVIEW

[Paper Review] Image Data Augmentation Approaches: A Comprehensive Survey and Future directions

Teerath Kumar, Alessandra Mileo|arXiv (Cornell University)|Jan 7, 2023

Advanced Neural Network Applications|Cited by 15

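ONE-SENTENCE SUMMARY

This survey proposes a comprehensive taxonomy of image data augmentation techniques, evaluates their impact on image classification, object detection, and semantic segmentation, and compiles the available code for these techniques to support reproducibility.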

ABSTRACT

Deep learning (DL) algorithms have shown significant performance in various computer vision tasks. However, having limited labelled data leads to a network overfitting problem, where network performance is worse on unseen data than on training data. Consequently, it limits performance improvement. To cope with this problem, various techniques have been proposed, such as dropout, normalization and advanced data augmentation. Among these, data augmentation, which aims to enlarge the dataset size by including sample diversity, has been a hot topic in recent times. In this article, we focus on advanced data augmentation techniques. We provide a background of data augmentation, a novel and comprehensive taxonomy of reviewed data augmentation techniques, and the strengths and weaknesses (wherever possible) of each technique. We also provide comprehensive results of the data augmentation effect on three popular computer vision tasks: image classification, object detection and semantic segmentation. For reproducibility of results, we compiled available codes of all data augmentation techniques. Finally, we discuss the challenges and difficulties, and possible future directions for the research community. We believe this survey provides several benefits: i) readers will understand the data augmentation working mechanism to fix overfitting problems; ii) results will save the searching time of the researcher for comparison purposes; iii) codes of the mentioned data augmentation techniques are available at https://github.com/kmr2017/Advanced-Data-augmentation-codes; iv) future work will spark interest in the research community.

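MOTIVATION AND OBJECTIVES

Explain why data augmentation helps mitigate overfitting in CV models.
Propose a comprehensive taxonomy that distinguishes basic from advanced augmentation techniques.
Survey state-of-the-art augmentation methods and their impact on CV tasks.
Provide reproducible code for the evaluated augmentation techniques.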

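PROPOSED APPROACH

Propose a two-branch taxonomy: basic and advanced image data augmentation.
Catalogue and describe geometric, non-geometric, and erasing augmentations, with examples.
Divide advanced augmentation into image mixing, semi-supervised methods, and other innovative approaches.
Compile and provide code for reproducibility.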
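
As a rough illustration of the three basic families in the taxonomy (geometric, non-geometric, and erasing), the sketch below applies one transform of each kind to a tiny grayscale image stored as nested Python lists. The function names (`horizontal_flip`, `adjust_brightness`, `random_erase`), the nested-list image format, and the [0, 1] pixel range are assumptions made for this example; they are not taken from the survey or its code repository.

```python
import random


def horizontal_flip(img):
    """Geometric: mirror every row left to right."""
    return [row[::-1] for row in img]


def adjust_brightness(img, factor):
    """Non-geometric: scale pixel intensities, clipping to [0, 1]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in img]


def random_erase(img, erase_h, erase_w, rng, fill=0.0):
    """Erasing: overwrite a randomly placed erase_h x erase_w rectangle with `fill`."""
    h, w = len(img), len(img[0])
    if erase_h > h or erase_w > w:
        raise ValueError("erase region must fit inside the image")
    top = rng.randint(0, h - erase_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - erase_w)
    out = [row[:] for row in img]        # copy so the input stays unchanged
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            out[r][c] = fill
    return out


# Hypothetical 3x3 image, used only to exercise the functions.
image = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9]]
flipped = horizontal_flip(image)
brighter = adjust_brightness(image, 2.0)
erased = random_erase(image, 2, 2, random.Random(0))
```

In practice these operations would usually run on array or tensor types from an image library; plain lists are used here only to keep the example free of dependencies.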

Figure 1: Overfitting problem. On the left, overfitting is shown in terms of accuracy: after the inflection point (red dotted line), training accuracy keeps increasing while validation accuracy decreases. On the right, it is shown in terms of loss: training loss is decreasing but v…
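
The turning point described in the Figure 1 caption can be located from per-epoch accuracy curves. The following is a minimal sketch, assuming accuracies are recorded once per epoch. The helper names `overfitting_onset` and `generalization_gap` are hypothetical, and the accuracy values are invented for illustration; they are not results from the survey.

```python
def overfitting_onset(val_acc):
    """Return the epoch index at which validation accuracy peaks (first on ties)."""
    return max(range(len(val_acc)), key=lambda i: val_acc[i])


def generalization_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]


# Invented curves: training keeps improving while validation peaks, then drops.
train_acc = [0.62, 0.75, 0.85, 0.92, 0.97]
val_acc = [0.60, 0.70, 0.75, 0.74, 0.72]

onset = overfitting_onset(val_acc)
gaps = generalization_gap(train_acc, val_acc)
```

A gap that keeps widening after the onset epoch is the pattern the figure describes, and it is the symptom that augmentation is meant to reduce.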

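EXPERIMENTAL RESULTS

Research Questions

RQ1: What are the current state-of-the-art image data augmentation techniques?
RQ2: How do different augmentation methods affect image classification, object detection, and semantic segmentation?
RQ3: What are the strengths and limitations of each augmentation technique?
RQ4: Can a unified taxonomy improve reproducibility and comparability across CV tasks?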

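Key Findings

A comprehensive taxonomy of augmentation techniques is proposed and presented.
Advanced augmentation includes image mixing, saliency-aware methods, and multi-image strategies.
Augmentation techniques are evaluated on image classification, object detection, and semantic segmentation.
Code for the surveyed augmentation techniques is compiled and made available for reproduction.
The survey discusses the challenges and future directions of data augmentation.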
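
Image mixing can be illustrated with mixup, a well-known technique of this kind that forms a convex combination of two images and of their labels, with the mixing weight typically drawn from a Beta(alpha, alpha) distribution. The sketch below is a minimal, dependency-free version. The function names, the nested-list image format, the one-hot label format, and the value `alpha=0.4` are assumptions chosen for this example, not settings reported in the survey.

```python
import random


def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha); the result lies in [0, 1]."""
    return rng.betavariate(alpha, alpha)


def mixup(img_a, label_a, img_b, label_b, lam):
    """Blend two same-sized images and their one-hot labels with weight lam."""
    mixed_img = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    mixed_label = [lam * ya + (1.0 - lam) * yb
                   for ya, yb in zip(label_a, label_b)]
    return mixed_img, mixed_label


# Hypothetical 2x2 images from two different classes.
white = [[1.0, 1.0], [1.0, 1.0]]
black = [[0.0, 0.0], [0.0, 0.0]]

lam = sample_lambda(0.4, random.Random(0))   # alpha=0.4 is an assumed value
mixed_img, mixed_label = mixup(white, [1.0, 0.0], black, [0.0, 1.0], lam)
```

Because the label is mixed with the same weight as the pixels, the training target becomes a soft label instead of a single class.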

Figure 3: Overview of the geometric data augmentations.

This review was generated by AI and reviewed by a human editor.