QUICK REVIEW

[Paper Review] Implicit Semantic Data Augmentation for Deep Networks

Yulin Wang, Xuran Pan|arXiv (Cornell University)|Sep 26, 2019

Multimodal Machine Learning Applications | Cited by 60

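One-Sentence Summary

ISDA implicitly augments training by perturbing deep features along class-conditional semantic directions, using an online covariance estimate to derive a robust cross-entropy loss that improves generalization without generating any additional data.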

ABSTRACT

In this paper, we propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping, translation or rotation. Our work is motivated by the intriguing property that deep networks are surprisingly good at linearizing features, such that certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., adding sunglasses or changing backgrounds. As a consequence, translating training samples along many semantic directions in the feature space can effectively augment the dataset to improve generalization. To implement this idea effectively and efficiently, we first perform an online estimate of the covariance matrix of deep features for each class, which captures the intra-class semantic variations. Then random vectors are drawn from a zero-mean normal distribution with the estimated covariance to augment the training data in that class. Importantly, instead of augmenting the samples explicitly, we can directly minimize an upper bound of the expected cross-entropy (CE) loss on the augmented training set, leading to a highly efficient algorithm. In fact, we show that the proposed ISDA amounts to minimizing a novel robust CE loss, which adds negligible extra computational cost to a normal training procedure. Although being simple, ISDA consistently improves the generalization performance of popular deep models (ResNets and DenseNets) on a variety of datasets, e.g., CIFAR-10, CIFAR-100 and ImageNet. Code for reproducing our results is available at https://github.com/blackfeather-wang/ISDA-for-Deep-Networks.

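Motivation and Objectives

Exploit semantic transformations that go beyond standard augmentation to improve generalization in image classification.
Propose an efficient implicit augmentation mechanism that operates in feature space rather than generating explicit augmented samples.
Develop a closed-form upper-bound loss that can be optimized at scale with existing architectures.
Show that ISDA yields consistent gains on CIFAR-10/100 and ImageNet across multiple architectures.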

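Proposed Method

Estimate the class-conditional feature covariance online during training.
Sample random directions in feature space from N(0, lambda * Sigma_y) and conceptually translate features along them.
Derive a closed-form upper bound on the expected cross-entropy loss under this augmentation, which gives the robust loss \bar{L}_\infty.
Optimize the robust loss with SGD, without explicitly generating augmented samples.
Algorithm 1 details the ISDA steps; the covariance estimation is outlined in the supplementary material.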
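The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the names `OnlineClassCovariance`, `isda_loss`, and `lam` are hypothetical, the per-sample running update is a simplification of the batch-wise estimate the paper describes in its supplementary material, and the loss follows the robust cross-entropy form as we understand it, where the logit of each class j is raised by (lambda / 2) * v^T Sigma_y v with v = w_j - w_y.

```python
import math


class OnlineClassCovariance:
    """Running per-class mean and population covariance of feature vectors.

    Assumption: a per-sample update is used here for brevity; it is not
    taken from the paper, which updates the statistics per mini-batch.
    """

    def __init__(self, num_classes, dim):
        self.count = [0] * num_classes
        self.mean = [[0.0] * dim for _ in range(num_classes)]
        self.cov = [[[0.0] * dim for _ in range(dim)] for _ in range(num_classes)]

    def update(self, feature, label):
        n = self.count[label]
        mean, cov = self.mean[label], self.cov[label]
        # Difference between the old class mean and the new sample.
        diff = [m - x for m, x in zip(mean, feature)]
        dim = len(feature)
        for r in range(dim):
            for c in range(dim):
                cov[r][c] = (n * cov[r][c]) / (n + 1) \
                    + n * diff[r] * diff[c] / (n + 1) ** 2
        self.mean[label] = [(n * m + x) / (n + 1) for m, x in zip(mean, feature)]
        self.count[label] = n + 1


def isda_loss(feature, label, weights, biases, cov_y, lam):
    """Robust cross-entropy for one sample (hypothetical helper).

    weights[j] and biases[j] are the final linear layer's parameters for
    class j; cov_y is the estimated covariance of the sample's own class.
    """
    w_y = weights[label]
    logits = []
    for w_j, b_j in zip(weights, biases):
        v = [a - b for a, b in zip(w_j, w_y)]  # v = w_j - w_y
        sigma_v = [sum(row[k] * v[k] for k in range(len(v))) for row in cov_y]
        quad = sum(vi * si for vi, si in zip(v, sigma_v))  # v^T Sigma_y v
        logit = sum(w * a for w, a in zip(w_j, feature)) + b_j
        logits.append(logit + 0.5 * lam * quad)
    # Numerically stable log-sum-exp; the true-class term has quad == 0.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]
```

No augmented feature is ever sampled in this sketch: the covariance enters only through the quadratic term added to the competing logits, which is why the extra cost over ordinary cross-entropy stays small.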

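Experimental Results

Research Questions

RQ1: Can implicit semantic directions in the deep feature space be used to augment training data meaningfully without explicitly generating samples?
RQ2: Can class-conditional covariances capture intra-class semantic variations that improve generalization across datasets and architectures?
RQ3: How does ISDA interact with non-semantic augmentations such as Cutout or AutoAugment?
RQ4: Can the ISDA loss be implemented with almost no extra computational overhead on large-scale datasets?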

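Key Findings

ISDA consistently improves generalization for ResNet, DenseNet, and related architectures on CIFAR-10, CIFAR-100, and ImageNet.
ISDA gives significant further improvements when combined with non-semantic augmentations such as Cutout and AutoAugment.
The method can be implemented as a robust loss with minimal extra computation, and it reduces to standard cross-entropy as lambda approaches zero.
In the reported experiments, ISDA often outperforms state-of-the-art robust losses and GAN-based semantic augmentation methods.
Ablation studies indicate that using the full class-conditional covariance (rather than an identity or diagonal matrix, or a single global covariance) is important for effectiveness.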
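The lambda limit and the covariance ablation can both be read off the form of the robust loss. The following is a sketch in notation that this review does not define and that is assumed here: a_i is the deep feature of sample i, y_i its label, w_j and b_j the final-layer weights and bias for class j, C the number of classes, and Sigma_{y_i} the estimated covariance of class y_i.

```latex
\bar{L}_\infty
  = \frac{1}{N} \sum_{i=1}^{N}
    -\log \frac{e^{\,w_{y_i}^{\top} a_i + b_{y_i}}}
               {\sum_{j=1}^{C} e^{\,w_j^{\top} a_i + b_j
                 + \frac{\lambda}{2}\,(w_j - w_{y_i})^{\top} \Sigma_{y_i} (w_j - w_{y_i})}}

% lambda -> 0: the quadratic term vanishes and the standard CE loss remains
\lim_{\lambda \to 0} \bar{L}_\infty
  = \frac{1}{N} \sum_{i=1}^{N}
    -\log \frac{e^{\,w_{y_i}^{\top} a_i + b_{y_i}}}
               {\sum_{j=1}^{C} e^{\,w_j^{\top} a_i + b_j}}

% identity covariance: the term collapses to a class-agnostic norm penalty
\Sigma_{y_i} = I \;\Rightarrow\;
  \tfrac{\lambda}{2}\,(w_j - w_{y_i})^{\top} \Sigma_{y_i} (w_j - w_{y_i})
  = \tfrac{\lambda}{2}\,\lVert w_j - w_{y_i} \rVert_2^{2}
```

With an identity matrix the added term no longer depends on which directions actually vary within a class, which is one way to read why the ablation favors the full class-conditional covariance.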


This review was generated by AI and reviewed by a human editor.