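[Paper Interpretation] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
AugMix introduces a stochastic data augmentation scheme that mixes multiple augmentation chains and enforces consistency with a Jensen-Shannon divergence loss, achieving state-of-the-art robustness and uncertainty estimates under data distribution shift at both CIFAR and ImageNet scale.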
Modern deep neural networks can achieve high accuracy when the training distribution and test distribution are identically distributed, but this assumption is frequently violated in practice. When the train and test distributions are mismatched, accuracy can plummet. Currently there are few techniques that improve robustness to unforeseen data shifts encountered during deployment. In this work, we propose a technique to improve the robustness and uncertainty estimates of image classifiers. We propose AugMix, a data processing technique that is simple to implement, adds limited computational overhead, and helps models withstand unforeseen corruptions. AugMix significantly improves robustness and uncertainty measures on challenging image classification benchmarks, closing the gap between previous methods and the best possible performance in some cases by more than half.
Research Motivation and Objectives
- Motivate robustness and reliable uncertainty estimation under train-test distribution shift.
- Develop a simple, computationally efficient data augmentation method.
- Demonstrate that mixing augmentation chains plus a consistency loss improves corruption robustness and calibration across benchmarks.
Proposed Method
- Define AugMix as stochastic augmentation of an input through multiple randomly chosen augmentation chains.
- Mix the results of several augmentation chains using Dirichlet-distributed weights and interpolate with the original image using a Beta-distributed weight.
- Train with a Jensen-Shannon divergence consistency loss across the original and augmented variants to promote stable predictions.
- Exclude augmentations that overlap with the ImageNet-C test-time corruptions, so that the train-time augmentations stay disjoint from the corruptions used for evaluation.
- Evaluate robustness using CIFAR-10-C, CIFAR-100-C, and ImageNet-C, along with uncertainty calibration metrics such as RMS calibration error and Brier score.
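The mixing and consistency steps listed above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the default values of `width`, `max_depth`, and `alpha`, the three toy operations, and the use of two augmented views are all assumptions made for the example. The summary only states that chains are mixed with Dirichlet weights, blended with the original using a Beta weight, and trained with a Jensen-Shannon consistency term.

```python
import numpy as np

def augment_and_mix(image, ops, rng, width=3, max_depth=3, alpha=1.0):
    """Mix `width` random augmentation chains, then blend with the original.

    width, max_depth, and alpha are illustrative defaults (assumptions).
    """
    image = image.astype(np.float64)
    # Dirichlet weights combine the chains; a Beta weight blends with the original.
    chain_weights = rng.dirichlet([alpha] * width)
    m = rng.beta(alpha, alpha)
    mixed = np.zeros_like(image)
    for w in chain_weights:
        chain_out = image
        depth = rng.integers(1, max_depth + 1)  # random chain length in [1, max_depth]
        for _ in range(depth):
            op = ops[rng.integers(len(ops))]    # pick one operation at random
            chain_out = op(chain_out)
        mixed += w * chain_out
    return m * image + (1.0 - m) * mixed

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def js_consistency(p_list, eps=1e-12):
    """Jensen-Shannon divergence among predictive distributions, batch-averaged."""
    mixture = np.clip(np.mean(p_list, axis=0), eps, 1.0)
    kl_terms = [
        np.sum(p * (np.log(np.clip(p, eps, 1.0)) - np.log(mixture)), axis=-1)
        for p in p_list
    ]
    return float(np.mean(kl_terms))

# Toy shape-preserving operations standing in for a real augmentation set (assumption).
toy_ops = [np.fliplr, lambda x: np.roll(x, 2, axis=1), lambda x: np.rot90(x, 2)]

rng = np.random.default_rng(0)
clean = rng.random((8, 8, 3))              # fake image with values in [0, 1)
view_1 = augment_and_mix(clean, toy_ops, rng)
view_2 = augment_and_mix(clean, toy_ops, rng)

# Stand-in for model outputs on the clean image and the two augmented views.
logits = rng.normal(size=(3, 4, 10))
p_clean, p_aug1, p_aug2 = (softmax(l) for l in logits)
consistency = js_consistency([p_clean, p_aug1, p_aug2])
```

In training, a term like `consistency` would be added to the usual classification loss with some weighting coefficient. Because every mixing weight is non-negative and each set of weights sums to one, the mixed image stays within the value range of the inputs whenever each operation does.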
Experimental Results
Research Questions
- RQ1: Can a stochastic augmentation strategy improve model robustness to unseen corruptions without sacrificing clean accuracy?
- RQ2: Does a Jensen-Shannon divergence consistency loss improve calibration under distribution shift?
- RQ3: How does mixing multiple augmentation chains compare with single augmentations or other augmentation strategies in terms of robustness and uncertainty?
- RQ4: Is AugMix scalable from CIFAR-scale datasets to ImageNet-scale datasets in both robustness and uncertainty estimation?
Key Findings
- AugMix substantially reduces corruption error on CIFAR-10-C and CIFAR-100-C across multiple architectures.
- On ImageNet-C, AugMix achieves state-of-the-art corruption robustness and improves perturbation stability (mean flip rate, mFR) compared with baselines.
- AugMix improves uncertainty calibration, reducing miscalibration (RMS calibration error and Brier score) under data shift.
- Ablations show that diversity from random augmentations, the Jensen-Shannon consistency loss, and mixing contribute to robustness, with diminishing returns if over-mixed or over-tuned.
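The two calibration metrics named in these findings can be computed as in the sketch below. It is a hedged illustration: the function names are hypothetical, the Brier score uses the multi-class sum-of-squares form, and the RMS calibration error uses equal-width confidence bins with `n_bins=10`. The binning scheme is an assumption, and the paper's exact protocol may differ.

```python
import numpy as np

def brier_score(probs, labels):
    """Mean squared distance between predicted probabilities and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def rms_calibration_error(probs, labels, n_bins=10):
    """Root-mean-square gap between accuracy and confidence over confidence bins.

    Equal-width bins are an assumption made for this sketch.
    """
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(np.float64)
    # Map each confidence to a bin index; confidence 1.0 falls in the last bin.
    bin_ids = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    sq_err = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = correct[mask].mean() - conf[mask].mean()
            sq_err += mask.mean() * gap ** 2   # weight by the fraction of samples in the bin
    return float(np.sqrt(sq_err))

# Two extreme cases: confident and right, versus confident and wrong.
perfect = np.array([[1.0, 0.0], [0.0, 1.0]])
wrong = np.array([[1.0, 0.0], [1.0, 0.0]])
perfect_labels = np.array([0, 1])
wrong_labels = np.array([1, 1])

brier_perfect = brier_score(perfect, perfect_labels)          # 0.0
brier_wrong = brier_score(wrong, wrong_labels)                # 2.0
rms_perfect = rms_calibration_error(perfect, perfect_labels)  # 0.0
rms_wrong = rms_calibration_error(wrong, wrong_labels)        # 1.0
```

Lower is better for both metrics. A model that is fully confident and always wrong reaches the worst values shown here, whereas a model whose confidence matches its accuracy in every bin has an RMS calibration error of zero.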
This interpretation was generated by AI and reviewed by human editors.