QUICK REVIEW

[Paper Review] Self-Adaptive Training: beyond Empirical Risk Minimization

Lang Huang, Chao Zhang|arXiv (Cornell University)|Feb 24, 2020

Machine Learning and Data Classification | References: 54 | Cited by: 88

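One-Sentence Summary

This paper proposes self-adaptive training, a method that uses accumulated model predictions to dynamically adjust training targets and sample weights, improving robustness to label and input noise at no extra computational cost. It outperforms empirical risk minimization (ERM) in noisy and adversarial settings and yields improvements in classification with label noise and in selective classification.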

ABSTRACT

We propose self-adaptive training---a new training algorithm that dynamically corrects problematic training labels by model predictions without incurring extra computational cost---to improve generalization of deep learning for potentially corrupted training data. This problem is crucial towards robustly learning from data that are corrupted by, e.g., label noises and out-of-distribution samples. The standard empirical risk minimization (ERM) for such data, however, may easily overfit noises and thus suffers from sub-optimal performance. In this paper, we observe that model predictions can substantially benefit the training process: self-adaptive training significantly improves generalization over ERM under various levels of noises, and mitigates the overfitting issue in both natural and adversarial training. We evaluate the error-capacity curve of self-adaptive training: the test error is monotonously decreasing w.r.t. model capacity. This is in sharp contrast to the recently-discovered double-descent phenomenon in ERM which might be a result of overfitting of noises. Experiments on CIFAR and ImageNet datasets verify the effectiveness of our approach in two applications: classification with label noise and selective classification. We release our code at https://github.com/LayneH/self-adaptive-training.

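Motivation and Objectives

Motivate robust learning beyond standard ERM for training data corrupted by random or adversarial noise.
Propose a training mechanism that uses model predictions to guide the training dynamics.
Demonstrate improved generalization under label noise and adversarial attacks on CIFAR and ImageNet.
Show applicability to selective classification and analyze robustness and efficiency.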

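Proposed Method

Maintain an exponential moving average of model predictions and use it to progressively correct the training targets.
Compute each sample's weight as the largest entry of its updated target, reflecting confidence in that sample's label.
Train with a reweighted cross-entropy loss, normalized by the sum of the sample weights.
Stay compatible with existing architectures and training pipelines at almost zero extra cost.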
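
The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the momentum value `alpha=0.9`, and the toy logits are all hypothetical, and the model predictions are held fixed here whereas in real training they would change after every update.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_targets(targets, probs, alpha=0.9):
    """EMA step: t_i <- alpha * t_i + (1 - alpha) * p_i (alpha is an assumed value)."""
    return [
        [alpha * t + (1.0 - alpha) * p for t, p in zip(t_row, p_row)]
        for t_row, p_row in zip(targets, probs)
    ]

def sample_weights(targets):
    """Per-sample weight: the largest entry of the current soft target."""
    return [max(row) for row in targets]

def reweighted_cross_entropy(targets, probs, eps=1e-12):
    """Weighted cross-entropy against soft targets, normalized by total weight."""
    weights = sample_weights(targets)
    weighted_sum = 0.0
    for w, t_row, p_row in zip(weights, targets, probs):
        ce = -sum(t * math.log(p + eps) for t, p in zip(t_row, p_row))
        weighted_sum += w * ce
    return weighted_sum / sum(weights)

# Toy data (hypothetical): sample 0 has a clean label, sample 1 is labeled
# class 2 although the model confidently predicts class 1.
num_classes = 3
noisy_labels = [0, 2]
logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]

# Targets start as one-hot encodings of the (possibly noisy) labels.
targets = [[1.0 if j == y else 0.0 for j in range(num_classes)]
           for y in noisy_labels]
probs = [softmax(z) for z in logits]

# Repeated EMA updates pull each target toward the model's prediction.
for _ in range(20):
    targets = update_targets(targets, probs)

weights = sample_weights(targets)
loss = reweighted_cross_entropy(targets, probs)
```

In this toy run the target of the mislabeled sample drifts from class 2 toward class 1, and its weight ends up lower than that of the clean sample, which is the behavior the list describes. The only extra state is one soft target per training sample, which fits the claim of almost zero additional cost.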

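Experimental Results

Research Questions

RQ1: Can model predictions be used dynamically to correct the training signal and thereby reduce overfitting to noisy data?
RQ2: Does self-adaptive training lower generalization error relative to ERM under random and adversarial noise?
RQ3: Does the method improve performance in applications such as classification with label noise and selective classification?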

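Key Findings

Self-adaptive training reduces overfitting on noisy data and achieves lower generalization error than ERM across multiple noise types and levels.
Under random noise, the method produces a monotonically decreasing error-capacity curve, unlike the double descent observed with ERM.
Under white-box PGD attacks, the method improves adversarial robustness by 1–3% over TRADES.
Under label noise on CIFAR-10/100, accuracy improves by up to 9.3 percentage points over prior methods; on ImageNet with 40% label noise, the gain over ERM is about 2%.
In selective classification, the method achieves up to 50% relative improvement over existing methods across datasets and coverage levels.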
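
For readers unfamiliar with the coverage levels mentioned in the last finding, the sketch below shows how selective classification is commonly evaluated: the classifier predicts only on the most confident fraction of samples (the coverage) and the error rate is measured on that subset. This illustrates the metric only. The function name, the confidence scores, and the correctness flags are hypothetical, and the summary does not say how the paper itself decides when to abstain.

```python
def selective_risk_at_coverage(confidences, correct, coverage):
    """Error rate on the `coverage` fraction of samples with highest confidence."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    n = len(confidences)
    k = max(1, round(coverage * n))  # number of samples the classifier answers
    # Rank samples from most to least confident and keep the top k.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    kept = order[:k]
    errors = sum(1 for i in kept if not correct[i])
    return errors / k

# Hypothetical scores: the single mistake sits among the low-confidence samples.
confidences = [0.99, 0.95, 0.90, 0.60, 0.55]
correct = [True, True, True, False, True]

risk_full = selective_risk_at_coverage(confidences, correct, 1.0)
risk_60 = selective_risk_at_coverage(confidences, correct, 0.6)
```

With these made-up values, answering every sample gives one error in five, while abstaining on the two least confident samples leaves no errors among the three that remain. A method whose confidence ranks its mistakes lower achieves lower risk at the same coverage.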


This review was generated by AI and reviewed by human editors.