QUICK REVIEW

[Paper Digest] Learning From Noisy Labels By Regularized Estimation Of Annotator Confusion

Ryutaro Tanno, Ardavan Saeedi | arXiv (Cornell University) | Feb 10, 2019

Machine Learning and Data Classification | References: 38 | Cited by: 31

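ONE-SENTENCE SUMMARY

The paper proposes a simple and effective way to learn from noisy labels: the classifier's predictions and a confusion matrix for each individual annotator are estimated jointly, with a regularization term that pushes the estimated confusion matrices toward maximal unreliability. On image classification tasks the method matches or outperforms state-of-the-art (SOTA) approaches, remains effective when each image has only a single label, and recovers annotator skill and label noise accurately, with the largest gains reported when diagonal dominance is low and annotations are sparse.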

ABSTRACT

The predictive performance of supervised learning algorithms depends on the quality of labels. In a typical label collection process, multiple annotators provide subjective noisy estimates of the "truth" under the influence of their varying skill-levels and biases. Blindly treating these noisy labels as the ground truth limits the accuracy of learning algorithms in the presence of strong disagreement. This problem is critical for applications in domains such as medical imaging where both the annotation cost and inter-observer variability are high. In this work, we present a method for simultaneously learning the individual annotator model and the underlying true label distribution, using only noisy observations. Each annotator is modeled by a confusion matrix that is jointly estimated along with the classifier predictions. We propose to add a regularization term to the loss function that encourages convergence to the true annotator confusion matrix. We provide a theoretical argument as to how the regularization is essential to our approach both for the case of single annotator and multiple annotators. Despite the simplicity of the idea, experiments on image classification tasks with both simulated and real labels show that our method either outperforms or performs on par with the state-of-the-art methods and is capable of estimating the skills of annotators even with a single label available per image.

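MOTIVATION AND OBJECTIVES

Address the challenge of training accurate models when label noise arises from differences in annotator skill and bias.
Jointly estimate the true label distribution and the individual annotator confusion matrices without relying on majority voting or heavy label redundancy.
Develop a model that is theoretically grounded yet simple in practice, requiring only one extra regularization term on top of the standard cross-entropy loss.
Model label noise accurately even in high-cost domains such as medical imaging, where each example may have only a single label.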

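PROPOSED METHOD

Each annotator is modeled by a confusion matrix that is estimated jointly with the classifier during training.
A regularization term is added to the loss that minimizes the trace of the estimated confusion matrices; this encourages maximal unreliability and drives the estimates toward the true noise pattern.
The classifier is fitted to the noisy labels with a cross-entropy loss, while the regularized confusion matrices absorb the noise and keep the classifier from overfitting to it.
Because the method only adds a single regularization term to the standard cross-entropy loss, it is easy to integrate into existing deep learning pipelines.
The theoretical analysis shows that the regularization is essential for recovering the true annotator confusion matrices, in particular when the average confusion matrix is diagonally dominant.
The method avoids EM-based iterative optimization, which makes training faster and more stable than with traditional joint estimation approaches.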
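
The loss described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `trace_regularized_loss`, the weight `lam`, the `mask` array for missing annotations, and the row-stochastic convention (`confusion[r, i, j]` is the probability that annotator `r` reports class `j` when the true class is `i`) are all choices made for this example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def trace_regularized_loss(logits, confusion, noisy_labels, mask, lam=0.01):
    """Cross-entropy on annotator-specific predictions plus a trace penalty.

    logits:       (N, C) classifier outputs for N examples and C classes
    confusion:    (R, C, C) one row-stochastic matrix per annotator (assumed convention)
    noisy_labels: (N, R) integer labels; any valid class index where mask is False
    mask:         (N, R) boolean, True where annotator r labeled example n
    lam:          weight of the trace regularizer (hypothetical default)
    """
    p_true = softmax(logits)                                  # (N, C)
    # Predicted noisy-label distribution for every annotator:
    # p_noisy[n, r, j] = sum_i p_true[n, i] * confusion[r, i, j]
    p_noisy = np.einsum("ni,rij->nrj", p_true, confusion)     # (N, R, C)
    # Probability assigned to the label each annotator actually gave.
    picked = np.take_along_axis(p_noisy, noisy_labels[..., None], axis=-1)[..., 0]
    nll = -np.log(picked + 1e-12)
    # Average the cross-entropy over observed annotations only.
    ce = (nll * mask).sum() / max(mask.sum(), 1)
    # Sum of traces over all annotators; minimizing it favors unreliable matrices.
    trace_penalty = np.trace(confusion, axis1=1, axis2=2).sum()
    return ce + lam * trace_penalty
```

In an actual training loop the confusion matrices would be learnable parameters kept stochastic (for example through a softmax over each row) and optimized together with the classifier by gradient descent; that part is omitted here.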

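EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a simple regularization-based method jointly estimate the true label distribution and the individual annotator confusion matrices without iterative EM optimization?
RQ2: How effectively does the method recover annotator skill and label noise when each example has only a single label?
RQ3: Does modeling individual annotators with confusion matrices yield higher classification accuracy than methods that ignore annotator-specific noise?
RQ4: Under what conditions, in particular with respect to the diagonal dominance of the average confusion matrix, does the regularization term guarantee consistent recovery of the true noise pattern?
RQ5: How does the method compare with state-of-the-art approaches such as MBEM and generalized EM in performance and robustness, on both simulated and real-world noisy annotations?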

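Key Findings

With simulated noisy labels on MNIST and CIFAR-10, the method matches or exceeds the classification accuracy of state-of-the-art methods such as MBEM and generalized EM, with the clearest advantage when label redundancy is low.
When each image has only a single label, the method maintains high performance while the accuracy of the baselines drops markedly.
On a real cardiac view classification dataset built from ultrasound images, the method outperforms MBEM in both classification accuracy and the quality of the estimated confusion matrices.
The estimated confusion matrices separate expert from non-expert annotators and expose clear error patterns, such as frequent confusion between the A3C and A5C views.
Taking the mean diagonal value of each learned confusion matrix recovers the annotators' skill levels accurately, and the results agree with expert intuition.
The theoretical analysis confirms that the regularization is essential for consistently recovering the annotator confusion matrices when the average confusion matrix is diagonally dominant.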
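
The skill estimate mentioned in the findings (the mean diagonal value of a confusion matrix) is straightforward to compute. The sketch below assumes the learned matrices are stacked in an array of shape `(R, C, C)`; the helper name `annotator_skill` and the two example matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def annotator_skill(confusion):
    # Mean diagonal entry of each (C, C) matrix in an (R, C, C) stack.
    return np.trace(confusion, axis1=1, axis2=2) / confusion.shape[-1]

# Made-up two-class matrices for illustration only.
expert = np.array([[0.9, 0.1],
                   [0.1, 0.9]])
novice = np.array([[0.6, 0.4],
                   [0.5, 0.5]])

skills = annotator_skill(np.stack([expert, novice]))
print(skills)  # higher value = more reliable annotator
```

A value near 1 means the annotator almost always reports the true class, while a value near 1/C corresponds to labeling that is close to random.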


This digest was generated by AI and reviewed by a human editor.