QUICK REVIEW

[Paper Review] Rediscovering BCE Loss for Uniform Classification

Qiufu Li, Xi Jia|arXiv (Cornell University)|Mar 12, 2024
Currency Recognition and Detection | Cited by 12
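One-Sentence Summary

This paper proposes uniform classification, which classifies all samples with a single unified threshold, and derives a BCE-based loss whose learnable bias yields that threshold. Across multiple datasets and feature extractors, BCE loss improves both uniform and sample-wise accuracy relative to SoftMax loss.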

ABSTRACT

This paper introduces the concept of uniform classification, which employs a unified threshold to classify all samples rather than an adaptive threshold for each individual sample. We also propose the uniform classification accuracy as a metric to measure the model's performance in uniform classification. Furthermore, beginning with a naive loss, we mathematically derive a loss function suitable for uniform classification, which is the BCE function integrated with a unified bias. We demonstrate that the unified threshold can be learned via the bias. Extensive experiments on six classification datasets and three feature extraction models show that, compared to the SoftMax loss, the models trained with the BCE loss exhibit not only higher uniform classification accuracy but also higher sample-wise classification accuracy. In addition, the learned bias from BCE loss is very close to the unified threshold used in the uniform classification. The features extracted by the models trained with BCE loss not only possess uniformity but also demonstrate better intra-class compactness and inter-class distinctiveness, yielding superior performance on open-set tasks such as face recognition.

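Motivation and Objectives

  • Introduce uniform classification and the notion of uniformity of classification, together with a per-class evaluation of uniformity
  • Define and compare new metrics for uniformity, including uniform classification accuracy
  • Derive a BCE-based loss with a learnable bias that enforces a unified threshold
  • Demonstrate, across datasets and models, the advantages of BCE loss over SoftMax loss on both uniform and sample-wise tasks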
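
The difference between sample-wise accuracy and uniform accuracy can be illustrated with a minimal sketch. It assumes one plausible reading of the definitions: a sample is sample-wise correct when its positive metric (the logit of its true class) exceeds its own largest negative metric, and uniformly correct at threshold t when its largest negative metric is below t and its positive metric is at least t. The function names, the strictness of the inequalities, and the toy logits are illustrative assumptions, not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def split_metrics(logits: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (positive metric, largest negative metric) for each sample."""
    pairs = []
    for row, y in zip(logits, labels):
        pos = row[y]
        neg = max(v for j, v in enumerate(row) if j != y)
        pairs.append((pos, neg))
    return pairs


def sample_wise_accuracy(logits, labels) -> float:
    """Adaptive threshold: each sample only has to beat its own negatives."""
    pairs = split_metrics(logits, labels)
    return sum(pos > neg for pos, neg in pairs) / len(pairs)


def uniform_accuracy(logits, labels) -> Tuple[float, Optional[float]]:
    """Best accuracy reachable with ONE threshold shared by all samples.

    A sample is counted as correct at t when neg < t <= pos (assumed
    convention). The count only changes at metric values, and an optimal t
    can always be moved up to some positive metric, so those are the only
    candidates that need to be scanned.
    """
    pairs = split_metrics(logits, labels)
    best_acc, best_t = 0.0, None
    for t, _ in pairs:
        acc = sum(neg < t <= pos for pos, neg in pairs) / len(pairs)
        if acc > best_acc:
            best_acc, best_t = acc, t
    return best_acc, best_t


# Hypothetical toy logits: every sample ranks its true class first,
# but the second sample lives on a much higher scale than the others.
toy_logits = [[4.0, 1.0, 0.0], [9.0, 6.0, 5.0], [0.5, 2.0, 0.0]]
toy_labels = [0, 0, 1]

sw_acc = sample_wise_accuracy(toy_logits, toy_labels)
uni_acc, uni_t = uniform_accuracy(toy_logits, toy_labels)
print(sw_acc, uni_acc, uni_t)
```

In this toy case all three samples are sample-wise correct, yet no single threshold separates positives from negatives for all of them at once, so the uniform accuracy is lower. That gap is what the uniform metric is meant to expose.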

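Proposed Method

  • Define uniform classification and its metrics (uniform accuracy, class-level uniform accuracy, sample-wise accuracy)
  • Re-derive the SoftMax and BCE losses and relate their biases to feature uniformity
  • Design two BCE-based uniform losses (L_bce-u and L_bce-d) with a learnable bias that converges to the unified threshold
  • Give theoretical convergence results showing that, under certain conditions, the bias converges to the unified threshold
  • Validate empirically on six datasets and three feature extractors, using both linear and normalized classifiers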
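
To make the contrast between the two loss families concrete, here is a minimal per-sample sketch. It assumes the BCE loss with a unified bias takes the common form log(1 + exp(-(z_y - b))) + Σ_{j≠y} log(1 + exp(z_j - b)), where z_j are the class logits and b is a single bias shared by all classes. That form is an assumption based on the abstract's description ("the BCE function integrated with a unified bias"); the exact definitions of L_bce-u and L_bce-d should be taken from the paper. Function names are hypothetical.

```python
import math
from typing import Sequence


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def softmax_loss(logits: Sequence[float], label: int) -> float:
    """Standard cross-entropy over softmax for one sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


def bce_unified_bias_loss(logits: Sequence[float], label: int, bias: float) -> float:
    """BCE over all classes with one shared bias (assumed form, see lead-in)."""
    loss = 0.0
    for j, v in enumerate(logits):
        if j == label:
            loss += softplus(-(v - bias))  # pushes the positive metric above the bias
        else:
            loss += softplus(v - bias)     # pushes each negative metric below the bias
    return loss


# Hypothetical logits for one sample whose true class is 0.
logits = [9.0, 6.0, 5.0]
shifted = [v - 7.0 for v in logits]  # same ranking, different absolute scale

print(softmax_loss(logits, 0), softmax_loss(shifted, 0))
print(bce_unified_bias_loss(logits, 0, bias=2.0),
      bce_unified_bias_loss(shifted, 0, bias=2.0))
```

The softmax loss is unchanged when a constant is added to every logit of a sample, so it places no constraint on where a sample's positive and negative metrics sit in absolute terms. The BCE form compares every metric against the same bias, so the two versions of the sample receive different losses.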
Figure 1: The visual comparison of performance of ResNet50 trained by $L_{\text{soft-nu}}$ and $L_{\text{bce-nu}}$ with various $\gamma$ on ImageNet-1K. Although $L_{\text{bce-nu}}$ performs poorly when $\gamma$ is too small or too large, for $\gamma$ varying in $[32,192]$ , its uniform accuracy is

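Experimental Results

Research Questions

  • RQ1: Does a single unified threshold applied to all samples help open-set and uniform classification tasks?
  • RQ2: Does BCE loss with a learnable bias produce more uniform features and higher uniform classification accuracy than SoftMax loss?
  • RQ3: How do the different BCE-based uniform losses differ in convergence and in performance on open-set and face recognition tasks?
  • RQ4: How closely does the learned bias correspond to the unified threshold used in uniform classification?
  • RQ5: Under BCE-based training, how are feature uniformity, intra-class compactness, and inter-class distinctiveness related?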

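Key Findings

  • BCE loss with a unified bias can learn the unified threshold for uniform classification
  • Models trained with BCE loss achieve higher uniform classification accuracy than models trained with SoftMax loss on multiple datasets
  • The bias learned by the BCE-based losses is very close to the unified threshold used in uniform classification
  • Features from BCE training show better uniformity, intra-class compactness, and inter-class separability, which benefits open-set tasks such as face recognition
  • Two BCE-based uniform losses (L_bce-u and L_bce-d) are proposed, and convergence of the bias to the threshold is proven under certain conditions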
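
The idea that a learned bias can act as a threshold can be sketched on a toy problem. The example below freezes a handful of positive and negative metrics and fits only the bias b by gradient descent on the assumed loss Σ_pos log(1 + exp(-(p - b))) + Σ_neg log(1 + exp(n - b)). The frozen metrics, the loss form, the learning rate, and the function names are all illustrative assumptions; in the paper the bias is learned jointly with the network, which this sketch does not reproduce.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def learn_bias(positives: Sequence[float], negatives: Sequence[float],
               lr: float = 1.0, steps: int = 2000) -> float:
    """Fit a single bias b by gradient descent with the metrics held fixed.

    d/db log(1 + exp(-(p - b))) =  sigmoid(b - p)
    d/db log(1 + exp(n - b))    = -sigmoid(n - b)
    """
    b = 0.0
    for _ in range(steps):
        grad = (sum(sigmoid(b - p) for p in positives)
                - sum(sigmoid(n - b) for n in negatives))
        b -= lr * grad
    return b


# Hypothetical frozen metrics: one positive and three negatives,
# mimicking a 4-class sample where negatives outnumber the positive.
positives = [5.0]
negatives = [-5.0, -5.0, -5.0]

bias = learn_bias(positives, negatives)
print(bias)
```

On this toy input the fitted bias settles strictly between the negative and positive metrics, so it can serve directly as a separating threshold. Because there are more negatives than positives, it sits slightly above the midpoint rather than exactly on it.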
Figure 2: The distributions of positive and negative classification metrics of ResNet50 trained by $L_{\text{soft-nu}}$ (left) and $L_{\text{bce-nu}}$ (right) on ImageNet-1K. The smaller overlap between the positive and negative metrics of $L_{\text{bce-u}}$ and $L_{\text{bce-nu}}$ indicates that th


This review was generated by AI and reviewed by human editors.