QUICK REVIEW

[Paper Review] Cross-Entropy Loss Functions: Theoretical Analysis and Applications

Anqi Mao, Mehryar Mohri|arXiv (Cornell University)|Apr 14, 2023

Adversarial Robustness in Machine Learning|Cited by 213

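One-Sentence Summary

This paper proposes non-asymptotic H-consistency bounds for a broad family of comp-sum losses that includes cross-entropy, and extends them to smooth adversarial variants to improve robustness. It provides both theoretical guarantees and an extensive empirical evaluation.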

ABSTRACT

Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss applied to the outputs of a neural network, when the softmax is used. But, what guarantees can we rely on when using cross-entropy as a surrogate loss? We present a theoretical analysis of a broad family of loss functions, comp-sum losses, that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error and other cross-entropy-like loss functions. We give the first $H$-consistency bounds for these loss functions. These are non-asymptotic guarantees that upper bound the zero-one loss estimation error in terms of the estimation error of a surrogate loss, for the specific hypothesis set $H$ used. We further show that our bounds are tight. These bounds depend on quantities called minimizability gaps. To make them more explicit, we give a specific analysis of these gaps for comp-sum losses. We also introduce a new family of loss functions, smooth adversarial comp-sum losses, that are derived from their comp-sum counterparts by adding in a related smooth term. We show that these loss functions are beneficial in the adversarial setting by proving that they admit $H$-consistency bounds. This leads to new adversarial robustness algorithms that consist of minimizing a regularized smooth adversarial comp-sum loss. While our main purpose is a theoretical analysis, we also present an extensive empirical analysis comparing comp-sum losses. We further report the results of a series of experiments demonstrating that our adversarial robustness algorithms outperform the current state-of-the-art, while also achieving a superior non-adversarial accuracy.

Research Motivation and Objectives

Provide non-asymptotic, hypothesis-set-specific guarantees for cross-entropy as a surrogate loss.
Characterize a broad family of comp-sum losses that includes logistic and generalized cross-entropy.
Introduce smooth adversarial comp-sum losses and establish H-consistency bounds in adversarial settings.
Develop adversarial robustness algorithms by minimizing regularized smooth adversarial comp-sum losses.
Empirically compare comp-sum losses on standard datasets and assess robustness and non-adversarial accuracy.

Proposed Method

Define comp-sum losses as a function Phi1 applied to a sum of Phi2 terms evaluated on differences of scores; the family covers the logistic loss, generalized cross-entropy, and mean absolute error.
Introduce the Phi^tau family to parametrize comp-sum losses and derive its properties (concavity, monotonicity, Lipschitz continuity).
Prove H-consistency bounds for symmetric and complete hypothesis sets, with a transformation Gamma_tau linking surrogate and 0-1 losses.
Analyze minimizability gaps to obtain explicit, tight bounds and compare loss functions via these gaps.
Extend to adversarial robustness by defining smooth adversarial comp-sum losses and proving H-consistency bounds under local rho-consistency.
Conduct empirical analyses comparing comp-sum losses and adversarial robustness algorithms on CIFAR-10, CIFAR-100, and SVHN.
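
The definition in the first two items above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes Phi2(v) = exp(-v) and the parametrization Phi^tau(u) = ((1 + u)^(1 - tau) - 1) / (1 - tau), with log(1 + u) at tau = 1; neither formula is spelled out in this review, so treat both as assumptions. The function names `phi_tau` and `comp_sum_loss` are hypothetical.

```python
import math


def phi_tau(u, tau):
    """Assumed form of Phi^tau for u >= 0 and tau >= 0 (tau = 1 is the log limit)."""
    if math.isclose(tau, 1.0):
        return math.log1p(u)
    return ((1.0 + u) ** (1.0 - tau) - 1.0) / (1.0 - tau)


def comp_sum_loss(scores, y, tau):
    """Comp-sum loss for a single example.

    scores: list of class scores h(x, y') for every class y'
    y:      index of the true label
    tau:    family parameter (assumed: 1 -> logistic, 2 -> mean absolute error)
    """
    # Sum over y' != y of Phi2(h(x, y) - h(x, y')), with Phi2(v) = exp(-v) assumed.
    # No overflow protection: large score gaps would need a stabilized version.
    u = sum(math.exp(s - scores[y]) for k, s in enumerate(scores) if k != y)
    return phi_tau(u, tau)


if __name__ == "__main__":
    example_scores = [2.0, 0.5, -1.0]  # illustrative values only
    for tau in (0.0, 1.0, 1.5, 2.0):
        print(tau, comp_sum_loss(example_scores, 0, tau))
```

Under these assumptions, tau = 1 reduces to the negative log of the softmax probability of the true class, and tau = 2 reduces to one minus that probability.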

Experimental Results

Research Questions

RQ1: What non-asymptotic, hypothesis-set-specific guarantees can be established for cross-entropy as a surrogate loss?
RQ2: How do H-consistency bounds extend to a broad family of comp-sum losses including logistic and generalized cross-entropy?
RQ3: What is the role of minimizability gaps in these bounds, and how do they compare across losses?
RQ4: Can smooth adversarial comp-sum losses provide provable H-consistency bounds and improved adversarial robustness?
RQ5: Do empirical results on standard datasets support the theoretical advantages of comp-sum losses and their adversarial variants?

Key Findings

The first H-consistency bounds are derived for comp-sum losses, including the logistic loss, in multi-class classification.
Bounds are expressed via a Gamma_tau transformation and minimizability gaps that depend on the loss and hypothesis set.
The bounds are shown to be tight through a constructive argument.
Smooth adversarial comp-sum losses admit H-consistency bounds and enable adversarial robustness algorithms.
Empirical analyses show adversarial algorithms based on smooth adversarial comp-sum losses outperform state-of-the-art baselines and improve non-adversarial accuracy.
The minimizability gaps and their behavior across tau values (including logistic and mean absolute error) are characterized and used to compare losses.
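
The shape of the bound described in the findings above can be written schematically as follows. This is a sketch, not the paper's exact statement: the symbols $\mathcal{R}_\ell$ (expected loss), $\mathcal{R}^*_\ell(H)$ (its infimum over $H$), and $\mathcal{M}_\ell(H)$ (minimizability gap) are notation assumed here, $\ell_\tau$ stands for the comp-sum loss with parameter tau, and the explicit form of $\Gamma_\tau$ is deliberately left unspecified.

```latex
% Schematic H-consistency bound (notation assumed, not quoted from the paper)
\[
\mathcal{R}_{\ell_{0-1}}(h) - \mathcal{R}^*_{\ell_{0-1}}(H) + \mathcal{M}_{\ell_{0-1}}(H)
\;\le\;
\Gamma_\tau\Bigl( \mathcal{R}_{\ell_\tau}(h) - \mathcal{R}^*_{\ell_\tau}(H) + \mathcal{M}_{\ell_\tau}(H) \Bigr)
\quad \text{for all } h \in H.
\]
% Minimizability gap: best-in-class expected loss minus the expected pointwise infimum
\[
\mathcal{M}_{\ell}(H) = \mathcal{R}^*_{\ell}(H)
- \mathbb{E}_{x}\Bigl[ \inf_{h \in H} \mathbb{E}_{y}\bigl[ \ell(h, x, y) \mid x \bigr] \Bigr].
\]
```

Read this way, a small surrogate estimation error on the right-hand side forces a small zero-one estimation error on the left, up to the gap terms, which depend on both the loss and the hypothesis set.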

This review was generated by AI and checked by a human editor.