QUICK REVIEW

[Paper Review] Statistical Properties of the log-cosh Loss Function Used in Machine Learning

Resve Saleh, A. K. Ehsanes Saleh|arXiv (Cornell University)|Aug 9, 2022

Statistical Methods and Inference | Cited by 26

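One-Line Summary

This paper derives the statistical properties of the log-cosh loss, identifies the Cosh distribution as the underlying model, and compares it with the normal and Cauchy distributions. It also demonstrates the robustness advantages of log-cosh and its application to quantile regression.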

ABSTRACT

This paper analyzes a popular loss function used in machine learning called the log-cosh loss function. A number of papers have been published using this loss function but, to date, no statistical analysis has been presented in the literature. In this paper, we present the distribution function from which the log-cosh loss arises. We compare it to a similar distribution, called the Cauchy distribution, and carry out various statistical procedures that characterize its properties. In particular, we examine its associated pdf, cdf, likelihood function and Fisher information. Side-by-side we consider the Cauchy and Cosh distributions as well as the MLE of the location parameter with asymptotic bias, asymptotic variance, and confidence intervals. We also provide a comparison of robust estimators from several other loss functions, including the Huber loss function and the rank dispersion function. Further, we examine the use of the log-cosh function for quantile regression. In particular, we identify a quantile distribution function from which a maximum likelihood estimator for quantile regression can be derived. Finally, we compare a quantile M-estimator based on log-cosh with robust monotonicity against another approach to quantile regression based on convolutional smoothing.

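Motivation and Objectives

Motivate and justify the study of the log-cosh loss function from a statistical perspective.
Derive the Cosh distribution corresponding to the log-cosh loss, together with its maximum likelihood estimator (MLE).
Analyze the asymptotic bias, variance, and confidence intervals of the log-cosh MLE.
Compare log-cosh with robust alternatives (Huber, rank-based methods) and with the LSE.
Demonstrate an application to quantile regression using a continuous log-cosh-based check function.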

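Proposed Method

Define the log-cosh loss as rho_L(x, theta) = log(cosh(x - theta)).
Derive the Cosh distribution with pdf f(x; theta, sigma) = 1/(pi sigma cosh((x - theta)/sigma)).
Compute the MLE by solving sum_i tanh(x_i - theta) = 0 (written for sigma = 1), and establish convexity of the loss from its second derivative sech^2(x) > 0.
Compute the Fisher information I(theta) = 1/(2 sigma^2) and the asymptotic variance Var(hat{theta}) = 2 sigma^2/n.
Compare the asymptotic properties against the normal and Cauchy distributions, and show the relationship to the L1 and L2 losses through an intuitive analysis.
Extend the approach to quantile regression, introducing a continuous log-cosh-based check function and SMRQ, including a derivation of its Fisher information and bootstrap estimation of standard errors.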
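
The loss and the estimating equation listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the function names `log_cosh` and `log_cosh_mle`, the bisection solver, and the sample `data` are assumptions made for the example, and the scale is fixed at sigma = 1 to match the equation as written.

```python
import math

def log_cosh(r):
    """Numerically stable log(cosh(r)).

    Uses log(cosh(r)) = |r| + log(1 + exp(-2|r|)) - log(2), so it behaves like
    r**2 / 2 near zero (L2-like) and like |r| - log(2) for large |r| (L1-like),
    without overflowing the way math.cosh does for large arguments.
    """
    a = abs(r)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def log_cosh_mle(xs, tol=1e-12, max_iter=200):
    """Location estimate solving sum_i tanh(x_i - theta) = 0 (sigma = 1)."""
    def score(theta):
        return sum(math.tanh(x - theta) for x in xs)

    # The score is non-negative at min(xs) and non-positive at max(xs),
    # so the root is bracketed by the data range.
    lo, hi = min(xs), max(xs)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical sample: five points near zero plus one gross outlier.
data = [0.1, -0.4, 0.3, 0.2, -0.1, 25.0]
theta_hat = log_cosh_mle(data)
sample_mean = sum(data) / len(data)
print("log-cosh estimate:", theta_hat)
print("sample mean (L2): ", sample_mean)
```

Bisection is used here because the score sum_i tanh(x_i - theta) is strictly decreasing in theta, so the bracket between the smallest and largest observation contains the unique root. On this sample the log-cosh estimate stays close to the bulk of the data, while the sample mean is pulled toward the outlier.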

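Experimental Results

Research Questions

RQ1: From which statistical distribution does the log-cosh loss arise, and how does it compare with the Cauchy distribution?
RQ2: What are the asymptotic properties (bias, variance, confidence intervals) of the log-cosh MLE of the location parameter?
RQ3: How does log-cosh compare with robust alternatives (Huber, rank-based) in estimation and standard errors?
RQ4: How can log-cosh be used in quantile regression to avoid the crossing problem, and what are the corresponding MLE and distribution?
RQ5: How does convolution smoothing compare with SMRQ on quantile crossing and monotonicity?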

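Key Findings

The log-cosh loss corresponds to the Cosh distribution, with pdf f(x; theta, sigma) = 1/(pi sigma cosh((x - theta)/sigma)).
The MLE of theta satisfies sum_i tanh(x_i - theta) = 0; the objective is globally convex, and the asymptotic variance is Var(hat{theta}) = 2 sigma^2/n.
Asymptotically, the log-cosh estimator has near-zero bias, and its confidence intervals follow standard asymptotic normal theory using the Fisher information I(theta) = 1/(2 sigma^2).
log-cosh shows robustness comparable to L1-based methods while providing continuous first and second derivatives, unlike the discontinuous derivative of L1, and behaves more smoothly than Huber in some settings.
In quantile regression, the continuous log-cosh-based check function rho_S(x, tau) = log(cosh(x)) + (tau - 1/2)x yields a smooth M-estimator and avoids the kink of the conventional check function.
The bootstrap validates the asymptotic variance results and supports the consistency of log-cosh estimation across different values of theta and sigma.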
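
The asymptotic variance 2 sigma^2/n can be checked numerically. The sketch below is a plain Monte Carlo simulation, not the bootstrap procedure used in the paper. It draws from the Cosh distribution by inverting its CDF, F(x) = (2/pi) atan(exp((x - theta)/sigma)), which follows from integrating the pdf given above. The helper names, the seed, and the values of theta, sigma, n, and reps are illustrative assumptions.

```python
import math
import random

def sample_cosh(theta, sigma, rng):
    """Inverse-CDF draw from the Cosh distribution with location theta, scale sigma."""
    u = rng.random()
    while u == 0.0:  # guard against log(tan(0))
        u = rng.random()
    return theta + sigma * math.log(math.tan(0.5 * math.pi * u))

def location_mle(xs, sigma, n_iter=40):
    """Solve sum_i tanh((x_i - theta) / sigma) = 0 by bisection over the data range."""
    def score(theta):
        return sum(math.tanh((x - theta) / sigma) for x in xs)

    lo, hi = min(xs), max(xs)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(0)  # fixed seed so the run is repeatable
theta, sigma, n, reps = 1.0, 2.0, 100, 500

# Re-estimate theta on many independent samples of size n.
estimates = [
    location_mle([sample_cosh(theta, sigma, rng) for _ in range(n)], sigma)
    for _ in range(reps)
]

mean_est = sum(estimates) / reps
empirical_var = sum((e - mean_est) ** 2 for e in estimates) / (reps - 1)
asymptotic_var = 2.0 * sigma ** 2 / n

print("mean of estimates: ", mean_est)
print("empirical variance:", empirical_var)
print("2 sigma^2 / n:     ", asymptotic_var)
```

With these settings the empirical variance of the estimates should land close to 2 sigma^2/n, up to simulation noise; increasing n and reps tightens the agreement at the cost of run time.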

This review was generated by AI and checked by a human editor.