QUICK REVIEW

[Paper Review] An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

Jaouad Mourtada, Stéphane Gaïffas|arXiv (Cornell University)|Dec 23, 2019

Machine Learning and Algorithms | References: 108 | Citations: 11

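Summary in brief

This paper introduces the Sample Minmax Predictor (SMP), an improper estimator for conditional density estimation and logistic regression that attains optimal excess risk even under model misspecification. SMP minimizes a new excess risk bound that scales as $d/n$, comes with non-asymptotic guarantees, and improves on within-model estimators such as the MLE, particularly in the misspecified setting. For logistic regression it achieves an excess risk of $O((d + B^2R^2)/n)$.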

ABSTRACT

We introduce a procedure for conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This estimator minimizes a new general excess risk bound for statistical learning. On standard examples, this bound scales as $d/n$ with $d$ the model dimension and $n$ the sample size, and critically remains valid under model misspecification. Being an improper (out-of-model) procedure, SMP improves over within-model estimators such as the maximum likelihood estimator, whose excess risk degrades under misspecification. Compared to approaches reducing to the sequential problem, our bounds remove suboptimal $\log n$ factors and can handle unbounded classes. For the Gaussian linear model, the predictions and risk bound of SMP are governed by leverage scores of covariates, nearly matching the optimal risk in the well-specified case without conditions on the noise variance or approximation error of the linear model. For logistic regression, SMP provides a non-Bayesian approach to calibration of probabilistic predictions relying on virtual samples, and can be computed by solving two logistic regressions. It achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ bounds the norm of features and $B$ that of the comparison parameter; by contrast, no within-model estimator can achieve better rate than $\min({B R}/{\sqrt{n}}, {d e^{BR}}/{n} )$ in general. This provides a more practical alternative to Bayesian approaches, which require approximate posterior sampling, thereby partly addressing a question raised by Foster et al. (2018).

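Motivation and objectives

Establish finite-sample excess risk bounds that remain valid under model misspecification.
Propose an improper estimator that outperforms within-model estimators such as the MLE under misspecification.
Remove the suboptimal $\log n$ factors that appear in earlier bounds obtained by reduction to sequential prediction.
Provide a non-Bayesian, computationally practical alternative to posterior sampling for calibrating probabilistic predictions.
Achieve a near-optimal risk for the Gaussian linear model without conditions on the noise variance or approximation error, and a fast $O((d + B^2R^2)/n)$ excess risk rate for logistic regression.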

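Proposed method

SMP (Sample Minmax Predictor) is proposed as an improper estimator that minimizes a new general excess risk bound.
It uses a virtual-sample approach: the test point is added to the training set, and for logistic regression the prediction is obtained by solving two logistic regressions.
The predictor is defined through $\lambda$-regularized risk minimization on the augmented dataset.
Stability bounds are derived from the pseudo-self-concordance of the loss and the strong convexity of the regularized risk.
Exchangeability and a trace inequality are used to control the expected risk difference.
The Hessian of the regularized risk and matrix concavity are used to bound the excess risk in terms of leverage scores.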
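
The virtual-sample recipe above can be sketched for logistic regression as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function names (`fit_ridge_logistic`, `smp_predict_proba`), the plain gradient-descent solver, the toy data, and the default `lam` are all assumptions made for the example. The way the two fits are combined (normalizing the likelihood that each fit assigns to its own virtual label) is one reading of the SMP construction and should be checked against the paper.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_ridge_logistic(X, y, lam, steps=2000):
    """Minimize (1/n) * sum_i log(1 + exp(-y_i <theta, x_i>)) + (lam/2) * ||theta||^2
    by gradient descent. Labels y_i are in {-1, +1}. Hypothetical helper."""
    n, d = len(X), len(X[0])
    # Upper bound on the smoothness constant of the objective.
    L = max(dot(x, x) for x in X) / 4.0 + lam
    theta = [0.0] * d
    for _ in range(steps):
        grad = [lam * t for t in theta]
        for x, yi in zip(X, y):
            # Gradient of log(1 + exp(-m)) with m = y <theta, x> is -y * sigmoid(-m) * x.
            w = -yi * sigmoid(-yi * dot(theta, x)) / n
            for j in range(d):
                grad[j] += w * x[j]
        theta = [t - g / L for t, g in zip(theta, grad)]
    return theta

def smp_predict_proba(X, y, x_new, lam=0.1):
    """Probability of label +1 at x_new, using one regularized fit per virtual label."""
    theta_pos = fit_ridge_logistic(X + [x_new], y + [+1], lam)  # sample augmented with (x_new, +1)
    theta_neg = fit_ridge_logistic(X + [x_new], y + [-1], lam)  # sample augmented with (x_new, -1)
    a = sigmoid(dot(theta_pos, x_new))    # likelihood of +1 under the "+1" fit
    b = sigmoid(-dot(theta_neg, x_new))   # likelihood of -1 under the "-1" fit
    return a / (a + b)                    # normalize so the two labels sum to one

# Toy data (assumed for illustration): first coordinate is an intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1], [1.0, -0.7], [1.0, 1.5]]
y = [1, -1, 1, -1, -1, 1]
print(smp_predict_proba(X, y, [1.0, 3.0]))
```

Each prediction costs two fits because the virtual label changes the training set. In practice one would likely replace the gradient-descent loop with an off-the-shelf solver.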

Results

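Research questions

RQ1: Can an improper estimator achieve optimal excess risk in misspecified density estimation?
RQ2: Does the proposed SMP estimator remove the suboptimal $\log n$ factors from excess risk bounds, compared with approaches based on sequential prediction?
RQ3: Can SMP provide a non-Bayesian alternative for calibrated probabilistic predictions in logistic regression?
RQ4: What is the finite-sample excess risk of SMP for logistic regression under general model misspecification?
RQ5: How do leverage scores affect the risk behavior of SMP in the Gaussian linear model?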
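
As background for RQ5, recall the textbook definition of leverage scores for a design matrix $X \in \mathbb{R}^{n \times d}$ with rows $x_1, \dots, x_n$ and full column rank. The notation is introduced here for illustration, and the paper may work with a variant (for instance, the leverage of the test point within the augmented sample).

$$
h_i = x_i^\top (X^\top X)^{-1} x_i, \qquad 0 \le h_i \le 1, \qquad \sum_{i=1}^{n} h_i = \operatorname{tr}\big(X (X^\top X)^{-1} X^\top\big) = d .
$$

The average leverage is therefore $d/n$, which is consistent with the $d/n$ scaling of the bound quoted in the abstract.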

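Key findings

For logistic regression, SMP achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ bounds the norm of the features and $B$ that of the comparison parameter; by contrast, no within-model estimator can in general beat the rate $\min(BR/\sqrt{n}, d e^{BR}/n)$.
On standard examples the excess risk bound scales as $d/n$ and remains valid under model misspecification, whereas the excess risk of the MLE degrades.
For the Gaussian linear model, the predictions and risk bound of SMP are governed by the leverage scores of the covariates, and the risk nearly matches the optimal risk of the well-specified case.
The bounds remove the suboptimal $\log n$ factors found in earlier approaches based on reduction to sequential prediction.
SMP relies on virtual samples and offers a non-Bayesian method that avoids approximate posterior sampling.
The Gaussian linear model result holds without conditions on the noise variance or on the approximation error of the linear model.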
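
To get a feel for the gap between the two logistic regression rates quoted in the abstract, the following sketch evaluates both expressions with all constants dropped. The function names and the parameter values are assumptions chosen for illustration; the numbers indicate orders of magnitude only, not actual risk values.

```python
import math

def smp_rate(d, B, R, n):
    """Order of the SMP excess risk bound (d + B^2 R^2) / n, constants dropped."""
    return (d + (B * R) ** 2) / n

def within_model_barrier(d, B, R, n):
    """Rate min(BR / sqrt(n), d * exp(BR) / n) that no within-model estimator
    can beat in general, constants dropped."""
    return min(B * R / math.sqrt(n), d * math.exp(B * R) / n)

# Hypothetical setting: d = 10 features, B = 5, R = 1.
for n in (10**3, 10**4, 10**5):
    print(n, smp_rate(10, 5.0, 1.0, n), within_model_barrier(10, 5.0, 1.0, n))
```

The exponential dependence on $BR$ in the within-model rate is what makes the comparison favorable to SMP when $BR$ is moderately large.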

This review was generated by AI and checked by a human editor.