QUICK REVIEW

[Paper Review] Sever: A Robust Meta-Algorithm for Stochastic Optimization

Ilias Diakonikolas, Gautam Kamath|arXiv (Cornell University)|Mar 7, 2018

Machine Learning and Algorithms|References: 41|Citations: 67

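ONE-LINE SUMMARY

Sever is a robust meta-algorithm that wraps around any base learner and uses the top singular vector of the gradient data to detect and remove outliers; it combines strong theoretical guarantees with practical scalability, demonstrated on spam classification and drug design tasks.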

ABSTRACT

In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers. To address this, we introduce a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers. Our method, Sever, possesses strong theoretical guarantees yet is also highly scalable -- beyond running the base learner itself, it only requires computing the top singular vector of a certain $n \times d$ matrix. We apply Sever on a drug design dataset and a spam classification dataset, and find that in both cases it has substantially greater robustness than several baselines. On the spam dataset, with $1\%$ corruptions, we achieved $7.4\%$ test error, compared to $13.4\%-20.5\%$ for the baselines, and $3\%$ error on the uncorrupted dataset. Similarly, on the drug design dataset, with $10\%$ corruptions, we achieved $1.42$ mean-squared error test error, compared to $1.51$-$2.33$ for the baselines, and $1.23$ error on the uncorrupted dataset.

MOTIVATION AND OBJECTIVES

Address robustness to arbitrary high-dimensional outliers in stochastic optimization.
Provide a general, scalable framework applicable to regression, classification, and non-convex models.
Offer theoretical guarantees that do not depend on problem dimension.
Demonstrate practical effectiveness on real-world datasets (spam and drug design).

PROPOSED METHOD

Run a base learner on a possibly corrupted dataset to obtain a parameter w.
Compute the per-point gradients at w and form the centered gradient matrix G.
Compute the top right singular vector v of G, the direction along which the centered gradients vary most.
Define outlier scores tau_i as the squared projection of each centered gradient onto v.
Filter out high-scoring points and re-run the learning procedure; iterate until no more points are removed.
Provide theoretical guarantees that Sever yields a gamma-approximate critical point of the true objective under mild conditions; show near-optimal sample complexity and robustness without dependence on dimension.
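The steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes a lightly regularized least-squares base learner, and it replaces the stopping rule described above with a fixed number of rounds that each drop a fixed fraction of the highest-scoring points. The names `least_squares_learner`, `per_point_gradients`, `sever_sketch`, `frac_removed`, and `max_rounds` are all hypothetical.

```python
import numpy as np

def least_squares_learner(X, y, reg=1e-6):
    """Hypothetical base learner: lightly regularized least squares."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)

def per_point_gradients(X, y, w):
    """Gradient of the squared loss 0.5 * (x.w - y)^2 at w, one row per point."""
    residuals = X @ w - y
    return residuals[:, None] * X

def sever_sketch(X, y, frac_removed=0.05, max_rounds=4):
    """Simplified Sever-style loop (assumption: fixed rounds, fixed drop fraction).

    Returns the final parameter vector and a boolean mask of retained points.
    """
    keep = np.ones(len(y), dtype=bool)
    for _ in range(max_rounds):
        n_drop = int(frac_removed * keep.sum())
        if n_drop == 0:
            break
        # Step 1: fit the base learner on the currently retained points.
        w = least_squares_learner(X[keep], y[keep])
        # Step 2: per-point gradients at w, centered by their mean.
        G = per_point_gradients(X[keep], y[keep], w)
        G = G - G.mean(axis=0)
        # Step 3: top right singular vector of the centered gradient matrix.
        _, _, Vt = np.linalg.svd(G, full_matrices=False)
        v = Vt[0]
        # Step 4: outlier score = squared projection onto v.
        tau = (G @ v) ** 2
        # Step 5: drop the highest-scoring points, then repeat.
        drop_local = np.argsort(tau)[-n_drop:]
        keep[np.flatnonzero(keep)[drop_local]] = False
    # Final fit on whatever survived the filtering rounds.
    w = least_squares_learner(X[keep], y[keep])
    return w, keep
```

Calling `sever_sketch(X, y)` on an `(n, d)` feature matrix and a length-`n` target vector returns the fitted weights and the retained-point mask. Swapping in another base learner only requires replacing the two helper functions, which is the sense in which the procedure is a meta-algorithm.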

EXPERIMENTAL RESULTS

Research Questions

RQ1: Can Sever provide robustness guarantees for stochastic optimization with an ε-fraction of arbitrary outliers?
RQ2: How does Sever perform for common learning tasks like regression and classification under contamination?
RQ3: Does Sever offer a dimension-independent error guarantee and practical scalability on real data?

Key Findings

Sever achieves robustness to arbitrary outliers with a dimension-free error term under mild heavy-tailedness assumptions (Theorem 2.1).
On the Enron spam dataset with 1% corruptions, Sever attains 7.4% test error versus 13.4–20.5% for baselines (3% without corruptions).
On the drug design dataset with 10% corruptions, Sever achieves a test mean-squared error of 1.42 versus 1.51–2.33 for baselines (1.23 on uncorrupted data).
The method is practically scalable, requiring only computing the top singular vector of an n×d gradient matrix and simple filtering steps.
Sever outperforms several natural baseline outlier detectors in experiments across regression and classification tasks.
The paper provides concrete applications to generalized linear models with near-optimal sample complexity under ε-corruption.
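To make the scalability point concrete, the top right singular vector of an n×d matrix can be approximated without a full SVD. The sketch below uses plain power iteration on GᵀG, a standard technique that is swapped in here for illustration; the source does not say which solver the authors used. The function name `top_right_singular_vector` and its parameters `n_iters` and `seed` are hypothetical.

```python
import numpy as np

def top_right_singular_vector(G, n_iters=200, seed=0):
    """Approximate the top right singular vector of G by power iteration.

    Each iteration costs two matrix-vector products with G, so the work is
    proportional to n * d per iteration and G^T G is never formed explicitly.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(G.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        u = G.T @ (G @ v)          # one application of G^T G
        norm = np.linalg.norm(u)
        if norm == 0.0:            # G is (numerically) zero; keep current v
            break
        v = u / norm
    return v
```

The returned vector is defined only up to sign, which does not matter for squared-projection scores such as `(G @ v) ** 2`. Convergence speed depends on the gap between the two largest singular values, so `n_iters` may need tuning for matrices with a small gap.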

This review was generated by AI and checked by a human editor.