QUICK REVIEW

[Paper Review] A New Active Learning Scheme with Applications to Learning to Rank from Pairwise Preferences

Nir Ailon, Ron Begleiter|arXiv (Cornell University)|Oct 10, 2011

Machine Learning and Algorithms | References: 18 | Citations: 9

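One-Sentence Summary

This paper proposes a new active learning scheme for learning to rank from pairwise preferences. Using only O(n poly(log n, ε⁻¹)) preference queries, the scheme reaches approximately optimal loss at an exponentially fast rate, measured in error per label. It exploits a structural property of the learning problem, applies both to the exact setting and to convex relaxations (including SVM and logistic regression), and gains a slight further improvement in query complexity when the feature space has constant dimension.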

ABSTRACT

We consider the statistical learning setting of active learning in which the learner chooses which examples to obtain labels for. We identify a useful general purpose structural property of such learning problems, giving rise to a query-efficient iterative procedure achieving approximately optimal loss at an exponentially fast rate, where the rate is measured in units of error per label. The effectiveness of our ideas is demonstrated on the problem of learning to rank from pairwise preference labels, known as minimum feedback arc-set in tournaments when all the quadratically many preferences are given as input. The net result is an efficient selective sampling method for this problem, achieving a (1 + ε)-competitive result using only O(n poly(log n, ε⁻¹)) preference queries from the quadratically many. This result is information theoretical in nature because it shows how to efficiently select information, not how to use it (computationally) for optimization. Nevertheless, our ideas transfer quite seamlessly to a convex relaxation counterpart, giving rise to an iterative algorithm with an exponential convergence rate to a relaxation optimum. SVM and logistic regression are, in particular, notable examples of relaxation for which this result applies. Such relaxations are popular in applications where the set of alternatives we wish to rank is embedded in a real vector space (feature space), and we wish to fit a permutation induced by a linear function to the preference information. Moreover, in the particular case of constant dimensional feature space, we obtain a slight additional improvement in the query complexity as a function of the number of alternatives using the powerful notion of ε-relative approximations in bounded VC dimension spaces. We believe that our iterative scheme and analysis method are interesting in their own right and will find use in other problems.
∗Technion nailon@cs.technion.ac.il †Technion ronbeg@cs.technion.ac.il ‡NYU Courant Institute esther@cims.nyu.edu
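
For reference, the guarantee quoted in the abstract can be written out as below. The notation (V, W, π, cost) is assumed here for illustration and is not taken from the paper, and the final comparison with the total number of pairs is meant for large n at a fixed ε.

```latex
% V: the set of n alternatives.
% W(u,v) = 1 if u is preferred over v, and 0 otherwise,
% with W(u,v) + W(v,u) = 1 for every pair (a tournament).
% \pi(u) < \pi(v) means that the permutation \pi ranks u ahead of v.
\[
  \mathrm{cost}(\pi) \;=\; \sum_{u,v \in V:\ \pi(u) < \pi(v)} W(v, u)
\]
\[
  \mathrm{cost}(\hat{\pi}) \;\le\; (1+\varepsilon)\,\min_{\pi}\,\mathrm{cost}(\pi),
  \qquad
  \#\text{queries} \;=\; O\!\bigl(n\,\mathrm{poly}(\log n,\ \varepsilon^{-1})\bigr)
  \;\ll\; \binom{n}{2}.
\]
```

The first line counts the preferences that a ranking violates (the feedback arcs); the second states that the returned ranking is within a factor (1 + ε) of the best possible one while querying far fewer than all pairs.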

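Motivation and Objectives

Develop a query-efficient active learning method for pairwise preference feedback.
Identify a general-purpose structural property that enables fast convergence in the active learning setting.
Achieve near-optimal performance with as few label queries as possible, in particular for the minimum feedback arc-set problem.
Extend the method to convex relaxations such as SVM and logistic regression, with practical ranking applications in mind.
Improve query complexity in low-dimensional feature spaces using ε-relative approximations.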

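Proposed Method

The method introduces an iterative active learning procedure, built on a structural property of the learning problem, whose loss decreases exponentially fast per label queried.
When all pairwise preferences are given as input, the ranking problem is the minimum feedback arc-set problem in tournaments.
Selective sampling chooses which pairwise comparisons to query, so that only O(n poly(log n, ε⁻¹)) of the quadratically many preferences are needed.
For convex relaxations, the same ideas give an iterative algorithm that converges exponentially fast to the relaxation optimum.
In constant-dimensional feature spaces, ε-relative approximations in bounded VC dimension spaces reduce the query complexity slightly further.
The relaxation setting covers problems in which the alternatives are embedded in a feature space and a permutation induced by a linear function is fit to the preference information; the authors expect the scheme to be useful in other problems as well.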
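
To make the objective concrete, here is a minimal Python sketch of the feedback arc-set loss of a ranking, together with a naive estimate of that loss from a limited number of queries. The function names and the `prefers` oracle are hypothetical, and the uniform sampling used in `sampled_loss` is a plain baseline for illustration, not the paper's selective sampling scheme.

```python
import itertools
import random


def fas_loss(order, prefers):
    """Count the pairs that `order` ranks against the preference oracle.

    `order` lists the alternatives from best to worst, and `prefers(u, v)`
    returns True when u is preferred over v. This exact count needs all
    n * (n - 1) / 2 preference labels.
    """
    loss = 0
    for i, j in itertools.combinations(range(len(order)), 2):
        u, v = order[i], order[j]  # u is ranked ahead of v
        if not prefers(u, v):      # the oracle disagrees: a feedback arc
            loss += 1
    return loss


def sampled_loss(order, prefers, num_queries, rng):
    """Estimate the loss from uniformly sampled pairs.

    Assumption: uniform sampling is only a baseline for illustration.
    """
    n = len(order)
    disagreements = 0
    for _ in range(num_queries):
        i, j = sorted(rng.sample(range(n), 2))  # positions with i < j
        if not prefers(order[i], order[j]):
            disagreements += 1
    total_pairs = n * (n - 1) // 2
    return disagreements / num_queries * total_pairs


# Usage: the hidden "true" order is 0 < 1 < 2 < 3 (smaller id is better).
oracle = lambda u, v: u < v
exact = fas_loss([1, 0, 2, 3], oracle)
estimate = sampled_loss([1, 0, 2, 3], oracle, 50, random.Random(0))
```

The exact loss touches every pair, which is what the quadratic label cost refers to; the estimate trades accuracy for a fixed query budget.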

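Results

Research Questions

RQ1: Can a general-purpose active learning scheme be designed that converges exponentially fast, in loss per label, for ranking from pairwise feedback?
RQ2: Which structural property of preference problems enables efficient query selection in active learning?
RQ3: How can the number of preference queries be minimized while keeping the result (1+ε)-competitive?
RQ4: To what extent do convex relaxations such as SVM and logistic regression benefit from this active learning scheme?
RQ5: Can ε-relative approximations in bounded VC dimension spaces further reduce query complexity in low-dimensional feature settings?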

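Key Findings

The proposed active learning scheme achieves a (1+ε)-competitive result using only O(n poly(log n, ε⁻¹)) preference queries out of the quadratically many available, a substantial reduction in the number of labels required.
The method approaches approximately optimal loss at an exponentially fast rate, where the rate is measured in error per label.
The result is information-theoretic in nature: it shows how to select which preferences to query efficiently, not how to use them computationally for optimization.
For convex relaxations such as SVM and logistic regression, the same ideas yield an iterative algorithm that converges exponentially fast to the relaxation optimum.
In constant-dimensional feature spaces, ε-relative approximations give a slight additional improvement in query complexity as a function of the number of alternatives.
The relaxation results target ranking problems in which a linear function over the feature space induces the permutation fit to the preferences; the authors believe the scheme and its analysis will find use in other problems.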
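
As an illustration of what a convex relaxation of pairwise ranking looks like, the sketch below fits a linear scoring function with a logistic loss on a set of labeled pairs, using plain gradient descent. The function names, learning rate, and epoch count are hypothetical choices, and this is ordinary pairwise logistic regression, not the paper's iterative query-selection algorithm.

```python
import math


def fit_pairwise_logistic(features, labeled_pairs, lr=0.5, epochs=200):
    """Fit w so that w . (x_u - x_v) > 0 whenever u is preferred over v.

    `features[i]` is the feature vector of alternative i, and each entry
    (u, v) of `labeled_pairs` means "u is preferred over v".
    """
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for u, v in labeled_pairs:
            diff = [a - b for a, b in zip(features[u], features[v])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of log(1 + exp(-margin)) with respect to w;
            # the margin is clipped to keep exp() from overflowing.
            coeff = -1.0 / (1.0 + math.exp(min(margin, 50.0)))
            for k in range(dim):
                grad[k] += coeff * diff[k]
        w = [wi - lr * g / len(labeled_pairs) for wi, g in zip(w, grad)]
    return w


def rank_by_score(features, w):
    """Return the alternatives sorted by decreasing linear score."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
    return sorted(range(len(features)), key=lambda i: -scores[i])


# Usage: three alternatives with one feature each; only two of the three
# possible pairs are labeled (0 over 2, and 2 over 1).
features = [[3.0], [1.0], [2.0]]
w = fit_pairwise_logistic(features, [(0, 2), (2, 1)])
ranking = rank_by_score(features, w)
```

The ranking is the permutation induced by the learned linear function, which is the object the relaxation setting fits to the preference information. Which pairs to label is exactly the part this sketch leaves out.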


This review was generated by AI and checked by a human editor.