QUICK REVIEW

[Paper Review] An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity

Nir Ailon|arXiv (Cornell University)|Oct 30, 2010

Machine Learning and Algorithms | References: 44 | Cited by: 79

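One-Sentence Summary

This paper proposes an active learning algorithm for ranking from pairwise preferences that achieves almost optimal query complexity by reducing the original problem to a simpler one solvable by any empirical risk minimization (ERM) black box. The method adaptively samples only $O(n\operatorname{polylog}(n,\varepsilon^{-1}))$ pairwise labels while guaranteeing that the loss of the final solution is at most $(1+\varepsilon)$ times the optimal loss, a significant improvement over non-adaptive sampling strategies based on VC theory.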

ABSTRACT

We study the problem of learning to rank from pairwise preferences, and solve a long-standing open problem that has led to development of many heuristics but no provable results for our particular problem. Given a set $V$ of $n$ elements, we wish to linearly order them given pairwise preference labels. A pairwise preference label is obtained as a response, typically from a human, to the question "which is preferred, $u$ or $v$?" for two elements $u,v\in V$. We assume possible non-transitivity paradoxes which may arise naturally due to human mistakes or irrationality. The goal is to linearly order the elements from the most preferred to the least preferred, while disagreeing with as few pairwise preference labels as possible. Our performance is measured by two parameters: The loss and the query complexity (number of pairwise preference labels we obtain). This is a typical learning problem, with the exception that the space from which the pairwise preferences is drawn is finite, consisting of ${n\choose 2}$ possibilities only. We present an active learning algorithm for this problem, with query bounds significantly beating general (non active) bounds for the same error guarantee, while almost achieving the information theoretical lower bound. Our main construct is a decomposition of the input s.t. (i) each block incurs high loss at optimum, and (ii) the optimal solution respecting the decomposition is not much worse than the true opt. The decomposition is done by adapting a recent result by Kenyon and Schudy for a related combinatorial optimization problem to the query efficient setting. We thus settle an open problem posed by learning-to-rank theoreticians and practitioners: What is a provably correct way to sample preference labels? To further show the power and practicality of our solution, we show how to use it in concert with an SVM relaxation.
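
The loss described in the abstract (the number of pairwise labels a linear order disagrees with) can be made concrete with a small sketch. The function name `pairwise_loss` and the dictionary encoding of labels are assumptions chosen for illustration, not notation from the paper. The toy labels form a non-transitive cycle, so no order can reach zero loss.

```python
from itertools import permutations


def pairwise_loss(order, prefers):
    """Count the labeled pairs on which a linear order disagrees with the labels.

    order: sequence of elements, most preferred first.
    prefers: dict mapping (u, v) -> True if the label prefers u over v,
             False if it prefers v over u (hypothetical encoding).
    """
    position = {x: i for i, x in enumerate(order)}
    loss = 0
    for (u, v), u_preferred in prefers.items():
        order_ranks_u_first = position[u] < position[v]
        if order_ranks_u_first != u_preferred:
            loss += 1
    return loss


# A non-transitive cycle: a over b, b over c, but c over a.
labels = {("a", "b"): True, ("b", "c"): True, ("a", "c"): False}

loss_abc = pairwise_loss(["a", "b", "c"], labels)
# Brute force over all 3! orders; feasible only for tiny inputs.
best_loss = min(pairwise_loss(p, labels) for p in permutations("abc"))
```

Brute force is used here only to show that the cycle forces a positive optimum; it does not scale and is unrelated to how the paper's algorithm searches for a good order.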

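Motivation and Objectives

Resolve a long-standing open problem in learning to rank: sampling pairwise preference labels in a provably efficient way without degrading the loss bound.
Reduce the original ranking problem, which has $\binom{n}{2}$ possible pairwise comparisons, to a simpler problem that a standard ERM black box can solve.
Achieve a query complexity of $O(n\operatorname{polylog}(n,\varepsilon^{-1}))$ while keeping the loss within a factor of $(1+\varepsilon)$ of the optimum.
Prove that, for the same regret bound, adaptive sampling has significantly lower query complexity than non-adaptive sampling.
Provide a practical and provably correct active learning framework for ranking from pairwise preferences.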
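
The target guarantee can be stated compactly as follows. The symbols $W$, $C$, $\pi$, $\hat\pi$, and $\pi^*$ are assumptions introduced here for illustration and may differ from the paper's own notation. Let $W(v,u)=1$ if the label for the pair prefers $v$ over $u$ and $0$ otherwise, and write $u \prec_\pi v$ when the order $\pi$ ranks $u$ above $v$.

$$
C(\pi) = \sum_{(u,v)\,:\,u \prec_\pi v} W(v,u), \qquad \pi^* \in \arg\min_{\pi} C(\pi)
$$

$$
C(\hat\pi) \le (1+\varepsilon)\, C(\pi^*) \quad \text{using } O\big(n\operatorname{polylog}(n,\varepsilon^{-1})\big) \text{ of the } \binom{n}{2} \text{ possible labels}
$$

Here $\hat\pi$ is the returned order, and the inequality is meant to hold with high probability over the algorithm's random label queries.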

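Proposed Method

The algorithm converts the ranking problem into a different learning problem through a transformation that keeps the optimal loss within a factor of $(1+\varepsilon)$.
Labels are sampled adaptively: pairwise comparisons are chosen according to the uncertainty of the current model, so as to minimize the number of queries.
The reduction guarantees that applying any ERM black box to the reduced problem yields a solution whose loss is at most $(1+\varepsilon)$ times the optimal loss of the original problem.
The method adapts the polynomial-time approximation scheme (PTAS) of Kenyon and Schudy for minimum feedback arc set in tournaments (MFAST) to guide the reduction and the label selection.
A packing argument over non-transitive triangles gives a lower bound on the objective of the support vector machine (SVM) formulation.
Hoeffding's inequality is used to establish concentration bounds, ensuring that with high probability the empirical risk on the sampled subset approximates the true risk to within $O(\varepsilon F_2(w))$.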
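
The packing idea in the list above rests on a simple fact: every non-transitive (cyclic) triangle forces at least one disagreement in any linear order, so a collection of cyclic triangles that share no pair gives a lower bound on the optimal loss. The sketch below illustrates that fact with a greedy packing. It assumes, for illustration only, that a label is available for every pair, which the query-efficient algorithm does not require; the names `beats` and `greedy_triangle_packing` are hypothetical, and this is not the paper's SVM-specific argument.

```python
from itertools import combinations


def beats(prefers, u, v):
    """True if the label prefers u over v; labels are stored once per pair."""
    if (u, v) in prefers:
        return prefers[(u, v)]
    return not prefers[(v, u)]


def greedy_triangle_packing(elements, prefers):
    """Greedily collect cyclic triangles that share no pair of elements.

    The number of triangles returned is a lower bound on the loss of
    every linear order, because each triangle costs at least one
    disagreement and no pair is counted twice.
    """
    used_pairs = set()
    packed = []
    for a, b, c in combinations(elements, 3):
        pairs = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if any(p in used_pairs for p in pairs):
            continue
        forward = beats(prefers, a, b) and beats(prefers, b, c) and beats(prefers, c, a)
        backward = beats(prefers, b, a) and beats(prefers, c, b) and beats(prefers, a, c)
        if forward or backward:
            packed.append((a, b, c))
            used_pairs.update(pairs)
    return packed


# a, b, c form a cycle; d loses to everyone.
elements = ["a", "b", "c", "d"]
labels = {
    ("a", "b"): True, ("b", "c"): True, ("a", "c"): False,
    ("a", "d"): True, ("b", "d"): True, ("c", "d"): True,
}
packed = greedy_triangle_packing(elements, labels)
lower_bound = len(packed)
```

A greedy packing is not necessarily a maximum packing, so the bound it returns can be loose, but it is always valid.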

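Experimental Results

Research Questions

RQ1: Can we design an active learning algorithm for ranking from pairwise preferences that achieves almost optimal query complexity with a provable loss guarantee?
RQ2: Is adaptive sampling significantly more efficient than non-adaptive sampling for ranking from pairwise preferences?
RQ3: Can the original ranking problem be reduced to a simpler problem solvable by a standard ERM black box while preserving the loss bound?
RQ4: What is the minimum number of pairwise queries needed to obtain a $(1+\varepsilon)$-approximate solution to the ranking problem?
RQ5: At the same regret level, how do VC-dimension-based bounds compare with adaptive sampling in terms of query complexity?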

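Key Findings

For any $\varepsilon > 0$, the proposed algorithm achieves a query complexity of $O(n\operatorname{polylog}(n,\varepsilon^{-1}))$, which is almost optimal.
With high probability, the loss of the final ranking is at most $(1+\varepsilon)$ times the optimal loss.
After the reduction to a simpler form, any ERM black box can be used, and the resulting solution approximates the optimum of the original problem to within a factor of $(1+\varepsilon)$.
Non-adaptive sampling strategies based on VC dimension have significantly worse query complexity bounds at the same regret level.
The theoretical analysis shows that with $M = O(\varepsilon^{-6}(1+2c)^2 d \log(1/\varepsilon))$ samples, the empirical risk on the sampled subset approximates the true risk to within $O(\varepsilon F_2(w))$ with high probability.
Empirical validation with an SVM-based ERM black box shows that the reduced problem closely approximates the original SVM solution, with error within any desired range.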
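
The concentration step can be illustrated in its simplest form: estimating, from a uniform sample of pairs, the fraction of pairs on which a fixed order disagrees with the labels. This is a minimal sketch of plain two-sided Hoeffding with additive error `t`, assuming independent uniform draws and a hypothetical label oracle `query_label`. It is non-adaptive and much weaker than the relative $O(\varepsilon F_2(w))$ bound quoted above, so it should not be read as the paper's sampling scheme.

```python
import math
import random


def hoeffding_sample_size(t, delta):
    """Samples needed so that the mean of [0, 1]-valued draws is within t
    of its expectation with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * t * t))


def estimate_disagreement_rate(order, query_label, m, rng):
    """Estimate the fraction of pairs on which `order` disagrees with labels.

    order: list of elements, most preferred first.
    query_label(u, v): hypothetical oracle, True if u is preferred over v.
    m: number of independently drawn pairs to query.
    """
    position = {x: i for i, x in enumerate(order)}
    disagreements = 0
    for _ in range(m):
        u, v = rng.sample(order, 2)  # two distinct elements, chosen uniformly
        if query_label(u, v) != (position[u] < position[v]):
            disagreements += 1
    return disagreements / m


order = list(range(50))
m = hoeffding_sample_size(t=0.1, delta=0.05)
rng = random.Random(0)
# Oracle that agrees with the order on every pair.
consistent_rate = estimate_disagreement_rate(order, lambda u, v: u < v, m, rng)
# Oracle that disagrees with the order on every pair.
reversed_rate = estimate_disagreement_rate(order, lambda u, v: u > v, m, rng)
```

The required sample size depends only on `t` and `delta`, not on the number of elements, which is why sampling can stand in for querying all $\binom{n}{2}$ pairs when an additive error is acceptable.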


This review was generated by AI and reviewed by human editors.