QUICK REVIEW

[Paper Review] Policy Learning for Fairness in Ranking

Ashudeep Singh, Thorsten Joachims|arXiv (Cornell University)|Feb 11, 2019

Privacy-Preserving Technologies in Data | References: 44 | Cited by: 41

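One-Sentence Summary

This paper proposes Fair-PG-Rank, a policy-gradient framework for learning stochastic ranking policies that maximize user utility while enforcing merit-based fairness of exposure in rankings.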

ABSTRACT

Conventional Learning-to-Rank (LTR) methods optimize the utility of the rankings to the users, but they are oblivious to their impact on the ranked items. However, there has been a growing understanding that the latter is important to consider for a wide range of ranking applications (e.g. online marketplaces, job placement, admissions). To address this need, we propose a general LTR framework that can optimize a wide range of utility metrics (e.g. NDCG) while satisfying fairness of exposure constraints with respect to the items. This framework expands the class of learnable ranking functions to stochastic ranking policies, which provides a language for rigorously expressing fairness specifications. Furthermore, we provide a new LTR algorithm called Fair-PG-Rank for directly searching the space of fair ranking policies via a policy-gradient approach. Beyond the theoretical evidence in deriving the framework and the algorithm, we provide empirical results on simulated and real-world datasets verifying the effectiveness of the approach in individual and group-fairness settings.

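Motivation and Objectives

Introduce a framework for learning ranking policies under fairness-of-exposure constraints.
Allow exposure to be allocated explicitly in proportion to merit within a ranking.
Develop a tractable policy-gradient algorithm (Fair-PG-Rank) that optimizes utility and fairness jointly.
Provide empirical evidence that the approach can detect and mitigate bias during learning.
Demonstrate effectiveness for both individual and group fairness on synthetic and real-world datasets.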

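Proposed Method

Formulate fair LTR as empirical risk minimization (ERM) over stochastic ranking policies subject to exposure-based fairness constraints.
Define exposure, position bias, and merit-based proportional-exposure constraints.
Adopt the Fairness of Exposure in Rankings framework to model disparity for individual and group fairness.
Implement the ranking policy as a Plackett-Luce model over a differentiable scoring function.
Derive policy-gradient (REINFORCE) updates that optimize both the utility and the disparity terms.
Reduce variance with a baseline and add entropy regularization to stabilize learning.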
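
The Plackett-Luce policy and the REINFORCE update named in the list above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a linear scorer `s = X @ w`, uses DCG as the utility, and covers only the utility term with a mean-utility baseline (the disparity term and entropy regularization are omitted). All function names are hypothetical.

```python
import numpy as np

def sample_ranking(scores, rng):
    """Sample a ranking from a Plackett-Luce model via the Gumbel-max trick."""
    gumbel = rng.gumbel(size=scores.shape)
    return np.argsort(-(scores + gumbel))

def log_prob(scores, ranking):
    """Log-probability of `ranking` under Plackett-Luce with the given scores."""
    s = scores[ranking]
    # suffix_lse[i] = logsumexp(s[i:]): the item placed at position i competes
    # against every item that has not been placed yet.
    suffix_lse = np.logaddexp.accumulate(s[::-1])[::-1]
    return float(np.sum(s - suffix_lse))

def grad_log_prob(w, X, ranking):
    """Gradient of log P(ranking) w.r.t. w for the assumed linear scorer X @ w."""
    Xr = X[ranking]
    s = Xr @ w
    grad = np.zeros_like(w)
    for i in range(len(ranking)):
        rest = s[i:]
        p = np.exp(rest - rest.max())
        p /= p.sum()                   # softmax over the remaining items
        grad += Xr[i] - p @ Xr[i:]     # chosen features minus expected features
    return grad

def dcg(relevance, ranking):
    """DCG utility of a ranking (assumed gain 2^rel - 1, log2 discount)."""
    gains = 2.0 ** relevance[ranking] - 1.0
    discounts = 1.0 / np.log2(np.arange(2, len(ranking) + 2))
    return float(gains @ discounts)

def reinforce_gradient(w, X, relevance, rng, n_samples=32):
    """Monte Carlo policy-gradient estimate of d E[DCG] / d w with a mean baseline."""
    scores = X @ w
    rankings = [sample_ranking(scores, rng) for _ in range(n_samples)]
    utils = np.array([dcg(relevance, r) for r in rankings])
    baseline = utils.mean()            # variance reduction
    grads = [(u - baseline) * grad_log_prob(w, X, r)
             for u, r in zip(utils, rankings)]
    return np.mean(grads, axis=0)
```

A gradient-ascent step would then be `w += lr * reinforce_gradient(w, X, relevance, rng)`, where `lr` is a step size chosen by the user.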

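Experimental Results

Research Questions

RQ1: Can PG-Rank learn ranking policies that maximize user utility while satisfying fairness constraints?
RQ2: Can Fair-PG-Rank policies trade off NDCG against fairness of exposure effectively by adjusting lambda, and do they perform well on both synthetic and real-world data?
RQ3: Can the method identify and neutralize biased features during learning?
RQ4: How does Fair-PG-Rank perform in individual-level and group-level fairness settings?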

Key Findings

Method	NDCG@10	ERR
RankSVM Joachims (2006)	0.75924	0.43680
GBDT Ye et al. (2009)	0.79013	0.46201
PG-Rank (Linear model)	0.76145	0.44988
PG-Rank (Neural Network)	0.77082	0.45440
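
For readers unfamiliar with the two metrics in the table, the following sketch shows one common way to compute them. It assumes the gain 2^rel - 1 with a log2 discount for NDCG, and the cascade-model definition of ERR with a maximum relevance grade of 4. The exact variants used in the paper are not stated in this review, so treat these as illustrative definitions with hypothetical function names.

```python
import numpy as np

def dcg_at_k(rels, k):
    """DCG over the top-k relevance grades, listed in ranked order."""
    rels = np.asarray(rels, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rels.size + 2))
    return float(np.sum((2.0 ** rels - 1.0) / discounts))

def ndcg_at_k(rels, k):
    """DCG@k normalized by the DCG@k of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def err(rels, max_grade=4):
    """Expected Reciprocal Rank under a cascade user model (max_grade is assumed)."""
    rels = np.asarray(rels, dtype=float)
    stop = (2.0 ** rels - 1.0) / (2.0 ** max_grade)   # P(user is satisfied at rank r)
    # Probability that the user was not satisfied at any earlier rank.
    reached = np.concatenate(([1.0], np.cumprod(1.0 - stop)[:-1]))
    ranks = np.arange(1, rels.size + 1)
    return float(np.sum(stop * reached / ranks))
```

Both functions take the relevance grades of one query's documents in the order the ranker placed them; a dataset-level score is the mean over queries.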

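On the Yahoo! data, PG-Rank achieves NDCG and ERR competitive with baseline LTR methods: both variants score above RankSVM and below GBDT on both metrics.
By varying lambda, Fair-PG-Rank trades off utility against fairness, reducing disparity while keeping NDCG high.
In synthetic experiments, Fair-PG-Rank learns to down-weight a biased feature, improving group fairness without sacrificing utility.
On the German Credit dataset, Fair-PG-Rank achieves an effective balance between NDCG and group-fairness disparity across multiple runs.
The method demonstrates the ability to identify and mitigate biased attributes during learning, unlike some post-processing baselines.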
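
To make the lambda trade-off concrete, here is a minimal sketch of how a group-level exposure disparity and the penalized objective could be computed from sampled rankings. The specific choices are assumptions rather than details confirmed by this review: position bias 1/log2(1 + rank), group merit as mean relevance (assumed strictly positive), and a hinge that penalizes only the case where the higher-merit group receives more exposure per unit of merit. All names are hypothetical.

```python
import numpy as np

def position_bias(n):
    """Assumed examination probability v_j = 1 / log2(1 + j) for ranks j = 1..n."""
    return 1.0 / np.log2(np.arange(2, n + 2))

def item_exposure(rankings):
    """Average exposure per item over sampled rankings (each a permutation of item ids)."""
    rankings = np.asarray(rankings)
    n = rankings.shape[1]
    v = position_bias(n)
    exposure = np.zeros(n)
    for r in rankings:
        exposure[r] += v               # item r[j] sits at rank j + 1
    return exposure / len(rankings)

def group_disparity(exposure, merit, group):
    """Hinge on the exposure-per-merit gap between two groups (group is a 0/1 array)."""
    exposure, merit, group = map(np.asarray, (exposure, merit, group))
    exp_g = [exposure[group == g].mean() for g in (0, 1)]
    mer_g = [merit[group == g].mean() for g in (0, 1)]   # assumed > 0
    hi, lo = (0, 1) if mer_g[0] >= mer_g[1] else (1, 0)
    return float(max(0.0, exp_g[hi] / mer_g[hi] - exp_g[lo] / mer_g[lo]))

def penalized_objective(utility, disparity, lam):
    """Quantity to maximize: utility minus lambda times disparity."""
    return utility - lam * disparity
```

With `lam = 0` the objective reduces to plain utility maximization; larger values weight the disparity penalty more heavily.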


This review was generated by AI and reviewed by human editors.