QUICK REVIEW

[Paper Digest] Contrastive Learning for Debiased Candidate Generation at Scale.

Chang Zhou, Jianxin Ma|arXiv (Cornell University)|May 20, 2020

Recommender Systems and Techniques | References: 5 | Cited by: 3

ONE-SENTENCE SUMMARY

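CLRec is a contrastive learning framework for large-scale candidate generation that reduces exposure bias: its contrastive loss is shown to be equivalent to inverse propensity scoring, and a fixed-size queue supplies negative samples efficiently. Deployed in Mobile Taobao, it substantially reduced the Matthew effect in item exposure over four months of online A/B testing.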

ABSTRACT

Deep candidate generation (DCG) that narrows down the collection of relevant items from billions to hundreds via representation learning is essential to large-scale recommender systems. Standard approaches approximate maximum likelihood estimation (MLE) through sampling for better scalability and address the problem of DCG in a way similar to language modeling. However, live recommender systems face severe unfairness of exposure with a vocabulary several orders of magnitude larger than that of natural language, implying that (1) MLE will preserve and even exacerbate the exposure bias in the long run in order to faithfully fit the observed samples, and (2) suboptimal sampling and inadequate use of item features can lead to inferior representations for the unfairly ignored items. In this paper, we introduce CLRec, a Contrastive Learning paradigm that has been successfully deployed in a real-world massive recommender system, to alleviate exposure bias in DCG. We theoretically prove that a popular choice of contrastive loss is equivalently reducing the exposure bias via inverse propensity scoring, which provides a new perspective on the effectiveness of contrastive learning. We further employ a fixed-size queue to store the items' representations computed in previously processed batches, and use the queue to serve as an effective sampler of negative examples. This queue-based design provides great efficiency in incorporating rich features of the thousand negative items per batch thanks to computation reuse. Extensive offline analyses and four-month online A/B tests in Mobile Taobao demonstrate substantial improvement, including a dramatic reduction in the Matthew effect.

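MOTIVATION AND OBJECTIVES

Address the severe exposure bias in deep candidate generation (DCG) for large-scale recommender systems, where the item vocabulary is several orders of magnitude larger than that of natural language.
Mitigate the harm that suboptimal sampling and inadequate use of item features do to the learned representations of rarely exposed items.
Design an efficient, scalable contrastive learning method that improves fairness and representation quality in DCG without sacrificing training efficiency.
Prove theoretically that contrastive learning reduces exposure bias in the same way inverse propensity scoring does.
Validate the method through extensive offline analyses and four months of online A/B testing in Mobile Taobao.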

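PROPOSED METHOD

CLRec adopts a contrastive learning paradigm whose contrastive loss is proven equivalent to reducing exposure bias via inverse propensity scoring.
A fixed-size queue stores item representations computed in previously processed batches and serves as a dynamic, efficient source of negative samples.
Because queued representations are reused rather than recomputed, the rich features of on the order of a thousand negative items per batch can be incorporated at little extra cost.
The method learns embeddings for both positive and negative items through representation learning, which makes candidate scoring fairer to rarely exposed items.
The framework is designed to scale to production systems such as Mobile Taobao, where it has been deployed.
The contrastive objective pulls the user representation toward the positive item in the embedding space and pushes it away from the negative items.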
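
The queue-plus-contrastive-loss idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `NegativeQueue` and `contrastive_loss`, the dot-product score, and the `temperature` parameter are all hypothetical choices made for the example, and real training would use an autodiff framework rather than plain arrays.

```python
from collections import deque

import numpy as np


class NegativeQueue:
    """Fixed-size FIFO queue of item embeddings from earlier batches (hypothetical helper)."""

    def __init__(self, max_items: int):
        # deque with maxlen evicts the oldest rows automatically once full
        self._items = deque(maxlen=max_items)

    def __len__(self) -> int:
        return len(self._items)

    def push(self, item_embs: np.ndarray) -> None:
        """Store the item embeddings of the batch that was just processed."""
        for row in item_embs:
            self._items.append(row.copy())

    def as_array(self, dim: int) -> np.ndarray:
        """Return queued embeddings as a (K, dim) array; K is 0 while the queue is empty."""
        if not self._items:
            return np.zeros((0, dim))
        return np.stack(self._items)


def contrastive_loss(user_embs, pos_item_embs, neg_item_embs, temperature=0.1):
    """Softmax cross-entropy with the positive item in column 0 and queued negatives after it."""
    pos_logits = np.sum(user_embs * pos_item_embs, axis=1, keepdims=True)  # (B, 1)
    neg_logits = user_embs @ neg_item_embs.T                               # (B, K)
    logits = np.concatenate([pos_logits, neg_logits], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())


# Toy loop: each batch is scored against the queue, then its items are enqueued.
rng = np.random.default_rng(0)
dim, batch_size = 8, 4
queue = NegativeQueue(max_items=10)
for _ in range(5):
    users = rng.normal(size=(batch_size, dim))
    items = rng.normal(size=(batch_size, dim))
    loss = contrastive_loss(users, items, queue.as_array(dim))
    queue.push(items)  # reused as negatives by later batches, no recomputation
```

The point of the design is that the negatives cost nothing extra to encode: their embeddings were already computed when they served as positives in an earlier batch.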

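EXPERIMENTAL RESULTS

Research Questions

RQ1: How can contrastive learning be applied effectively to reduce exposure bias in large-scale candidate generation over a massive item vocabulary?
RQ2: What is the theoretical relationship between contrastive learning and inverse propensity scoring in mitigating exposure bias?
RQ3: How can a fixed-size queue sample negatives efficiently and effectively at scale without degrading training performance?
RQ4: To what extent does CLRec reduce the Matthew effect in item exposure compared with standard MLE-based DCG methods?
RQ5: Can the proposed method deliver measurable gains in online recommendation performance while remaining computationally efficient?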
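
To make RQ4 concrete, exposure concentration needs a number. The digest does not say which statistic the paper uses for the Matthew effect, so the sketch below assumes one common choice, the Gini coefficient of per-item exposure counts; the function name `exposure_gini` is hypothetical.

```python
import numpy as np


def exposure_gini(exposure_counts):
    """Gini coefficient of per-item exposure: 0 means equal exposure, values near 1 mean a few items dominate."""
    counts = np.sort(np.asarray(exposure_counts, dtype=float))  # ascending order
    n = counts.size
    total = counts.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard closed form for sorted data: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n
    return float(2.0 * np.sum(ranks * counts) / (n * total) - (n + 1.0) / n)


uniform = exposure_gini([1, 1, 1, 1])   # every item shown equally often
skewed = exposure_gini([0, 0, 0, 10])   # one item takes all exposure
```

Under this assumed metric, a weaker Matthew effect would show up as a lower coefficient for the treatment bucket than for the control bucket of an A/B test.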

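Key Findings

The contrastive loss used in CLRec is theoretically equivalent to applying inverse propensity scoring, which gives a principled explanation for why the method reduces exposure bias.
The queue-based negative sampling mechanism reuses computation, so the rich features of on the order of a thousand negative items per batch can be incorporated at little extra cost.
Offline analyses show that CLRec improves the representation quality of rarely exposed items and narrows the gap in embedding quality between popular and unpopular items.
Four months of online A/B testing in Mobile Taobao show a substantial reduction in the Matthew effect, indicating a fairer distribution of item exposure.
CLRec also delivers measurable gains in recommendation performance, confirming its effectiveness in a real deployment.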
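
The first finding can be made concrete with a short derivation sketch. It follows the standard analysis of softmax-style contrastive losses rather than reproducing the paper's proof. The symbols are notation assumed here: $x$ is the user context, $y$ the clicked item, $\phi_\theta$ the score, $q$ the distribution negatives are drawn from, and $L$ the number of negatives.

```latex
% Contrastive loss with one positive y and L negatives y_1, ..., y_L drawn from q
\mathcal{L}_{\mathrm{CL}}(\theta)
  = -\,\mathbb{E}_{(x,y)\sim p_{\mathrm{data}}}\;
      \mathbb{E}_{y_1,\dots,y_L\sim q}
      \left[\log\frac{\exp\phi_\theta(x,y)}
                     {\exp\phi_\theta(x,y)+\sum_{i=1}^{L}\exp\phi_\theta(x,y_i)}\right]

% In the standard analysis, the minimizer satisfies, up to a factor that depends only on x,
\exp\phi_{\theta^\ast}(x,y)\;\propto\;\frac{p_{\mathrm{data}}(y\mid x)}{q(y)}

% so the induced softmax over items is the propensity-reweighted target
p_{\theta^\ast}(y\mid x)
  = \frac{p_{\mathrm{data}}(y\mid x)\,/\,q(y)}
         {\sum_{y'} p_{\mathrm{data}}(y'\mid x)\,/\,q(y')}
```

Read this way, $q$ plays the role of the propensity: items that appear more often as negatives are down-weighted by $1/q(y)$, which is the correction inverse propensity scoring applies.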


This digest was generated by AI and reviewed by human editors.