QUICK REVIEW

[Paper Review] The Benefits of Diversity: Combining Comparisons and Ratings for Efficient Scoring

Julien Fageot, Matthias Grossglauser|SPIRE - Sciences Po Institutional REpository|Feb 8, 2026

Ethics and Social Impacts of AI | Cited by 0

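One-Sentence Summary

SCoRa combines comparisons and ratings in a single probabilistic model to learn item scores; the paper proves that its MAP estimator is well-behaved and shows that mixed signals outperform single-modality methods in some settings.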

ABSTRACT

Should humans be asked to evaluate entities individually or comparatively? This question has been the subject of long debates. In this work, we show that, interestingly, combining both forms of preference elicitation can outperform the focus on a single kind. More specifically, we introduce SCoRa (Scoring from Comparisons and Ratings), a unified probabilistic model that allows to learn from both signals. We prove that the MAP estimator of SCoRa is well-behaved. It verifies monotonicity and robustness guarantees. We then empirically show that SCoRa recovers accurate scores, even under model mismatch. Most interestingly, we identify a realistic setting where combining comparisons and ratings outperforms using either one alone, and when the accurate ordering of top entities is critical. Given the de facto availability of signals of multiple forms, SCoRa additionally offers a versatile foundation for preference learning.

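Motivation and Goals

Motivate learning user preferences and preference scores through applications such as recommendation and alignment.
Examine the trade-off between direct ratings and pairwise comparisons.
Propose a unified model (SCoRa) that integrates both signals through embeddings.
Establish theoretical properties (monotonicity, robustness) and guarantees for the model's MAP estimator.
Show empirically that mixed signals can outperform single-signal elicitation in some settings.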

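Proposed Method

Define the SCoRa model with embeddings x, latent parameters β, and a rating threshold theta0.
Model comparisons with a generalized Bradley–Terry (GBT) root law f, and ratings with a GBT root law g.
Derive the MAP estimator by minimizing the negative log-posterior, which yields a convex loss L.
Prove that the loss is strongly convex, which guarantees a unique MAP solution.
Place SCoRa within the flexible GBT framework, which allows a variety of root laws and extended embeddings.
Prove monotonicity and Lipschitz robustness properties of the MAP estimate.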
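
The MAP step above can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions not stated in this review: each item has its own scalar score (a one-hot embedding), both root laws f and g are uniform on {-1, +1} so each data term is log cosh(d) - r*d, a rating is treated as a comparison of the item against a fixed threshold theta0, and the prior is Gaussian with precision lam. The function names and the toy data are hypothetical.

```python
import math

def scora_like_loss_grad(theta, comparisons, ratings, theta0=0.0, lam=1.0):
    """Return (loss, gradient) of the assumed negative log-posterior."""
    # Gaussian prior term (assumption): makes the loss lam-strongly convex.
    loss = 0.5 * lam * sum(t * t for t in theta)
    grad = [lam * t for t in theta]
    # Comparison (a, b, r): r = +1 means a is preferred to b, r = -1 the opposite.
    for a, b, r in comparisons:
        d = theta[a] - theta[b]
        loss += math.log(math.cosh(d)) - r * d
        g = math.tanh(d) - r
        grad[a] += g
        grad[b] -= g
    # Rating (a, r): r = +1 / -1, modeled as item a versus the threshold theta0.
    for a, r in ratings:
        d = theta[a] - theta0
        loss += math.log(math.cosh(d)) - r * d
        grad[a] += math.tanh(d) - r
    return loss, grad

def map_estimate(n_items, comparisons, ratings, theta0=0.0, lam=1.0, n_steps=5000):
    """Plain gradient descent on the strongly convex loss."""
    # Crude upper bound on the gradient's Lipschitz constant:
    # each comparison term adds at most 2, each rating term at most 1.
    smooth = lam + 2 * len(comparisons) + len(ratings)
    step = 1.0 / smooth
    theta = [0.0] * n_items
    for _ in range(n_steps):
        _, grad = scora_like_loss_grad(theta, comparisons, ratings, theta0, lam)
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta

# Toy data (hypothetical): item 0 is preferred to items 1 and 2, item 1 to item 2;
# item 0 gets a positive rating and item 2 a negative one.
comparisons = [(0, 1, +1), (0, 2, +1), (1, 2, +1), (1, 0, -1)]
ratings = [(0, +1), (2, -1)]
theta_hat = map_estimate(3, comparisons, ratings)
print(theta_hat)
```

On this toy data the recovered scores are ordered item 0 > item 1 > item 2, and because the loss is strongly convex the result does not depend on the starting point. Other root laws or a learned theta0 would change the per-term loss but not the overall structure of the sketch.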

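Experimental Results

Research Questions

RQ1: Under a fixed data budget, does combining ratings and comparisons achieve higher scoring accuracy than a single modality?
RQ2: What theoretical guarantees (monotonicity, robustness) does the SCoRa MAP estimator have, and how robust is it to model mismatch?
RQ3: How do the embeddings and the baseline threshold theta0 affect scoring and learning from mixed signals?
RQ4: In which practical settings (e.g., when top entities matter most, or under active learning) does the mixed approach bring the largest gains?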

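Key Findings

The SCoRa loss is strongly convex, so the MAP estimator is unique and optimization is well-behaved.
The conditional moments match the GBT moment structure, with a function Phi linking embeddings to latent scores.
SCoRa exhibits pairwise monotonicity and Lipschitz robustness to data edits; robustness increases with regularization.
Under model mismatch, scores are still recovered accurately from either modality alone or from their mixture.
In a realistic setting where the accuracy of top entities and active learning are critical, combining ratings and comparisons outperforms single-modality elicitation.
The framework accommodates multiple signal types and embeddings, providing a versatile foundation for preference learning.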
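
Why robustness should increase with regularization can be seen from a standard strong-convexity argument, sketched here under assumptions that are not taken from the paper: the loss has the form F(θ) = (λ/2)‖θ‖² + Σ_k ℓ_k(θ), each data term ℓ_k is convex and differentiable, and its gradient norm is bounded by a constant G. The symbols λ, G, and ℓ_k are illustrative, and the paper's exact statement and constants may differ.

```latex
% \theta^\star minimizes F; \theta' minimizes F + \ell, where \ell is one added data term.
\begin{align*}
\nabla F(\theta') + \nabla \ell(\theta') &= 0, \qquad \nabla F(\theta^\star) = 0, \\
\lambda \,\|\theta' - \theta^\star\|^2
  &\le \langle \nabla F(\theta') - \nabla F(\theta^\star),\; \theta' - \theta^\star \rangle
   = \langle -\nabla \ell(\theta'),\; \theta' - \theta^\star \rangle \\
  &\le G \,\|\theta' - \theta^\star\|, \\
\|\theta' - \theta^\star\| &\le \frac{G}{\lambda}.
\end{align*}
```

Under these assumptions a single added or edited data point moves the minimizer by at most G/λ, so a larger λ gives a tighter bound. The same strong convexity is what makes the minimizer unique.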

This review was generated by AI and reviewed by a human editor.