QUICK REVIEW

[Paper Review] Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

Shijun Li, Wenqiang Lei | arXiv (Cornell University) | May 23, 2020
Advanced Bandit Algorithms Research | References: 55 | Cited by: 33
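One-Sentence Summary

This paper proposes ConTS, a Conversational Thompson Sampling framework that places attribute questions and item recommendations in a single arm space, so that the exploration-exploitation trade-off in cold-start conversational recommendation is optimized jointly. On multiple datasets it outperforms state-of-the-art conversational recommender system (CRS) methods.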

ABSTRACT

Static recommendation methods like collaborative filtering suffer from the inherent limitation of performing real-time personalization for cold-start users. Online recommendation, e.g., multi-armed bandit approach, addresses this limitation by interactively exploring user preference online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider the items as the arms, being incapable of handling the item attributes, which naturally provide interpretable information of user's current demands and can effectively filter out undesired items. In this work, we consider the conversational recommendation for cold-start users, where a system can both ask the attributes from and recommend items to a user interactively. This important scenario was studied in a recent work. However, it employs a hand-crafted function to decide when to ask attributes or make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system highly rely on the choice of the hand-crafted function, thus introducing fragility to the system. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-offs automatically using the framework of Thompson Sampling. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) and Estimation-Action-Reflection model in both metrics of success rate and average number of conversation turns.

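Motivation and Objectives

  • Address cold-start conversational recommendation by achieving real-time personalization through interactive attribute questions.
  • Unify attributes and items in a single arm space to simplify decision-making and improve robustness.
  • Use contextual Thompson Sampling to balance exploration and exploitation naturally.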

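Proposed Method

  • Treat attributes and items as undifferentiated arms in the same arm space, and select arms with a unified reward.
  • Initialize the user embedding from existing users, and update the posterior parameters during the interaction.
  • Sample the user embedding with contextual Thompson Sampling, and choose the arm with the maximal reward.
  • Define an arm's reward as a combination of user-arm affinity and attribute compatibility, which guides both asking and recommending actions.
  • Train offline FM embeddings for attributes and items with Bayesian Personalized Ranking, placing all arms in a shared embedding space.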
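
The selection step in the list above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the arm labels, and the two-dimensional toy embeddings are hypothetical, and the posterior is simplified to an independent Gaussian per dimension, whereas contextual Thompson Sampling would normally keep a full covariance matrix.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sample_user_embedding(mean, std, rng):
    # Assumption: diagonal Gaussian posterior, one standard deviation per dimension.
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def arm_reward(user_vec, arm_vec, liked_attr_vecs):
    # User-arm affinity plus compatibility with attributes the user already confirmed.
    affinity = dot(user_vec, arm_vec)
    compatibility = sum(dot(p, arm_vec) for p in liked_attr_vecs)
    return affinity + compatibility


def choose_arm(arms, mean, std, liked_attr_vecs, rng):
    """Sample a user embedding, then return (name, kind) of the highest-reward arm.

    `arms` maps a name to (kind, embedding), where kind is "attribute" or "item".
    Both kinds compete in the same arg-max, so no separate ask-or-recommend rule is needed.
    """
    user_vec = sample_user_embedding(mean, std, rng)
    best = max(arms, key=lambda name: arm_reward(user_vec, arms[name][1], liked_attr_vecs))
    return best, arms[best][0]


# Hypothetical toy arm space: one attribute arm and one item arm.
ARMS = {
    "attr:jazz": ("attribute", [0.9, 0.1]),
    "item:song_1": ("item", [0.5, 0.5]),
}

if __name__ == "__main__":
    rng = random.Random(0)
    name, kind = choose_arm(ARMS, mean=[1.0, 0.0], std=[0.3, 0.3], liked_attr_vecs=[], rng=rng)
    action = "ask about" if kind == "attribute" else "recommend"
    print(action, name)
```

Whether the system asks or recommends falls out of which kind of arm wins the arg-max. The sampling noise in `sample_user_embedding` is what supplies exploration.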

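Experimental Results

Research Questions

  • RQ1: Can the unified arm-space approach effectively balance asking about attributes and recommending items for cold-start users?
  • RQ2: Does contextual Thompson Sampling provide a natural exploration-exploitation balance in a CRS that unifies attributes and items?
  • RQ3: How does ConTS compare with existing CRS methods (ConUCB, EAR) in success rate and conversation efficiency?
  • RQ4: How does attribute-based feedback affect the updates to the user embedding and the arm rewards?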

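Key Findings

  • For cold-start users, ConTS outperforms the state-of-the-art CRS methods ConUCB and EAR in both success rate and average number of conversation turns.
  • Modeling attributes and items in a single arm space simplifies decision-making and removes the need for hand-crafted timing rules that decide when to ask and when to recommend.
  • Contextual Thompson Sampling provides a natural exploration-exploitation balance through posterior sampling and updating.
  • Experiments on Yelp, LastFM, and a new Kuaishou dataset demonstrate robustness across domains.
  • The update mechanism combines user feedback with the attributes the user is known to prefer in order to refine the arm rewards and the user embedding.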
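
The posterior update mentioned in the findings above can be illustrated with a standard Bayesian linear-regression update of the kind used in linear Thompson Sampling. This is a sketch under assumptions rather than the paper's exact rule: the class name `UserPosterior`, the unit prior precision, the +1/-1 reward values, and the way confirmed attributes enter as an offset on the regression target are all hypothetical choices made for illustration.

```python
class UserPosterior:
    """Gaussian posterior over a user embedding under a linear reward model."""

    def __init__(self, prior_mean):
        dim = len(prior_mean)
        # B^{-1} starts as the identity (unit prior precision, an assumption).
        self.b_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        # With B = I, f = B * prior_mean = prior_mean, so mean() initially returns the prior.
        self.f = list(prior_mean)

    def mean(self):
        """Posterior mean B^{-1} f."""
        return [sum(row[j] * self.f[j] for j in range(len(self.f))) for row in self.b_inv]

    def update(self, arm_vec, reward, liked_attr_vecs=()):
        """Fold in one piece of feedback on the played arm (attribute or item)."""
        # Assumption: the part of the reward explained by already-confirmed
        # attributes is subtracted, so only the remainder is attributed to the user embedding.
        offset = sum(sum(p_i * x_i for p_i, x_i in zip(p, arm_vec)) for p in liked_attr_vecs)
        target = reward - offset
        dim = len(arm_vec)
        # Sherman-Morrison: (B + x x^T)^{-1} = B^{-1} - (B^{-1}x)(B^{-1}x)^T / (1 + x^T B^{-1} x)
        bx = [sum(row[j] * arm_vec[j] for j in range(dim)) for row in self.b_inv]
        denom = 1.0 + sum(x_i * bx_i for x_i, bx_i in zip(arm_vec, bx))
        self.b_inv = [
            [self.b_inv[i][j] - bx[i] * bx[j] / denom for j in range(dim)]
            for i in range(dim)
        ]
        self.f = [f_i + target * x_i for f_i, x_i in zip(self.f, arm_vec)]


if __name__ == "__main__":
    posterior = UserPosterior([0.0, 0.0])
    posterior.update([1.0, 0.0], reward=1.0)   # hypothetical: user accepted this arm
    posterior.update([0.0, 1.0], reward=-1.0)  # hypothetical: user rejected this arm
    print(posterior.mean())
```

Because attributes and items share one embedding space, the same `update` call serves both kinds of feedback. The posterior mean moves toward accepted arms and away from rejected ones, and `b_inv` shrinks along directions that have already been explored.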


This review was generated by AI and reviewed by human editors.