QUICK REVIEW

[Paper Review] CoNRec: Context-Discerning Negative Recommendation with LLMs

Xinda Chen, Jiawei Wu | arXiv (Cornell University) | Jan 22, 2026
Recommender Systems and Techniques | Cited by 0
One-Sentence Summary

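CoNRec introduces an LLM-based framework that uses semantic IDs and progressive training to model and predict users' negative feedback, achieving state-of-the-art results on Taobao data.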

ABSTRACT

Understanding what users like is relatively straightforward; understanding what users dislike, however, remains a challenging and underexplored problem. Research into users' negative preferences has gained increasing importance in modern recommendation systems. Numerous platforms have introduced explicit negative feedback mechanisms and leverage such signals to refine their recommendation models. Beyond traditional business metrics, user experience-driven metrics, such as negative feedback rates, have become critical indicators for evaluating system performance. However, most existing approaches primarily use negative feedback as an auxiliary signal to enhance positive recommendations, paying little attention to directly modeling negative interests, which can be highly valuable in offline applications. Moreover, due to the inherent sparsity of negative feedback data, models often suffer from context understanding biases induced by positive feedback dominance. To address these challenges, we propose the first large language model framework for negative feedback modeling with specially designed context-discerning modules. We use semantic ID representation to replace text-based item descriptions and introduce an item-level alignment task that enhances the LLM's understanding of the semantic context behind negative feedback. Furthermore, we design a Progressive GRPO training paradigm that enables the model to dynamically balance the positive and negative behavioral context utilization. Besides, our investigation further reveals a fundamental misalignment between the conventional next-negative-item prediction objective and users' true negative preferences, which is heavily influenced by the system's recommendation order. To mitigate this, we propose a novel reward function and evaluation metric grounded in multi-day future negative feedback and their collaborative signals.

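Research Motivation and Goals

  • Motivate and address the sparsity and misalignment problems of negative feedback in recommendation.
  • Develop a framework that uses LLMs and semantic IDs to model users' negative interests accurately.
  • Mitigate the biases introduced by positive-feedback dominance and by the next-item prediction objective.
  • Provide offline, scalable negative-feedback filtering suitable for industrial deployment.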

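Proposed Method

  • Represent items as semantic IDs, using multimodal encoding and a residual-quantized variational autoencoder (RQ-VAE) to compress item information.
  • Add an item-level alignment task, fine-tuned with LoRA, so that the LLM adapts to negative-feedback semantics without requiring long user histories.
  • Use Progressive Group Relative Policy Optimization (GRPO) with an unbiased reward to introduce context gradually and balance positive and negative signals.
  • Extend the training objective to future negative feedback and collaborative negative-feedback signals so that it aligns with what users truly dislike.
  • Reconstruct embeddings offline from the codebook to filter out overly similar or dispreferred items in real deployments.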
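The first and last items in the list above can be illustrated with plain NumPy. This is a minimal sketch of residual quantization, not the paper's implementation: the codebooks here are random rather than learned by an RQ-VAE, and the function names (`encode_semantic_id`, `reconstruct`), the number of levels, the codebook size, and the similarity threshold are all assumptions made for the example.

```python
import numpy as np

def encode_semantic_id(embedding, codebooks):
    """Greedy residual quantization: pick one code index per codebook level."""
    residual = np.asarray(embedding, dtype=float).copy()
    codes = []
    for codebook in codebooks:  # each codebook has shape (num_codes, dim)
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        residual = residual - codebook[index]  # quantize what is left over
    return codes

def reconstruct(codes, codebooks):
    """Approximate embedding: the sum of the selected code vectors."""
    return sum(codebook[i] for codebook, i in zip(codebooks, codes))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(0)
dim, levels, num_codes = 8, 3, 16  # hypothetical sizes
# Random stand-ins for learned codebooks; deeper levels use smaller vectors.
codebooks = [rng.normal(size=(num_codes, dim)) / (2 ** level)
             for level in range(levels)]

item_embedding = rng.normal(size=dim)  # stand-in for a multimodal item embedding
semantic_id = encode_semantic_id(item_embedding, codebooks)
approx = reconstruct(semantic_id, codebooks)

# Offline filtering sketch: drop candidates too close to a predicted negative interest.
predicted_negative = approx
candidates = rng.normal(size=(5, dim))
threshold = 0.8  # assumed similarity cut-off
kept = [i for i, c in enumerate(candidates)
        if cosine(c, predicted_negative) < threshold]
```

Each item becomes a short tuple of integers (`semantic_id`) that an LLM can treat as tokens, and because the codebooks are fixed, the embedding can be rebuilt offline from those integers alone.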
Figure 1: User Negative-Interest Modeling (icon generated by Doubao): For a user who dislikes bulky footwear and wired audio (A, C, E in bold), rule-based methods lead to over-suppression (red box represents wrong results) while traditional models perform poorly on cold-start items like bulky slippe

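Experimental Results

Research Questions

  • RQ1: How can an LLM be adapted to model users' negative preferences beyond simply inverting positive signals?
  • RQ2: Which mechanisms (context, alignment, and reward) best capture the fine-grained factors that drive negative feedback?
  • RQ3: Is negative-feedback modeling robust when positive interactions dominate and data is sparse?
  • RQ4: How much do future signals and collaborative signals help in predicting negative items?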

Key Findings

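| Model | HR@20 | FHR@20 | LUF@20 | LIF@20 | Candidate Accuracy |
| --- | --- | --- | --- | --- | --- |
| Caser | 0.0098 | 0.0128 | 0.0085 | 0.0135 | N/A |
| SASRec | 0.0180 | 0.0262 | 0.0169 | 0.0280 | N/A |
| BERT4Rec | 0.0186 | 0.0260 | 0.0173 | 0.0311 | N/A |
| FDSA | 0.0284 | 0.0374 | 0.0232 | 0.0362 | N/A |
| S³-Rec | 0.0268 | 0.0329 | 0.0206 | 0.0382 | N/A |
| P5-CID | 0.0262 | 0.0381 | 0.0220 | 0.0356 | N/A |
| TIGER | 0.0264 | 0.0388 | 0.0232 | 0.0360 | N/A |
| TALLRec | N/A | N/A | N/A | N/A | 0.2686 |
| InstructRec | N/A | N/A | N/A | N/A | 0.3453 |
| LC-Rec (Neg.&Pos.) | 0.0159 | 0.0381 | 0.0199 | 0.0351 | 0.1333 |
| LC-Rec (Neg. Only) | 0.0296 | 0.0385 | 0.0258 | 0.0397 | 0.2892 |
| CoNRec | 0.0330 | 0.0441 | 0.0297 | 0.0496 | 0.6950 |
| Improv. | +11.5% | +13.7% | +15.1% | +24.9% | +101.3% |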
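The Improv. row appears to be CoNRec's relative gain over the best baseline in each column. The short check below assumes that definition and recomputes the percentages from the table values; the dictionary and function names are illustrative only.

```python
# Best baseline score per metric, copied from the table above.
best_baseline = {
    "HR@20": 0.0296,               # LC-Rec (Neg. Only)
    "FHR@20": 0.0388,              # TIGER
    "LUF@20": 0.0258,              # LC-Rec (Neg. Only)
    "LIF@20": 0.0397,              # LC-Rec (Neg. Only)
    "Candidate Accuracy": 0.3453,  # InstructRec
}
conrec = {
    "HR@20": 0.0330,
    "FHR@20": 0.0441,
    "LUF@20": 0.0297,
    "LIF@20": 0.0496,
    "Candidate Accuracy": 0.6950,
}

def relative_improvement(new, old):
    """Percentage gain of `new` over `old`."""
    return (new - old) / old * 100.0

improvements = {
    metric: round(relative_improvement(conrec[metric], best_baseline[metric]), 1)
    for metric in conrec
}
print(improvements)
```

Under this assumption the recomputed values agree with the reported row, which suggests the gains are measured against the strongest baseline per metric rather than against a single fixed model.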
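  • CoNRec achieves state-of-the-art performance on Taobao data, improving on the strongest baseline by 11.5% on HR@20 and 13.7% on FHR@20.
  • Adding item-level alignment and progressive context yields meaningful improvements on both generative and discriminative metrics.
  • Extending the ground-truth labels to a 7-day window and including highly collaborative items significantly improves coverage of users' top five negative interests.
  • The reward design that rewards future negative feedback and penalizes future positive feedback achieves the best overall performance.
  • CoNRec shows strong offline filtering capability suitable for online deployment, with a candidate accuracy of 0.6950, roughly double the best prior method (0.3453 for InstructRec).
  • Ablation studies show that the full model outperforms the baselines across multiple metrics and tasks, with a low forgetting rate when transferring between tasks.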
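The reward finding above can be illustrated with a toy example. The sketch assumes a simple reward (plus one for each generated item that appears in the user's future negative feedback, minus a penalty for each that appears in future positive feedback) and the group-relative normalization that GRPO is known for, where each sampled output's reward is standardized against the other samples drawn for the same prompt. The function names, reward weights, and item IDs are hypothetical, and the collaborative signals mentioned in the review are omitted.

```python
import statistics

def reward(generated, future_negative, future_positive, penalty=1.0):
    """Toy reward: credit future dislikes, penalize future likes (assumed form)."""
    hits = sum(1 for item in generated if item in future_negative)
    misses = sum(1 for item in generated if item in future_positive)
    return hits - penalty * misses

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; a choice made for this sketch
    return [(r - mean) / (std + eps) for r in rewards]

# Hypothetical feedback observed in a multi-day window after the prompt.
future_negative = {"item_a", "item_c"}
future_positive = {"item_b"}

# Four sampled generations for the same user context.
group = [
    ["item_a", "item_c"],  # two future dislikes
    ["item_a", "item_b"],  # one dislike, one like
    ["item_d"],            # unrelated item
    ["item_b"],            # a future like
]
rewards = [reward(g, future_negative, future_positive) for g in group]
advantages = group_relative_advantages(rewards)
```

Generations that name items the user later rejects receive positive advantages, while those that name items the user later engages with are pushed down, which is the behavior the penalty term is meant to encourage.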
(a) Proportion of Main Negative Interest among Latest Feedback

This review was generated by AI and reviewed by human editors.