QUICK REVIEW

[Paper Review] CoNRec: Context-Discerning Negative Recommendation with LLMs

Xinda Chen, Jiawei Wu|arXiv (Cornell University)|Jan 22, 2026

Recommender Systems and Techniques | Citations: 0

One-Line Summary

CoNRec introduces an LLM-based framework that uses semantic IDs and progressive training to model and predict users' negative feedback, achieving state-of-the-art results on Taobao data.

ABSTRACT

Understanding what users like is relatively straightforward; understanding what users dislike, however, remains a challenging and underexplored problem. Research into users' negative preferences has gained increasing importance in modern recommendation systems. Numerous platforms have introduced explicit negative feedback mechanisms and leverage such signals to refine their recommendation models. Beyond traditional business metrics, user experience-driven metrics, such as negative feedback rates, have become critical indicators for evaluating system performance. However, most existing approaches primarily use negative feedback as an auxiliary signal to enhance positive recommendations, paying little attention to directly modeling negative interests, which can be highly valuable in offline applications. Moreover, due to the inherent sparsity of negative feedback data, models often suffer from context understanding biases induced by positive feedback dominance. To address these challenges, we propose the first large language model framework for negative feedback modeling with specially designed context-discerning modules. We use semantic ID representation to replace text-based item descriptions and introduce an item-level alignment task that enhances the LLM's understanding of the semantic context behind negative feedback. Furthermore, we design a Progressive GRPO training paradigm that enables the model to dynamically balance the positive and negative behavioral context utilization. Besides, our investigation further reveals a fundamental misalignment between the conventional next-negative-item prediction objective and users' true negative preferences, which is heavily influenced by the system's recommendation order. To mitigate this, we propose a novel reward function and evaluation metric grounded in multi-day future negative feedback and their collaborative signals.

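Motivation and Objectives

Address the sparsity of negative feedback in recommendation and the mismatch between the next-negative-item objective and users' true negative preferences.
Develop a framework that uses LLMs and semantic IDs to model users' negative interests accurately.
Mitigate the biases that arise from positive-feedback dominance and from the next-item prediction objective.
Provide offline, scalable negative-feedback filtering suitable for industrial deployment.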

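Proposed Method

Represent items as semantic IDs using multimodal encoding and a residual-quantized VAE, which compresses item information.
Add an item-level alignment task, trained with LoRA-based fine-tuning, so that the LLM adapts to the semantics of negative feedback even without long user histories.
Introduce Progressive Group Relative Policy Optimization (GRPO), which incorporates context in stages and uses an unbiased reward to balance positive and negative signals.
Extend the training objective with future negative feedback signals and collaborative signals so that it aligns with users' true dislikes.
Reconstruct embeddings from the codebook to filter out, offline, items that are overly similar or disliked in real-world deployment.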
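
The semantic-ID and codebook-reconstruction steps above can be illustrated with a small sketch. It is not the authors' implementation: the codebooks are hand-written toy values (a real residual-quantized VAE learns them from multimodal item embeddings), and the names `CODEBOOKS`, `encode`, `reconstruct`, `filter_candidates`, and the 0.9 similarity threshold are all hypothetical.

```python
from math import sqrt

# Toy two-level codebooks (hypothetical values; a real RQ-VAE learns these).
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0]],  # level 1: coarse codewords
    [[0.1, 0.0], [0.0, 0.1]],  # level 2: codewords for the leftover residual
]


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def encode(codebooks, embedding):
    """Residual quantization: each level quantizes what the previous levels missed."""
    residual = list(embedding)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)  # this tuple plays the role of the semantic ID


def reconstruct(codebooks, codes):
    """Approximate the item embedding by summing the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def filter_candidates(codebooks, candidates, predicted_negative_ids, threshold=0.9):
    """Keep only candidates that are not too similar to any predicted negative item."""
    negatives = [reconstruct(codebooks, ids) for ids in predicted_negative_ids]
    return [
        name
        for name, emb in candidates.items()
        if all(cosine(emb, neg) < threshold for neg in negatives)
    ]


# Usage: the model is assumed to have predicted semantic ID (0, 0) as disliked.
CANDIDATES = {"item_a": [1.0, 0.05], "item_b": [0.0, 1.0]}
KEPT = filter_candidates(CODEBOOKS, CANDIDATES, [(0, 0)])
```

In this toy setup `item_a` lies almost on top of the reconstructed negative embedding and is dropped, while `item_b` is kept. Because the filter only needs codebook lookups and similarity checks, it can run offline over a candidate pool, which is the property the review highlights.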

Figure 1: User Negative-Interest Modeling (icon generated by Doubao): For a user who dislikes bulky footwear and wired audio (A, C, E in bold), rule-based methods lead to over-suppression (red box represents wrong results) while traditional models perform poorly on cold-start items like bulky slippe

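Experimental Results

Research Questions

RQ1: How can an LLM be adapted to model users' negative preferences beyond simply inverted positive signals?
RQ2: Which mechanisms (context, alignment, reward) best capture the subtle factors that drive negative feedback?
RQ3: Is negative-feedback modeling robust to the dominance of positive interactions and to data sparsity?
RQ4: How effective are future signals and collaborative signals at improving negative-item prediction?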

Key Findings

Model	HR@20	FHR@20	LUF@20	LIF@20	Candidate Acc.
Caser	0.0098	0.0128	0.0085	0.0135	N/A
SASRec	0.0180	0.0262	0.0169	0.0280	N/A
BERT4Rec	0.0186	0.0260	0.0173	0.0311	N/A
FDSA	0.0284	0.0374	0.0232	0.0362	N/A
S3-Rec	0.0268	0.0329	0.0206	0.0382	N/A
P5-CID	0.0262	0.0381	0.0220	0.0356	N/A
TIGER	0.0264	0.0388	0.0232	0.0360	N/A
TALLRec	N/A	N/A	N/A	N/A	0.2686
InstructRec	N/A	N/A	N/A	N/A	0.3453
LC-Rec (Neg.&Pos.)	0.0159	0.0381	0.0199	0.0351	0.1333
LC-Rec (Neg. Only)	0.0296	0.0385	0.0258	0.0397	0.2892
CoNRec	0.0330	0.0441	0.0297	0.0496	0.6950
Improv.	+11.5%	+13.7%	+15.1%	+24.9%	+101.3%
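
The "Improv." row is not defined in the review, but it is consistent with the relative gain of CoNRec over the strongest baseline in each column. The short check below recomputes it from the table values, assuming higher is better for every metric; the variable and function names are illustrative only.

```python
# Baseline scores copied from the table (rows reporting N/A are omitted per metric).
BASELINES = {
    "HR@20": [0.0098, 0.0180, 0.0186, 0.0284, 0.0268, 0.0262, 0.0264, 0.0159, 0.0296],
    "FHR@20": [0.0128, 0.0262, 0.0260, 0.0374, 0.0329, 0.0381, 0.0388, 0.0381, 0.0385],
    "LUF@20": [0.0085, 0.0169, 0.0173, 0.0232, 0.0206, 0.0220, 0.0232, 0.0199, 0.0258],
    "LIF@20": [0.0135, 0.0280, 0.0311, 0.0362, 0.0382, 0.0356, 0.0360, 0.0351, 0.0397],
    "Candidate Acc.": [0.2686, 0.3453, 0.1333, 0.2892],
}
CONREC = {
    "HR@20": 0.0330,
    "FHR@20": 0.0441,
    "LUF@20": 0.0297,
    "LIF@20": 0.0496,
    "Candidate Acc.": 0.6950,
}


def relative_improvement(ours, baselines):
    """Percentage gain over the best baseline, assuming higher is better."""
    best = max(baselines)
    return round((ours - best) / best * 100, 1)


IMPROVEMENTS = {m: relative_improvement(CONREC[m], BASELINES[m]) for m in CONREC}
# IMPROVEMENTS -> {'HR@20': 11.5, 'FHR@20': 13.7, 'LUF@20': 15.1,
#                  'LIF@20': 24.9, 'Candidate Acc.': 101.3}
```

The recomputed values match the reported row, with LC-Rec (Neg. Only) as the reference for HR@20, LUF@20, and LIF@20, TIGER for FHR@20, and InstructRec for Candidate Acc.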

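CoNRec achieves state-of-the-art performance on Taobao, exceeding the best baseline by 11.5% on HR@20 and 13.7% on FHR@20.
Item-level alignment and progressive context introduction improve both generative and discriminative metrics by a significant margin.
Extending the ground truth to a 7-day window and including highly collaborative items substantially increases coverage of users' top negative interests.
A reward design based on future negative feedback, with a penalty for future positive feedback, yields the best overall performance.
CoNRec shows strong offline filtering capability suited to online deployment, reaching a candidate accuracy of 0.6950 versus 0.3453 for the best baseline.
Ablations show that the full model outperforms the baselines across multiple metrics and tasks, with a low forgetting rate during cross-task transfer.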
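
The reward finding above can be made concrete with a minimal sketch. The review does not give the paper's actual reward formula, so the values here (+1 for hitting a future negative item, a penalty of 0.5 for hitting a future positive item, 0 otherwise) are assumptions, as are the function names; the collaborative-signal term is omitted. The group-relative advantage follows the standard GRPO recipe of normalizing each reward by the mean and standard deviation of its sampled group; using the population standard deviation is a choice made for this sketch.

```python
from statistics import mean, pstdev


def reward(predicted_item, future_negatives, future_positives, positive_penalty=0.5):
    """Hypothetical reward: favor future dislikes, penalize future likes."""
    if predicted_item in future_negatives:
        return 1.0
    if predicted_item in future_positives:
        return -positive_penalty
    return 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: four sampled predictions for one user, scored against feedback
# observed in a multi-day window after the prediction time.
FUTURE_NEGATIVES = {"item_a"}
FUTURE_POSITIVES = {"item_b"}
SAMPLES = ["item_a", "item_b", "item_c", "item_d"]

REWARDS = [reward(s, FUTURE_NEGATIVES, FUTURE_POSITIVES) for s in SAMPLES]
ADVANTAGES = group_relative_advantages(REWARDS)
```

In this example the sample that hits a future negative item receives the only positive advantage, and the sample that hits a future positive item receives the most negative one, which pushes the policy away from suppressing items the user will later like.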

(a) Proportion of Main Negative Interest among Latest Feedback

This review was generated by AI and checked by a human editor.