QUICK REVIEW

[Paper Review] CoNRec: Context-Discerning Negative Recommendation with LLMs

Xinda Chen, Jiawei Wu | arXiv (Cornell University) | 2026-01-22

Recommender Systems and Techniques | Citations: 0

One-Line Summary

CoNRec introduces an LLM-based framework that uses semantic IDs and progressive training to model and predict users' negative feedback, achieving state-of-the-art results on Taobao data.

ABSTRACT

Understanding what users like is relatively straightforward; understanding what users dislike, however, remains a challenging and underexplored problem. Research into users' negative preferences has gained increasing importance in modern recommendation systems. Numerous platforms have introduced explicit negative feedback mechanisms and leverage such signals to refine their recommendation models. Beyond traditional business metrics, user experience-driven metrics, such as negative feedback rates, have become critical indicators for evaluating system performance. However, most existing approaches primarily use negative feedback as an auxiliary signal to enhance positive recommendations, paying little attention to directly modeling negative interests, which can be highly valuable in offline applications. Moreover, due to the inherent sparsity of negative feedback data, models often suffer from context understanding biases induced by positive feedback dominance. To address these challenges, we propose the first large language model framework for negative feedback modeling with specially designed context-discerning modules. We use Semantic ID representations to replace text-based item descriptions and introduce an item-level alignment task that enhances the LLM's understanding of the semantic context behind negative feedback. Furthermore, we design a Progressive GRPO training paradigm that enables the model to dynamically balance the positive and negative behavioral context utilization. Besides, our investigation further reveals a fundamental misalignment between the conventional next-negative-item prediction objective and users' true negative preferences, which is heavily influenced by the system's recommendation order. To mitigate this, we propose a novel reward function and evaluation metric grounded in multi-day future negative feedback and their collaborative signals.

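Research Motivation and Objectives

Motivate and address the sparsity and misalignment of negative feedback in recommender systems.
Develop a framework that accurately models users' negative interests using LLMs and semantic IDs.
Mitigate the biases caused by positive-feedback dominance and by the next-item prediction objective.
Provide scalable, offline negative-feedback filtering suitable for industrial deployment.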

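Proposed Method

Compress item information with multimodal encoding and a residual-quantized VAE (RQ-VAE), and represent each item by its Semantic ID.
Add an item-level alignment task, trained with LoRA-based fine-tuning, so the model learns the semantics of negative feedback without needing long user histories.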
Design a Progressive GRPO training paradigm that lets the model dynamically balance its use of positive and negative behavioral context.
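Extend the training objective with future negative-feedback signals and collaborative negative-feedback signals so that it matches what users actually dislike.
In real-world deployment, operate offline: reconstruct item embeddings from the codebook and filter out items that are overly similar to them or otherwise dispreferred.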
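
The Semantic ID and codebook-reconstruction steps listed above can be illustrated with a small sketch. It is not the paper's implementation: the codebooks are fixed toy values rather than learned by an RQ-VAE, and the function names, the cosine-similarity test, and the `threshold` value are all assumptions made for illustration.

```python
import math

def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def encode_semantic_id(embedding, codebooks):
    """Residual quantization: each level quantizes what earlier levels left over."""
    residual = list(embedding)
    semantic_id = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        semantic_id.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(semantic_id)

def reconstruct(semantic_id, codebooks):
    """Approximate the original embedding by summing the selected code vectors."""
    total = [0.0] * len(codebooks[0][0])
    for level, idx in enumerate(semantic_id):
        total = [t + c for t, c in zip(total, codebooks[level][idx])]
    return total

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filter_candidates(candidates, predicted_negative_ids, codebooks, threshold=0.9):
    """Keep candidates that are not too similar to any predicted negative item.

    `threshold` is a hypothetical cutoff, not a value from the paper.
    """
    negatives = [reconstruct(sid, codebooks) for sid in predicted_negative_ids]
    return [
        item_id
        for item_id, embedding in candidates.items()
        if all(cosine(embedding, neg) < threshold for neg in negatives)
    ]

# Toy two-level codebooks in 2-D (hypothetical values, not from the paper).
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0]],      # level 0: coarse direction
    [[0.25, 0.0], [0.0, 0.25]],    # level 1: fine correction
]

disliked_id = encode_semantic_id([1.2, 0.1], CODEBOOKS)
candidates = {"item_a": [1.0, 0.05], "item_b": [0.0, 1.0]}
print(disliked_id, filter_candidates(candidates, [disliked_id], CODEBOOKS))
```

In this toy setup, `item_a` points in almost the same direction as the reconstructed negative embedding and is filtered, while `item_b` is kept. A production system would presumably run this step in batch over the candidate pool, which is consistent with the offline operation described above.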

Figure 1: User Negative-Interest Modeling (icon generated by Doubao): For a user who dislikes bulky footwear and wired audio (A, C, E in bold), rule-based methods lead to over-suppression (red box represents wrong results) while traditional models perform poorly on cold-start items like bulky slippe

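Experimental Results

Research Questions

RQ1: How can LLMs be applied to model users' negative preferences, beyond treating them as merely inverted positive signals?
RQ2: Which mechanism (context, alignment, or rewards) best captures the subtle factors that drive negative feedback?
RQ3: Can negative-feedback modeling be robust to the dominance of positive interactions and to data sparsity?
RQ4: How effective are future signals and collaborative signals at improving negative-item prediction?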

Key Results

Model	HR@20	FHR@20	LUF@20	LIF@20	Candidate Accuracy
Caser	0.0098	0.0128	0.0085	0.0135	N/A
SASRec	0.0180	0.0262	0.0169	0.0280	N/A
BERT4Rec	0.0186	0.0260	0.0173	0.0311	N/A
FDSA	0.0284	0.0374	0.0232	0.0362	N/A
S3-Rec	0.0268	0.0329	0.0206	0.0382	N/A
P5-CID	0.0262	0.0381	0.0220	0.0356	N/A
TIGER	0.0264	0.0388	0.0232	0.0360	N/A
TALLRec	N/A	N/A	N/A	N/A	0.2686
InstructRec	N/A	N/A	N/A	N/A	0.3453
LC-Rec (Neg.&Pos.)	0.0159	0.0381	0.0199	0.0351	0.1333
LC-Rec (Neg. Only)	0.0296	0.0385	0.0258	0.0397	0.2892
CoNRec	0.0330	0.0441	0.0297	0.0496	0.6950
Improv.	+11.5%	+13.7%	+15.1%	+24.9%	+101.3%
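
The Improv. row appears to be the relative gain of CoNRec over the strongest baseline in each column. That reading is an assumption, but it reproduces every figure in the row, as the following check shows. The pairs are copied from the table; `relative_improvement` is a helper name introduced here.

```python
def relative_improvement(ours, best_baseline):
    """Percentage gain of `ours` over `best_baseline`, rounded to one decimal."""
    return round(100.0 * (ours - best_baseline) / best_baseline, 1)

# (CoNRec value, strongest baseline value) per metric, copied from the table.
RESULTS = {
    "HR@20": (0.0330, 0.0296),               # baseline: LC-Rec (Neg. Only)
    "FHR@20": (0.0441, 0.0388),              # baseline: TIGER
    "LUF@20": (0.0297, 0.0258),              # baseline: LC-Rec (Neg. Only)
    "LIF@20": (0.0496, 0.0397),              # baseline: LC-Rec (Neg. Only)
    "Candidate Accuracy": (0.6950, 0.3453),  # baseline: InstructRec
}

improvements = {
    metric: relative_improvement(ours, base)
    for metric, (ours, base) in RESULTS.items()
}

for metric, gain in improvements.items():
    print(f"{metric}: +{gain}%")
```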

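CoNRec achieves state-of-the-art performance on Taobao, improving on the strongest baseline by 11.5% on HR@20 and 13.7% on FHR@20.
Adding item-level alignment and progressive context integration yields meaningful gains on both generative and discriminative metrics.
Extending the ground truth to a 7-day window and including highly collaborative items substantially improves coverage of users' top negative interests.
A reward based on future negative feedback, combined with a penalty for future positive feedback, gives the best overall performance.
CoNRec shows strong offline filtering capability suitable for online deployment and reaches a much higher Candidate Accuracy than prior methods (0.6950 versus 0.3453 for InstructRec, the strongest baseline).
Ablation studies show that the full model outperforms the baselines across multiple metrics and tasks, and that it forgets less when transferring between tasks.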
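
The reward finding above can be made concrete with a minimal sketch. This is an assumed simplification and not the paper's reward function: it omits the collaborative signals, and the weights `hit_reward` and `positive_penalty` are hypothetical. The second function shows the group-relative normalization commonly used in GRPO-style training, where each sampled output is scored against the other samples for the same prompt.

```python
import statistics

def future_feedback_reward(predicted_items, future_negatives, future_positives,
                           hit_reward=1.0, positive_penalty=0.5):
    """Score one sampled prediction list against a user's future feedback window.

    Items the user later disliked add `hit_reward`; items the user later
    engaged with positively subtract `positive_penalty`. Both weights are
    hypothetical defaults.
    """
    reward = 0.0
    for item in predicted_items:
        if item in future_negatives:
            reward += hit_reward
        elif item in future_positives:
            reward -= positive_penalty
    return reward

def group_relative_advantages(rewards):
    """Center and scale rewards within one group of samples (GRPO-style)."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All samples scored the same, so none is preferred over the others.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Toy example: one user, four sampled prediction lists (hypothetical item IDs).
future_negatives = {"n1", "n2"}   # disliked within the multi-day window
future_positives = {"p1"}         # clicked or purchased within the window
samples = [["n1", "n2"], ["n1", "p1"], ["p1", "x"], ["x", "y"]]

rewards = [
    future_feedback_reward(s, future_negatives, future_positives)
    for s in samples
]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

Under these assumed weights, the sample that predicts two future negatives gets the largest advantage, and the sample that predicts a future positive with no hits gets the smallest, which is the direction of pressure the finding describes.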

(a) Proportion of Main Negative Interest among Latest Feedback


This review was generated by AI and reviewed by a human editor.