QUICK REVIEW

[Paper Review] Semantics-Aware Denoising: A PLM-Guided Sample Reweighting Strategy for Robust Recommendation

Xikai Yang, Yang Wang | arXiv (Cornell University) | 2026-02-17

Recommender Systems and Techniques | Citations: 0

One-Line Summary

SAID uses the semantic consistency between user interests and item content to downweight noisy implicit feedback, improving AUC and robustness to noise without changing the backbone model.

ABSTRACT

Implicit feedback, such as user clicks, serves as the primary data source for modern recommender systems. However, click interactions inherently contain substantial noise, including accidental clicks, clickbait-induced interactions, and exploratory browsing behaviors that do not reflect genuine user preferences. Training recommendation models with such noisy positive samples leads to degraded prediction accuracy and unreliable recommendations. In this paper, we propose SAID (Semantics-Aware Implicit Denoising), a simple yet effective framework that leverages semantic consistency between user interests and item content to identify and downweight potentially noisy interactions. Our approach constructs textual user interest profiles from historical behaviors and computes semantic similarity with target item descriptions using pre-trained language model (PLM) based text encoders. The similarity scores are then transformed into sample weights that modulate the training loss, effectively reducing the impact of semantically inconsistent clicks. Unlike existing denoising methods that require complex auxiliary networks or multi-stage training procedures, SAID only modifies the loss function while keeping the backbone recommendation model unchanged. Extensive experiments on two real-world datasets demonstrate that SAID consistently improves recommendation performance, achieving up to 2.2% relative improvement in AUC over strong baselines, with particularly notable robustness under high noise conditions.

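Motivation and Objectives

Make recommendation robust to noisy implicit feedback (clicks) by reducing the influence of semantically inconsistent interactions.
Use the semantic alignment between user interests, as reflected in interaction history, and item content to downweight noisy samples.
Provide a simple loss-level denoising method that leaves the backbone model architecture unchanged.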

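Proposed Method

Construct a textual user interest profile from the user's past behavior.
Compute the semantic similarity between the user profile and the target item description using a PLM-based text encoder.
Transform the similarity scores into sample weights that modulate the training loss.
Apply the weights within the existing training objective, without modifying the recommender backbone.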
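
The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the embeddings stand in for the output of a PLM text encoder, the function names (`cosine_similarity`, `similarity_to_weight`, `weighted_bce`) are hypothetical, and the sigmoid mapping with a `temperature` and a `floor` is an assumed choice, since this summary does not give the paper's exact transformation. Reweighting only the positive (clicked) samples is likewise an assumption.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n, d) arrays."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a_norm * b_norm, axis=1)


def similarity_to_weight(sim, temperature=0.5, floor=0.1):
    """Map similarity in [-1, 1] to a weight in (floor, 1).

    Assumed mapping (not taken from the paper): a sigmoid squashes the
    similarity, and a floor keeps every click from being dropped entirely.
    """
    squashed = 1.0 / (1.0 + np.exp(-sim / temperature))
    return floor + (1.0 - floor) * squashed


def weighted_bce(pred, label, weight, eps=1e-7):
    """Binary cross-entropy where positive samples carry per-sample weights.

    Negatives keep weight 1.0 (an assumption); the loss is normalized by the
    total weight so its scale stays comparable to the unweighted loss.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    per_sample = -(label * np.log(pred) + (1.0 - label) * np.log(1.0 - pred))
    w = np.where(label == 1, weight, 1.0)
    return float(np.sum(w * per_sample) / np.sum(w))


# Placeholder embeddings standing in for PLM outputs of the user profile
# text and the target item description.
user_emb = np.array([[1.0, 0.0], [1.0, 0.0]])
item_emb = np.array([[1.0, 0.0], [-1.0, 0.0]])

weights = similarity_to_weight(cosine_similarity(user_emb, item_emb))
loss = weighted_bce(np.array([0.9, 0.2]), np.array([1.0, 1.0]), weights)
```

The backbone that produces `pred` is untouched here; only the loss computation sees the weights, which mirrors the summary's point that the method works at the loss level.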

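Experimental Results

Research Questions

RQ1: Can semantic consistency between user interests and item content identify noisy implicit feedback?
RQ2: Does PLM-guided sample reweighting improve recommendation performance under noise without changing the model architecture?
RQ3: How does SAID perform against strong baselines, and how does it hold up under high-noise conditions?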

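Key Results

SAID consistently improves recommendation performance over strong baselines.
It achieves up to 2.2% relative improvement in AUC on two real-world datasets.
It is notably robust under high-noise conditions.
It needs no auxiliary networks or multi-stage training; only the loss function is modified.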
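
This summary does not state how the relative figure is computed. Assuming the usual convention, a relative AUC improvement is measured against the baseline's AUC rather than as an absolute difference in AUC points:

```latex
\text{Relative improvement}
  = \frac{\mathrm{AUC}_{\text{SAID}} - \mathrm{AUC}_{\text{baseline}}}
         {\mathrm{AUC}_{\text{baseline}}} \times 100\%
```

Under that reading, 2.2% is a fraction of the baseline AUC, so the corresponding absolute gain in AUC is smaller than 0.022.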


This review was generated by AI and reviewed by a human editor.