QUICK REVIEW

[Paper Review] Semantics-Aware Denoising: A PLM-Guided Sample Reweighting Strategy for Robust Recommendation

Xikai Yang, Yang Wang | arXiv (Cornell University) | Feb 17, 2026

Recommender Systems and Techniques | Citations: 0

One-Line Summary

SAID downweights noisy implicit feedback using the semantic consistency between user interests and item content, improving AUC robustness without modifying the backbone model.

ABSTRACT

Implicit feedback, such as user clicks, serves as the primary data source for modern recommender systems. However, click interactions inherently contain substantial noise, including accidental clicks, clickbait-induced interactions, and exploratory browsing behaviors that do not reflect genuine user preferences. Training recommendation models with such noisy positive samples leads to degraded prediction accuracy and unreliable recommendations. In this paper, we propose SAID (Semantics-Aware Implicit Denoising), a simple yet effective framework that leverages semantic consistency between user interests and item content to identify and downweight potentially noisy interactions. Our approach constructs textual user interest profiles from historical behaviors and computes semantic similarity with target item descriptions using pre-trained language model (PLM) based text encoders. The similarity scores are then transformed into sample weights that modulate the training loss, effectively reducing the impact of semantically inconsistent clicks. Unlike existing denoising methods that require complex auxiliary networks or multi-stage training procedures, SAID only modifies the loss function while keeping the backbone recommendation model unchanged. Extensive experiments on two real-world datasets demonstrate that SAID consistently improves recommendation performance, achieving up to 2.2% relative improvement in AUC over strong baselines, with particularly notable robustness under high noise conditions.

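Motivation and Objectives

Achieve robust recommendation from noisy implicit feedback (clicks) by reducing the impact of semantically inconsistent interactions.
Leverage the semantic consistency between user interests, derived from interaction history, and item content to downweight noisy samples.
Provide a simple loss-level denoising method that leaves the backbone model architecture unchanged.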

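Proposed Method

Construct a textual user interest profile from historical user behaviors.
Compute the semantic similarity between the user profile and the target item description using a PLM-based text encoder.
Transform the similarity scores into sample weights that modulate the training loss.
Apply the weights within the existing training objective, without changing the recommendation backbone.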
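
The reweighting steps listed above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how similarity is mapped to a weight, so the linear mapping, the `w_min` floor, the choice to reweight only positive samples, and the normalization by the sum of weights are all assumptions made here for illustration. The toy 2-D vectors stand in for PLM text embeddings of user profiles and item descriptions, and the function names are hypothetical.

```python
import numpy as np


def cosine_similarity(user_emb, item_emb, eps=1e-12):
    """Row-wise cosine similarity between two (n, d) embedding arrays."""
    u_norm = np.maximum(np.linalg.norm(user_emb, axis=1, keepdims=True), eps)
    v_norm = np.maximum(np.linalg.norm(item_emb, axis=1, keepdims=True), eps)
    return np.sum((user_emb / u_norm) * (item_emb / v_norm), axis=1)


def similarity_to_weight(sim, labels, w_min=0.2):
    """Map similarity in [-1, 1] to a weight in [w_min, 1] for positives.

    Assumption: only clicked (positive) samples are downweighted;
    negatives keep weight 1. The linear mapping is illustrative.
    """
    scaled = (sim + 1.0) / 2.0                 # rescale to [0, 1]
    pos_weight = w_min + (1.0 - w_min) * scaled
    return np.where(labels == 1, pos_weight, 1.0)


def weighted_bce(preds, labels, weights, eps=1e-7):
    """Binary cross-entropy where each sample's loss is scaled by its weight."""
    p = np.clip(preds, eps, 1.0 - eps)
    per_sample = -(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return np.sum(weights * per_sample) / np.sum(weights)


# Toy stand-ins for PLM embeddings (hypothetical values).
user_emb = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
item_emb = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, 1.0]])
labels = np.array([1, 1, 1, 0])            # three clicks, one non-click
preds = np.array([0.9, 0.8, 0.7, 0.1])     # backbone model outputs

sim = cosine_similarity(user_emb, item_emb)
weights = similarity_to_weight(sim, labels)  # approximately [1.0, 0.6, 0.2, 1.0]
loss = weighted_bce(preds, labels, weights)
print(weights, loss)
```

In this sketch the backbone only supplies `preds`; the weights enter solely through the loss, which mirrors the summary's point that the model architecture is left untouched. The third click, whose item embedding points away from the user profile, contributes least to the loss.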

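Experimental Results

Research Questions

RQ1: Can semantic consistency between user interests and item content identify noisy implicit feedback?
RQ2: Can PLM-guided sample reweighting improve recommendation performance under noise without changing the model architecture?
RQ3: How does SAID perform against strong baselines and under high-noise conditions?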

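Key Findings

SAID consistently improves recommendation performance over strong baselines.
It achieves up to a 2.2% relative improvement in AUC.
It shows notable robustness under high-noise conditions.
It modifies only the loss function, requiring no auxiliary networks or multi-stage training.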

This review was generated by AI and checked by a human editor.