QUICK REVIEW

[Paper Review] Self-Attentive Sequential Recommendation

Wang-Cheng Kang, Julian McAuley|arXiv (Cornell University)|Aug 20, 2018

Recommender Systems and Techniques|References: 37|Citations: 84

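One-Line Summary

SASRec uses self-attention to model user action sequences for next-item recommendation, achieving strong accuracy and efficiency on both sparse and dense datasets. It adaptively weights past actions to predict the next item.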

ABSTRACT

Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the `context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs). Markov Chains assume that a user's next action can be predicted on the basis of just their last (or last few) actions, while RNNs in principle allow for longer-term semantics to be uncovered. Generally speaking, MC-based methods perform best in extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser datasets where higher model complexity is affordable. The goal of our work is to balance these two goals, by proposing a self-attention based sequential model (SASRec) that allows us to capture long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based on relatively few actions (like an MC). At each time step, SASRec seeks to identify which items are `relevant' from a user's action history, and use them to predict the next item. Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets. Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models. Visualizations on attention weights also show how our model adaptively handles datasets with various density, and uncovers meaningful patterns in activity sequences.

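Motivation and Objectives

Motivate a sequential recommender that balances long-term semantics with short-term context.
Propose a self-attention-based model that selectively attends to relevant past actions.
Achieve strong predictive performance while being more efficient than CNN/RNN-based methods.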

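Proposed Method

Embed the user's action sequence using item embeddings and position embeddings.
Apply stacked self-attention blocks with causal masking to capture dependencies among past items.
Use a feed-forward network for nonlinearity, with residual connections and layer normalization for training stability.
Predict next-item scores through a matrix-factorization-style interaction between the final sequence embedding and the item embeddings (either a separate or a shared item embedding table).
Train with binary cross-entropy loss using negative sampling and the Adam optimizer.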
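
The steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation: it uses a single attention head with no learned query/key/value projections, leaves out the feed-forward network, residual connections, and layer normalization, and runs on randomly initialized (untrained) embeddings. The names `item_emb`, `position_emb`, `history`, and the toy sizes are all hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(seq):
    """Single-head scaled dot-product attention with a causal mask.

    Simplification: queries, keys, and values are the inputs themselves
    (no learned projection matrices).
    """
    d = len(seq[0])
    out = []
    for t, query in enumerate(seq):
        # Causal mask: position t may attend only to positions 0..t.
        scores = [dot(query, seq[j]) / math.sqrt(d) for j in range(t + 1)]
        weights = softmax(scores)
        out.append([sum(w * seq[j][k] for j, w in enumerate(weights))
                    for k in range(d)])
    return out


def bce_loss(state, positive, negative):
    """Binary cross-entropy for one true next item and one sampled negative."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    return (-math.log(sigmoid(dot(state, positive)))
            - math.log(1.0 - sigmoid(dot(state, negative))))


# Hypothetical toy setup: 5 items, embedding size 4, untrained tables.
random.seed(0)
NUM_ITEMS, DIM = 5, 4
item_emb = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(NUM_ITEMS)]
position_emb = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in range(3)]

history = [2, 0, 4]  # item ids in chronological order
inputs = [[e + p for e, p in zip(item_emb[i], position_emb[t])]
          for t, i in enumerate(history)]
states = causal_self_attention(inputs)

# Matrix-factorization-style scoring with a shared item embedding table:
# the last state is dotted with every item embedding.
scores = [dot(states[-1], emb) for emb in item_emb]

# Item 1 is assumed to be the true next item; item 3 is a sampled negative.
loss = bce_loss(states[-1], item_emb[1], item_emb[3])
```

Because of the mask, the output at position t depends only on items 0..t, so truncating the history leaves the earlier outputs unchanged. A full model would minimize this loss over all positions and users with an optimizer such as Adam.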

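Experimental Results

Research Questions

RQ1: Does SASRec outperform state-of-the-art sequential recommendation models on both sparse and dense data?
RQ2: How do components such as position embeddings, attention blocks, and shared item embeddings affect performance?
RQ3: How do SASRec's training efficiency and scalability behave as sequence length grows?
RQ4: Can the attention heads reveal meaningful patterns related to position or item attributes?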

Key Findings

Dataset	Metric	PopRec	BPR	FMC	FPMC	TransRec	GRU4Rec	GRU4Rec+	Caser	SASRec
Beauty	Hit@10	0.4003	0.3775	0.3771	0.4310	0.4607	0.2125	0.3949	0.4264	0.4854
Beauty	NDCG@10	0.2277	0.2183	0.2477	0.2891	0.3020	0.1203	0.2556	0.2547	0.3219
Games	Hit@10	0.4724	0.4853	0.6358	0.6802	0.6838	0.2938	0.6599	0.5282	0.7410
Games	NDCG@10	0.2779	0.2875	0.4456	0.4680	0.4557	0.1837	0.4759	0.3214	0.5360
Steam	Hit@10	0.7172	0.7061	0.7731	0.7710	0.7624	0.4190	0.8018	0.7874	0.8729
Steam	NDCG@10	0.4535	0.4436	0.5193	0.5011	0.4852	0.2691	0.5595	0.5381	0.6306
ML-1M	Hit@10	0.4329	0.5781	0.6986	0.7599	0.6413	0.5581	0.7501	0.7886	0.8245
ML-1M	NDCG@10	0.2377	0.3287	0.4676	0.5176	0.3969	0.3381	0.5513	0.5538	0.5905
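
For readers unfamiliar with the two metrics in the table, the sketch below shows how Hit@10 and NDCG@10 are commonly computed when each user has exactly one ground-truth next item. That single-target convention is an assumption here, since this review does not spell out the evaluation protocol, and the `ranks` values are hypothetical rather than taken from the paper.

```python
import math


def rank_of_target(target_score, other_scores):
    """1-based rank of the target among the candidates.

    Only strictly higher scores push the target down, so ties are
    resolved in the target's favor.
    """
    return 1 + sum(1 for s in other_scores if s > target_score)


def hit_at_k(rank, k=10):
    """1.0 if the ground-truth item is ranked within the top k, else 0.0."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank, k=10):
    """With a single relevant item the ideal DCG is 1,
    so NDCG reduces to 1 / log2(rank + 1) inside the top k."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


# Hypothetical ranks of the true next item for four users.
ranks = [1, 3, 12, 7]
mean_hit = sum(hit_at_k(r) for r in ranks) / len(ranks)
mean_ndcg = sum(ndcg_at_k(r) for r in ranks) / len(ranks)
```

Hit@10 only records whether the target reached the top ten, while NDCG@10 also rewards placing it higher, which is why the NDCG values in each row are lower than the Hit values.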

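SASRec outperforms all baselines (including MC-, CNN-, and RNN-based variants) on both sparse and dense datasets.
Because the self-attention computation can be parallelized, SASRec is far more efficient than CNN/RNN-based approaches (an order of magnitude, according to the abstract).
Attention visualizations show an adaptive focus on relevant past actions: the model tends to capture long-range dependencies on dense data and to focus on the most recent actions on sparse data.
With two self-attention blocks and learned position embeddings, the model delivers strong performance at a moderate training cost.
SASRec can be interpreted as a flexible, adaptive, hierarchical item-similarity model for next-item recommendation.
Across datasets, it achieves notable improvements over both non-neural and neural baselines (the specific gains can be read from the results table above).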
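
To make the size of these gains concrete, the following sketch computes SASRec's relative Hit@10 improvement over the strongest baseline on each dataset. The numbers are copied from the Hit@10 rows of the results table above; the helper name `relative_gain` is hypothetical.

```python
MODELS = ["PopRec", "BPR", "FMC", "FPMC", "TransRec",
          "GRU4Rec", "GRU4Rec+", "Caser", "SASRec"]

# Hit@10 rows copied from the results table, in the column order of MODELS.
HIT10 = {
    "Beauty": [0.4003, 0.3775, 0.3771, 0.4310, 0.4607, 0.2125, 0.3949, 0.4264, 0.4854],
    "Games":  [0.4724, 0.4853, 0.6358, 0.6802, 0.6838, 0.2938, 0.6599, 0.5282, 0.7410],
    "Steam":  [0.7172, 0.7061, 0.7731, 0.7710, 0.7624, 0.4190, 0.8018, 0.7874, 0.8729],
    "ML-1M":  [0.4329, 0.5781, 0.6986, 0.7599, 0.6413, 0.5581, 0.7501, 0.7886, 0.8245],
}


def relative_gain(values, model="SASRec"):
    """Return (best baseline name, relative improvement of `model` over it)."""
    row = dict(zip(MODELS, values))
    target = row.pop(model)
    best_name = max(row, key=row.get)
    return best_name, (target - row[best_name]) / row[best_name]


gains = {dataset: relative_gain(values) for dataset, values in HIT10.items()}
for dataset, (baseline, gain) in gains.items():
    print(f"{dataset}: +{gain:.1%} over {baseline}")
```

On these numbers the Hit@10 margin over the strongest baseline ranges from roughly 4.6% (ML-1M, against Caser) to roughly 8.9% (Steam, against GRU4Rec+).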

This review was generated by AI and checked by a human editor.