QUICK REVIEW

[Paper Review] Self-Attentive Sequential Recommendation

Wang-Cheng Kang, Julian McAuley | arXiv (Cornell University) | 2018. 08. 20.

Recommender Systems and Techniques | References: 37 | Citations: 84

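One-Line Summary

SASRec models user action sequences with self-attention for next-item recommendation, achieving strong accuracy and efficiency on both sparse and dense datasets. It predicts the next item by adaptively weighting past actions.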

ABSTRACT

Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the 'context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs). Markov Chains assume that a user's next action can be predicted on the basis of just their last (or last few) actions, while RNNs in principle allow for longer-term semantics to be uncovered. Generally speaking, MC-based methods perform best in extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser datasets where higher model complexity is affordable. The goal of our work is to balance these two goals, by proposing a self-attention based sequential model (SASRec) that allows us to capture long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based on relatively few actions (like an MC). At each time step, SASRec seeks to identify which items are 'relevant' from a user's action history, and use them to predict the next item. Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets. Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models. Visualizations on attention weights also show how our model adaptively handles datasets with various density, and uncovers meaningful patterns in activity sequences.

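Motivation and Objectives

Balance long-term semantics and short-term context in sequential recommendation.
Propose a self-attention-based model that selectively attends to the relevant past actions.
Achieve strong predictive performance with better efficiency than CNN/RNN-based methods.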

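Proposed Method

Embed the user's action sequence using item embeddings plus position embeddings.
Stack self-attention blocks with causal masking to capture dependencies among past items.
Follow each attention layer with a feed-forward network, using residual connections and layer normalization for nonlinearity and training stability.
Predict next-item scores through a matrix-factorization-style interaction between the final sequence representation and the item embeddings (optionally sharing the input item embeddings).
Train with a binary cross-entropy loss, using negative sampling and the Adam optimizer.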
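
The steps above can be sketched in a few lines of plain Python. This is a deliberately reduced illustration, not the authors' implementation: it uses a single attention head and a single block, and it omits the feed-forward network, layer normalization, dropout, padding, and the Adam update. All names and sizes (`N_ITEMS`, `MAX_LEN`, `DIM`, `forward`, `bce_loss`) and the toy sequence are assumptions made for the example.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable

N_ITEMS, MAX_LEN, DIM = 6, 4, 8  # hypothetical toy sizes

def rand_matrix(rows, cols, scale=0.1):
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(X, W):
    # (n x d) times (d x d): each row of X against each column of W.
    cols = list(zip(*W))
    return [[dot(row, col) for col in cols] for row in X]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

item_emb = rand_matrix(N_ITEMS, DIM)  # shared by the input and the scoring layer
pos_emb = rand_matrix(MAX_LEN, DIM)   # learned position embeddings
Wq, Wk, Wv = (rand_matrix(DIM, DIM) for _ in range(3))

def forward(seq):
    # 1) Item embedding + position embedding for every step of the sequence.
    X = [[e + p for e, p in zip(item_emb[i], pos_emb[t])] for t, i in enumerate(seq)]
    Q, K, V = project(X, Wq), project(X, Wk), project(X, Wv)
    hidden, weights = [], []
    for t in range(len(seq)):
        # 2) Causal mask: step t only scores keys at positions 0..t.
        logits = [dot(Q[t], K[j]) / math.sqrt(DIM) for j in range(t + 1)]
        w = softmax(logits)
        hidden.append([sum(w[j] * V[j][c] for j in range(t + 1)) for c in range(DIM)])
        weights.append(w + [0.0] * (len(seq) - t - 1))
    # 3) Matrix-factorization-style scores: one row per step, one column per item.
    scores = [[dot(h, e) for e in item_emb] for h in hidden]
    return scores, weights

def bce_loss(scores, targets, negatives):
    # Binary cross-entropy with one positive and one sampled negative per step.
    total = 0.0
    for t, (pos, neg) in enumerate(zip(targets, negatives)):
        total -= math.log(sigmoid(scores[t][pos]))
        total -= math.log(1.0 - sigmoid(scores[t][neg]))
    return total

seq = [2, 0, 5, 1]         # toy action history (item ids)
targets = [0, 5, 1, 3]     # the true next item at each step
negatives = [4, 4, 4, 4]   # an item this toy user never interacted with
scores, weights = forward(seq)
loss = bce_loss(scores, targets, negatives)
```

Each row of `weights` is zero to the right of the diagonal, which is the causal constraint: the prediction at step t cannot see later items. A real training loop would draw a fresh negative per step and minimize `loss` with Adam.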

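Experimental Results

Research Questions

RQ1: Does SASRec outperform state-of-the-art sequential recommendation models on both sparse and dense datasets?
RQ2: How do components such as position embeddings, attention blocks, and shared item embeddings affect performance?
RQ3: How do SASRec's training efficiency and scalability behave as the sequence length grows?
RQ4: Can the attention heads reveal meaningful patterns related to positions or item attributes?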

Key Results

Dataset	Metric	PopRec	BPR	FMC	FPMC	TransRec	GRU4Rec	GRU4Rec+	Caser	SASRec
Beauty	Hit@10	0.4003	0.3775	0.3771	0.4310	0.4607	0.2125	0.3949	0.4264	0.4854
Beauty	NDCG@10	0.2277	0.2183	0.2477	0.2891	0.3020	0.1203	0.2556	0.2547	0.3219
Games	Hit@10	0.4724	0.4853	0.6358	0.6802	0.6838	0.2938	0.6599	0.5282	0.7410
Games	NDCG@10	0.2779	0.2875	0.4456	0.4680	0.4557	0.1837	0.4759	0.3214	0.5360
Steam	Hit@10	0.7172	0.7061	0.7731	0.7710	0.7624	0.4190	0.8018	0.7874	0.8729
Steam	NDCG@10	0.4535	0.4436	0.5193	0.5011	0.4852	0.2691	0.5595	0.5381	0.6306
ML-1M	Hit@10	0.4329	0.5781	0.6986	0.7599	0.6413	0.5581	0.7501	0.7886	0.8245
ML-1M	NDCG@10	0.2377	0.3287	0.4676	0.5176	0.3969	0.3381	0.5513	0.5538	0.5905
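
For readers unfamiliar with the two metrics in the table, the sketch below shows how Hit@10 and NDCG@10 are commonly computed in next-item evaluation. It assumes one held-out ground-truth item per user, ranked against a candidate list, and averages over users; the function names and the example ranks are illustrative, not taken from the paper's code.

```python
import math

def rank_of_target(scores, target):
    """1-indexed rank of the target item; a higher score means a better rank."""
    return 1 + sum(1 for i, s in enumerate(scores) if i != target and s > scores[target])

def hit_at_k(rank, k=10):
    # 1 if the ground-truth item appears in the top k, else 0.
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank, k=10):
    # With a single relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1).
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

# Hypothetical ranks of the ground-truth item for four users.
ranks = [1, 3, 11, 2]
mean_hit = sum(hit_at_k(r) for r in ranks) / len(ranks)
mean_ndcg = sum(ndcg_at_k(r) for r in ranks) / len(ranks)
```

Hit@10 only checks whether the true item reaches the top 10, while NDCG@10 also rewards placing it nearer the top, which is why the NDCG values in the table are always lower than the Hit values.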

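SASRec outperforms every baseline (including the MC-, CNN-, and RNN-based variants) on both sparse and dense datasets.
Because self-attention can be computed in parallel, the model is far more efficient than CNN/RNN-based approaches.
Attention visualizations show an adaptive focus on relevant past actions: the model attends to longer-range dependencies on dense data and concentrates on recent actions on sparse data.
Two self-attention blocks with learned position embeddings deliver strong performance with a relatively short training time.
SASRec can be interpreted as a flexible, adaptive, hierarchical item-similarity model for next-item recommendation.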
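Across datasets, SASRec improves clearly over both non-neural and neural baselines: relative to the strongest baseline in each row of the table, Hit@10 rises by roughly 4.6% (ML-1M, vs. Caser) to 8.9% (Steam, vs. GRU4Rec+).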
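
The size of these gains can be read directly off the results table. The short script below computes, for each dataset, the relative Hit@10 improvement of SASRec over the strongest baseline in that row; the numbers are copied from the table, while the helper name `relative_gain` is just for illustration.

```python
BASELINES = ["PopRec", "BPR", "FMC", "FPMC", "TransRec", "GRU4Rec", "GRU4Rec+", "Caser"]

# Hit@10 from the results table: (baseline scores in BASELINES order, SASRec score).
HIT10 = {
    "Beauty": ([0.4003, 0.3775, 0.3771, 0.4310, 0.4607, 0.2125, 0.3949, 0.4264], 0.4854),
    "Games":  ([0.4724, 0.4853, 0.6358, 0.6802, 0.6838, 0.2938, 0.6599, 0.5282], 0.7410),
    "Steam":  ([0.7172, 0.7061, 0.7731, 0.7710, 0.7624, 0.4190, 0.8018, 0.7874], 0.8729),
    "ML-1M":  ([0.4329, 0.5781, 0.6986, 0.7599, 0.6413, 0.5581, 0.7501, 0.7886], 0.8245),
}

def relative_gain(baseline_scores, sasrec_score):
    """Return (name of strongest baseline, relative improvement of SASRec over it)."""
    best = max(baseline_scores)
    name = BASELINES[baseline_scores.index(best)]
    return name, (sasrec_score - best) / best

for dataset, (baseline_scores, sasrec_score) in HIT10.items():
    name, gain = relative_gain(baseline_scores, sasrec_score)
    print(f"{dataset}: +{gain:.1%} over {name}")
```

The strongest baseline differs by dataset (TransRec on Beauty and Games, GRU4Rec+ on Steam, Caser on ML-1M), which is consistent with the abstract's point that no single prior family wins on both sparse and dense data.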


This review was generated by AI and reviewed by a human editor.