QUICK REVIEW

[Paper Review] Self-Attentive Sequential Recommendation

Wang-Cheng Kang, Julian McAuley|arXiv (Cornell University)|Aug 20, 2018

Recommender Systems and Techniques | 37 references | 84 citations

TL;DR

SASRec uses self-attention to model user action sequences for next-item recommendation, achieving strong performance and efficiency across sparse and dense datasets. It adaptively weighs past actions to predict the next item.

ABSTRACT

Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the 'context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs). Markov Chains assume that a user's next action can be predicted on the basis of just their last (or last few) actions, while RNNs in principle allow for longer-term semantics to be uncovered. Generally speaking, MC-based methods perform best in extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser datasets where higher model complexity is affordable. The goal of our work is to balance these two goals, by proposing a self-attention based sequential model (SASRec) that allows us to capture long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based on relatively few actions (like an MC). At each time step, SASRec seeks to identify which items are 'relevant' from a user's action history, and use them to predict the next item. Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets. Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models. Visualizations on attention weights also show how our model adaptively handles datasets with various density, and uncovers meaningful patterns in activity sequences.

Motivation & Objective

Balance the two dominant approaches to sequential recommendation: capture long-term semantics (like an RNN) while basing predictions on relatively few actions (like a Markov Chain).
Propose a self-attention based model to selectively attend to relevant past actions.
Achieve strong predictive performance with improved efficiency over CNN/RNN-based methods.

Proposed method

Embed user action sequences with item and positional embeddings.
Apply stacked self-attention blocks with causal masking to capture dependencies among past items.
Use a feed-forward network with residual connections and layer normalization for stability and nonlinearity.
Predict next-item scores with a matrix-factorization-style inner product between the final block's output at each position and the item embeddings (either a separate table or one shared with the input item embeddings).
Train with binary cross-entropy using negative sampling and the Adam optimizer.
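
The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single attention head, omits dropout, padding, and the optimizer step, applies layer normalization without learned scale or bias, and places the normalization before each sub-layer. All function and parameter names (`init_params`, `forward`, `bce_loss`, and so on) are hypothetical, and the item embedding table is shared between input and scoring.

```python
import math
import random

def layer_norm(x, eps=1e-8):
    # Normalize one vector to zero mean and unit variance (no learned scale/bias here).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(X, Wq, Wk, Wv):
    # Position t attends only to positions 0..t (the causal mask).
    d = len(X[0])
    Q = [matvec(Wq, x) for x in X]
    K = [matvec(Wk, x) for x in X]
    V = [matvec(Wv, x) for x in X]
    out, weights = [], []
    for t in range(len(X)):
        scores = [sum(q * k for q, k in zip(Q[t], K[j])) / math.sqrt(d)
                  for j in range(t + 1)]
        w = softmax(scores)
        weights.append(w + [0.0] * (len(X) - t - 1))
        out.append([sum(w[j] * V[j][c] for j in range(t + 1)) for c in range(d)])
    return out, weights

def block(X, p):
    # Attention sub-layer with a residual connection.
    A, weights = causal_attention([layer_norm(x) for x in X], p["Wq"], p["Wk"], p["Wv"])
    S = [[a + x for a, x in zip(ar, xr)] for ar, xr in zip(A, X)]
    # Point-wise feed-forward sub-layer (ReLU) with a residual connection.
    out = []
    for s in S:
        h = [max(0.0, v) for v in matvec(p["W1"], layer_norm(s))]
        out.append([f + v for f, v in zip(matvec(p["W2"], h), s)])
    return out, weights

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(num_items, max_len, d, num_blocks=2, seed=0):
    rng = random.Random(seed)
    names = ("Wq", "Wk", "Wv", "W1", "W2")
    return {
        "item_emb": rand_matrix(num_items, d, rng),
        "pos_emb": rand_matrix(max_len, d, rng),
        "blocks": [{n: rand_matrix(d, d, rng) for n in names} for _ in range(num_blocks)],
    }

def forward(seq, params):
    # Input at position t = item embedding + learned positional embedding.
    X = [[e + p for e, p in zip(params["item_emb"][i], params["pos_emb"][t])]
         for t, i in enumerate(seq)]
    weights = None
    for blk in params["blocks"]:
        X, weights = block(X, blk)
    return X, weights  # weights come from the last block

def score(state, item, params):
    # Matrix-factorization-style inner product with the shared item embedding.
    return sum(a * b for a, b in zip(state, params["item_emb"][item]))

def log_sigmoid(x):
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def bce_loss(seq, targets, negatives, params):
    # One sampled negative per position; log(1 - sigmoid(r)) == log_sigmoid(-r).
    states, _ = forward(seq, params)
    total = 0.0
    for s, pos, neg in zip(states, targets, negatives):
        total -= log_sigmoid(score(s, pos, params))
        total -= log_sigmoid(-score(s, neg, params))
    return total / len(seq)

# Toy usage: 10 items, a 4-step history, next-item targets, and made-up negatives.
params = init_params(num_items=10, max_len=5, d=4)
seq, targets, negatives = [1, 4, 2, 7], [4, 2, 7, 3], [9, 0, 5, 6]
states, weights = forward(seq, params)
loss = bce_loss(seq, targets, negatives, params)
```

Each row of `weights` sums to one and is zero to the right of the diagonal, so the prediction at step t depends only on items up to step t. A real training loop would minimize `loss` with Adam over many sampled sequences.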

Experimental results

Research questions

RQ1: Does SASRec outperform state-of-the-art sequential recommender models across sparse and dense datasets?
RQ2: How do components like positional embeddings, attention blocks, and shared item embeddings affect performance?
RQ3: What are the training efficiency and scalability characteristics of SASRec as sequence length grows?
RQ4: Can attention heads reveal meaningful patterns related to positions or item attributes?

Key findings

SASRec outperforms all baselines (including MC/CNN/RNN variants) across both sparse and dense datasets.
The model is an order of magnitude more efficient than comparable CNN/RNN-based models, because self-attention computations can be parallelized.
Attention visualizations reveal adaptive focus on relevant past actions, with longer-range dependencies on dense data and recent actions on sparse data.
Two self-attention blocks with learned positional embeddings yield strong performance with moderate training time.
SASRec can be interpreted as a flexible, adaptive hierarchical item similarity model for next-item recommendation.
The improvements hold over both non-neural and neural baselines; the specific gains are given in the paper's reported results.

This review was created by AI and reviewed by human editors.