QUICK REVIEW

[Paper Review] ATRank: An Attention-Based User Behavior Modeling Framework for Recommendation

Chang Zhou, Jinze Bai | arXiv (Cornell University) | Nov 17, 2017

Recommender Systems and Techniques | References: 25 | Citations: 104

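One-Line Summary

ATRank is an attention-based framework that models heterogeneous user behaviors using multiple latent semantic spaces and self-attention, enabling faster training and unified multi-task prediction.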

ABSTRACT

A user can be represented as what he/she does along the history. A common way to deal with the user modeling problem is to manually extract all kinds of aggregated features over the heterogeneous behaviors, which may fail to fully represent the data itself due to limited human instinct. Recent works usually use RNN-based methods to give an overall embedding of a behavior sequence, which then could be exploited by the downstream applications. However, this can only preserve very limited information, or aggregated memories of a person. When a downstream application requires to facilitate the modeled user features, it may lose the integrity of the specific highly correlated behavior of the user, and introduce noises derived from unrelated behaviors. This paper proposes an attention based user behavior modeling framework called ATRank, which we mainly use for recommendation tasks. Heterogeneous user behaviors are considered in our model that we project all types of behaviors into multiple latent semantic spaces, where influence can be made among the behaviors via self-attention. Downstream applications then can use the user behavior vectors via vanilla attention. Experiments show that ATRank can achieve better performance and faster training process. We further explore ATRank to use one unified model to predict different types of user behaviors at the same time, showing a comparable performance with the highly optimized individual models.

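Motivation and Objectives

Improve user representations beyond hand-crafted features and single-sequence encoders.
Model heterogeneous user behaviors in multiple latent semantic spaces using self-attention.
Train faster than RNN/CNN encoders and support online prediction.
Explore a unified model that predicts multiple behavior types simultaneously.
Evaluate performance on real-world Amazon and Taobao datasets and compare against baselines.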

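Proposed Method

Partition user behaviors into behavior groups by target object type.
Within each group, embed the raw features and encode time by bucketizing it.
Project the group embeddings into multiple latent semantic spaces with a simple feedforward projection.
Apply self-attention within each semantic space to model the influence among behaviors.
Concatenate the self-attention outputs across spaces and pass them through a nonlinear transformation block.
Use vanilla attention to connect the user behavior representations to the downstream task (pointwise or pairwise ranking).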
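
The steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: scaled dot-product scoring is swapped in as a stand-in attention function, the weights are random rather than learned, embedding and time bucketization are assumed to have already produced the input matrix, and every name (self_attention_multi_space, feed_forward, vanilla_attention, the dimensions) is hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention_multi_space(behaviors, w_query, w_key, w_value):
    """Self-attention applied separately in each latent space.

    behaviors: (n, d) behavior embeddings (already embedded and time-encoded).
    w_query, w_key, w_value: (num_spaces, d, s) projection weights.
    Returns the per-space outputs concatenated to shape (n, num_spaces * s).
    """
    outputs = []
    for k in range(w_query.shape[0]):
        q = behaviors @ w_query[k]
        key = behaviors @ w_key[k]
        v = behaviors @ w_value[k]
        # Assumption: scaled dot-product scores stand in for the paper's attention function.
        scores = q @ key.T / np.sqrt(q.shape[-1])
        outputs.append(softmax(scores, axis=-1) @ v)
    return np.concatenate(outputs, axis=-1)


def feed_forward(x, w1, w2):
    """Nonlinear transformation applied to the concatenated outputs (ReLU assumed)."""
    return np.maximum(x @ w1, 0.0) @ w2


def vanilla_attention(behavior_vectors, query):
    """Attend over behavior vectors with a downstream query (e.g. a target item)."""
    weights = softmax(behavior_vectors @ query)
    return weights @ behavior_vectors, weights


# Toy sizes, chosen arbitrarily for the sketch.
rng = np.random.default_rng(0)
n, d, num_spaces, s = 5, 8, 4, 2
behaviors = rng.normal(size=(n, d))
w_query, w_key, w_value = (rng.normal(size=(num_spaces, d, s)) for _ in range(3))
w1 = rng.normal(size=(num_spaces * s, 16))
w2 = rng.normal(size=(16, d))

hidden = feed_forward(
    self_attention_multi_space(behaviors, w_query, w_key, w_value), w1, w2
)
target = rng.normal(size=d)                # hypothetical target item embedding
user_vector, weights = vanilla_attention(hidden, target)
score = float(user_vector @ target)        # pointwise ranking score
```

Because each latent space is handled by independent matrix products, the per-space computations have no sequential dependency on one another, which is the property the review credits for faster training compared with recurrent encoders.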

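Experimental Results

Research Questions

RQ1: Can heterogeneous user behaviors be modeled effectively by projecting them into different latent semantic spaces and capturing the influence among behaviors with self-attention?
RQ2: Does a unified multi-task model that predicts various behavior types perform comparably to specialized models?
RQ3: Is ATRank faster to train than RNN/CNN-based sequence encoders, and does it achieve high accuracy in recommendation settings?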

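Key Findings

On Amazon Electro, ATRank reaches a higher AUC than the competing baselines: 0.8921 versus 0.8757 (Bi-LSTM) and 0.8804 (CNN+Pooling).
On Amazon Clothing, ATRank reaches a higher AUC than the competing baselines: 0.7905 versus 0.7869 (Bi-LSTM) and 0.7835 (Bi-LSTM+Attention).
In the multi-behavior setting, the unified ATRank-All2One and ATRank-All2All models perform competitively with, or better than, the one-to-one baselines.
Table 5 covers the all2one and all2all configurations; ATRank reaches 0.8921 (Electro) and 0.7905 (Clothing), above the listed baselines (BPR, Bi-LSTM, Bi-LSTM+Attention, CNN+Pooling).
Because self-attention operations run in parallel, ATRank converges faster than RNN/CNN-based encoders while maintaining or improving predictive performance.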
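
For readers unfamiliar with the metric behind the figures above, AUC can be read as the fraction of (positive, negative) item pairs that a model ranks in the correct order. The sketch below assumes this plain pairwise definition with ties counted as half; the paper's exact evaluation protocol (for example, how negatives are sampled or whether pairs are averaged per user) is not reproduced here, and the function name and toy scores are hypothetical.

```python
def pairwise_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores: 3 of the 4 pairs are ordered correctly.
auc_value = pairwise_auc([0.9, 0.8], [0.1, 0.85])
print(auc_value)  # 0.75
```

Under this reading, an AUC of 0.8921 means that roughly 89% of such pairs are ordered correctly, so a gap of one or two points between models corresponds to a small share of pairs being reordered.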


This review was generated by AI and checked by a human editor.