QUICK REVIEW

[Paper Review] How Well Does Generative Recommendation Generalize?

Yijie Ding, Zitian Guo|arXiv (Cornell University)|Mar 20, 2026

Recommender Systems and Techniques | Citations: 0

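One-Line Summary

Generative recommendation (GR) models handle instances that require generalization better, while item ID models excel at memorization. Token-level memorization explains most of the apparent generalization of GR models, and an adaptive ensemble of the two improves overall performance.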

ABSTRACT

A widely held hypothesis for why generative recommendation (GR) models outperform conventional item ID-based models is that they generalize better. However, there are few systematic ways to verify this hypothesis beyond a superficial comparison of overall performance. To address this gap, we categorize each data instance based on the specific capability required for a correct prediction: either memorization (reusing item transition patterns observed during training) or generalization (composing known patterns to predict unseen item transitions). Extensive experiments show that GR models perform better on instances that require generalization, whereas item ID-based models perform better when memorization is more important. To explain this divergence, we shift the analysis from the item level to the token level and show that what appears to be item-level generalization often reduces to token-level memorization for GR models. Finally, we show that the two paradigms are complementary. We propose a simple memorization-aware indicator that adaptively combines them on a per-instance basis, leading to improved overall recommendation performance.

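Motivation and Objectives

Investigate whether GR models generalize better than conventional item ID models, looking beyond overall performance.
Classify test instances as memorization or generalization based on item transition patterns.
Analyze how token-level transitions (prefix memorization) explain the item-level generalization of GR models.
Evaluate GR and item ID models on multiple real-world datasets and quantify performance per category.
Propose a memorization-aware ensemble strategy that combines GR and item ID models on a per-instance basis.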

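Proposed Method

Define memorization as the case where the 1-hop item transition [i_{t-1} -> i_t] is observed in the training data.
Define generalization through 1-hop and multi-hop categories (transitivity, symmetry, 2nd-order symmetry, substitutability).
Benchmark two models: TIGER (a GR model that uses semantic IDs) and SASRec (item ID-based).
Split the test data into memorization, generalization, and uncategorized subsets, and compare performance on each.
Introduce a token-level prefix n-gram memorization framework that explains item-level generalization through token memorization.
Use prefix n-gram counts and the composition of semantic IDs to examine how token-level memorization correlates with generalization.
Propose an adaptive ensemble that weights TIGER and SASRec per instance using an MSP-based memorization indicator.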
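The token-level prefix n-gram idea in the list above can be made concrete with a small sketch. The review does not give the paper's exact formulation, so everything here is an assumption for illustration: the function names are hypothetical, each semantic ID is taken to be a tuple of tokens, and a prefix n-gram is taken to be the previous item paired with the first n tokens of the next item's semantic ID. Under these assumptions, an item transition never seen in training can still have a fully memorized token prefix.

```python
from collections import Counter

def prefix_ngram_counts(train_sequences, semantic_ids):
    """Count (previous item, target token prefix) pairs seen in training.

    Assumption: semantic_ids maps each item to a tuple of tokens.
    """
    counts = Counter()
    for seq in train_sequences:
        for prev_item, next_item in zip(seq, seq[1:]):
            tokens = semantic_ids[next_item]
            # Record every prefix of the target's semantic ID.
            for n in range(1, len(tokens) + 1):
                counts[(prev_item, tokens[:n])] += 1
    return counts

def memorized_prefix_length(counts, semantic_ids, prev_item, target):
    """Return the longest target prefix length observed after prev_item."""
    tokens = semantic_ids[target]
    longest = 0
    for n in range(1, len(tokens) + 1):
        if counts[(prev_item, tokens[:n])] == 0:
            break
        longest = n
    return longest
```

For example, if training contains only the transition a -> b and item c shares its first two tokens with b, the unseen transition a -> c has a memorized prefix of length 2, even though it counts as generalization at the item level.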

Figure 1: Illustrated definitions for memorization vs. generalization. We define memorization and different sub-categories of generalization based on (1) the transition patterns observed in training data, and (2) the patterns required to infer.
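The categorization that Figure 1 illustrates can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the paper's procedure: the function names are hypothetical, symmetry is read as "the reverse transition was seen in training", and transitivity as "a 2-hop path through one intermediate item was seen in training". The 2nd-order symmetry and substitutability categories are omitted because the review does not define them.

```python
from collections import defaultdict

def build_transitions(train_sequences):
    """Collect every observed 1-hop transition (prev -> next) from training."""
    successors = defaultdict(set)
    for seq in train_sequences:
        for prev_item, next_item in zip(seq, seq[1:]):
            successors[prev_item].add(next_item)
    return successors

def categorize(successors, prev_item, target):
    """Label one test instance by the pattern needed to predict its target."""
    # Memorization: the exact transition was observed in training.
    if target in successors.get(prev_item, ()):
        return "memorization"
    # Symmetry (assumed reading): the reverse transition was observed.
    if prev_item in successors.get(target, ()):
        return "symmetry"
    # Transitivity (assumed reading): prev -> mid -> target was observed.
    for mid in successors.get(prev_item, ()):
        if target in successors.get(mid, ()):
            return "transitivity"
    return "uncategorized"
```

Checking memorization first matters: an instance that is both memorized and reachable by a multi-hop path should count as memorization, since the simplest sufficient pattern is the one the model needs.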

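Experimental Results

Research Questions

RQ1: Do GR models outperform item ID models on instances that require generalization, while underperforming on instances that rely on memorization?
RQ2: Can item-level generalization in GR be explained by token-level memorization within semantic IDs?
RQ3: How do different generalization types, such as transitivity, symmetry, substitutability, and the number of hops, affect model performance?
RQ4: Can an adaptive ensemble that uses a memorization indicator improve overall recommendation accuracy?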

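Key Findings

GR models generally outperform SASRec on the generalization-related subsets (across seven real-world datasets).
SASRec outperforms TIGER on the memorization-related subsets, showing that the two paradigms have complementary strengths.
Most test instances rely on generalization rather than memorization, and uncategorized instances account for less than 10%.
Most item-level generalization in GR reduces to token-level prefix memorization within semantic IDs.
Raising the token memorization ratio improves generalization, but may dilute item-level memorization.
An MSP-based adaptive ensemble that weights TIGER and SASRec per instance improves overall performance.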
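A per-instance weighting of this kind might look like the sketch below. The review does not spell out the indicator, so several points are assumptions: MSP is taken to mean maximum softmax probability (its common meaning), the indicator is computed from the item ID model's scores, both models are assumed to score the same candidate list, and the convex combination and function names are illustrative choices rather than the paper's rule.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_ensemble(id_scores, gr_scores):
    """Blend two models' candidate distributions for a single instance.

    Assumption: a high MSP from the item ID model signals a memorized
    transition, so that model receives more weight on this instance.
    """
    p_id = softmax(id_scores)
    p_gr = softmax(gr_scores)
    weight = max(p_id)  # MSP used as the memorization indicator
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_id, p_gr)]
```

When the item ID model is confident, its ranking dominates; when its distribution is flat, the blend leans on the GR model. Because the output is a convex combination of two probability distributions, it remains a valid distribution.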

Figure 2: Illustration of multi-hop generalization.

This review was generated by AI and checked by a human editor.