[Paper Review] Sensory-Aware Sequential Recommendation via Review-Distilled Representations
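ASEGR distills the output of a teacher LLM into a compact student encoder, enriching item representations for sequential recommendation with sensory attributes extracted from review text, and injects the resulting fixed sensory embeddings into standard models to improve next-item prediction.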
We propose a novel framework for sensory-aware sequential recommendation that enriches item representations with linguistically extracted sensory attributes from product reviews. Our approach, ASEGR (Attribute-based Sensory Enhanced Generative Recommendation), introduces a two-stage pipeline in which a large language model is first fine-tuned as a teacher to extract structured sensory attribute–value pairs, such as "color: matte black" and "scent: vanilla", from unstructured review text. The extracted structures are then distilled into a compact student transformer that produces fixed-dimensional sensory embeddings for each item. These embeddings encode experiential semantics in a reusable form and are incorporated into standard sequential recommender architectures as additional item-level representations. We evaluate our method on four Amazon domains and integrate the learned sensory embeddings into representative sequential recommendation models, including SASRec, BERT4Rec, and BSARec. Across domains, sensory-enhanced models consistently outperform their identifier-based counterparts, indicating that linguistically grounded sensory representations provide complementary signals to behavioral interaction patterns. Qualitative analysis further shows that the extracted attributes align closely with human perceptions of products, enabling interpretable connections between natural language descriptions and recommendation behavior. Overall, this work demonstrates that sensory attribute distillation offers a principled and scalable way to bridge information extraction and sequential recommendation through structured semantic representation learning.
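To make the idea of a structured attribute–value record concrete, here is a minimal sketch of how a teacher output might be parsed and filtered against a fixed facet schema. The facet set, the `parse_sensory_record` helper, and the lowercasing rule are all assumptions for illustration; the paper's actual record format is not given in this review.

```python
import json

# Hypothetical facet schema: the review names color, pattern, texture, and
# scent "etc.", so the real schema is likely larger than this set.
FACETS = {"color", "pattern", "texture", "scent"}

def parse_sensory_record(raw: str) -> dict[str, str]:
    """Parse a teacher output string, keeping only in-schema, non-empty facets."""
    record = json.loads(raw)
    cleaned = {}
    for facet, value in record.items():
        # Drop facets outside the schema and empty or non-string values.
        if facet in FACETS and isinstance(value, str) and value.strip():
            cleaned[facet] = value.strip().lower()
    return cleaned

# Illustrative teacher output (not taken from the paper's data).
raw_output = '{"color": "Matte Black", "scent": "vanilla", "price": "cheap", "texture": ""}'
record = parse_sensory_record(raw_output)
print(record)  # {'color': 'matte black', 'scent': 'vanilla'}
```

Keeping the facet keys fixed while leaving the values as free text matches the fixed-schema, open-vocabulary design described in this review.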
Motivation and Objectives
- Motivate enriching sequential recommender systems with explicit sensory semantics from reviews.
- Develop ASEGR to extract structured sensory attribute–value pairs using a teacher LLM and distill them into compact item embeddings.
- Enable seamless integration of sensory embeddings into existing sequential backbones without online LLM usage.
- Demonstrate that sensory embeddings provide complementary signals to interaction data across multiple domains and backbones.
Proposed Method
- Construct a fixed sensory facet schema (color, pattern, texture, scent, etc.) and extract open-vocabulary attribute values with a teacher LLM trained to produce structured JSON records.
- Fine-tune a teacher model on seed data to align with a reference sensory extraction standard and annotate a large catalog offline (2.67M items).
- Train a compact student encoder (DeBERTa v3 Small) to map item text to a 768-dimensional sensory embedding by regression to the teacher target plus contrastive learning: $\mathcal{L}_{\text{stu}} = \lVert f_\theta(x_i) - z_{\text{teach},i} \rVert^2 + \lambda \, \mathcal{L}_{\text{NCE}}$.
- Precompute and store sensory embeddings for all items; inject embeddings into sequential models via a unified early fusion operator at the input layer (project s_i and fuse with item ID embeddings).
- Evaluate sensory-augmented variants of SASRec, BERT4Rec, and BSARec on four Amazon domains using full-ranking HR and NDCG metrics, keeping training protocols identical so that any difference can be attributed to the sensory embeddings.
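The student objective and the early fusion step above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the in-batch InfoNCE form, the temperature, the value of lambda, the additive fusion, and the 64-dimensional backbone width are all choices made here, since the review only says that the sensory embedding is projected and fused with the item ID embedding.

```python
import numpy as np

def info_nce(student: np.ndarray, teacher: np.ndarray, temperature: float = 0.1) -> float:
    """In-batch InfoNCE: each student vector should match its own teacher vector."""
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / temperature                # (B, B) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # subtract row max for stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))    # positives lie on the diagonal

def student_loss(student, teacher, lam=0.5, temperature=0.1):
    """Squared-error regression to the teacher target plus lam * InfoNCE."""
    mse = float(np.mean(np.sum((student - teacher) ** 2, axis=1)))
    return mse + lam * info_nce(student, teacher, temperature)

def early_fusion(id_emb, sensory_emb, W, b):
    """Project the precomputed sensory embedding to the backbone width and
    add it to the ID embedding (additive fusion is an assumption)."""
    return id_emb + sensory_emb @ W + b

rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 768))                   # stand-in teacher targets
student = teacher + 0.01 * rng.normal(size=(4, 768))  # stand-in student outputs
print(student_loss(student, teacher))

W = rng.normal(scale=0.02, size=(768, 64))            # hypothetical projection
b = np.zeros(64)
id_emb = rng.normal(size=(4, 64))                     # hypothetical ID embeddings
fused = early_fusion(id_emb, student, W, b)
print(fused.shape)  # (4, 64)
```

Because the sensory embeddings are precomputed and stored, only the projection `W`, `b` and the backbone would be trained in this sketch; the student encoder is not called at recommendation time.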
Experimental Results
Research Questions
- RQ1: Can linguistically grounded, review-derived sensory attributes improve next-item prediction when integrated into standard sequential models?
- RQ2: Does distilling sensory information into compact embeddings maintain performance gains across multiple backbones and domains?
- RQ3: Do sensory embeddings provide interpretable and auditable signals aligned with human product perception?
- RQ4: Does the sensory extraction and distillation pipeline scale to large catalogs at an acceptable offline preprocessing cost?
- RQ5: How do sensory signals affect ranking behavior differently across datasets and model architectures?
Key Findings
- Sensory-enhanced models generally outperform their ID-only counterparts across backbones and domains, with gains in HR and NDCG at multiple cutoffs; the Sports results are mixed.
- Beauty and Toys domains show the strongest gains, with notable improvements such as SASRec HR@10 rising from 6.05 to 7.22 and NDCG@10 from 3.18 to 4.17 in Beauty, and substantial boosts in Toys for various backbones.
- The Sports domain exhibits mixed effects: SASRec improves in HR@10 but declines in NDCG@10, while BERT4Rec improves in both metrics.
- Across all domains and backbones, the sensory embeddings provide complementary signals that improve both retrieval and ranking, suggesting the utility of linguistically grounded sensory representations.
- The framework enables interpretability by grounding item embeddings in explicit sensory descriptors derived from reviews.
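For readers less familiar with the metrics quoted above, the following sketch computes full-ranking HR@K and NDCG@K and the relative gains implied by the reported Beauty numbers. It assumes a single held-out target item per user and ranks it against the whole catalog; the function names and the toy scores are illustrative, and the paper's exact evaluation code is not shown in this review.

```python
import numpy as np

def hr_ndcg_at_k(scores: np.ndarray, targets: np.ndarray, k: int = 10) -> tuple[float, float]:
    """scores: (num_users, num_items) over the full catalog;
    targets: (num_users,) index of each user's true next item."""
    target_scores = scores[np.arange(len(targets)), targets]
    # 0-based rank = number of items scored strictly higher than the target
    # (ties are resolved in the target's favor in this sketch).
    ranks = (scores > target_scores[:, None]).sum(axis=1)
    hit = ranks < k
    hr = hit.mean()
    # With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 2).
    ndcg = np.where(hit, 1.0 / np.log2(ranks + 2), 0.0).mean()
    return float(hr), float(ndcg)

def relative_gain(baseline: float, enhanced: float) -> float:
    """Relative improvement in percent."""
    return 100.0 * (enhanced - baseline) / baseline

# Toy example: two users, three items.
scores = np.array([[0.1, 0.9, 0.3],
                   [0.8, 0.2, 0.5]])
targets = np.array([1, 2])
hr1, ndcg1 = hr_ndcg_at_k(scores, targets, k=1)
print(hr1, ndcg1)  # 0.5 0.5

# Relative gains for the SASRec Beauty figures reported above.
print(round(relative_gain(6.05, 7.22), 1))  # 19.3
print(round(relative_gain(3.18, 4.17), 1))  # 31.1
```

The larger relative gain in NDCG@10 than in HR@10 for SASRec on Beauty suggests that the target item is not only retrieved more often but also placed higher within the top 10.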
This review was generated by AI and checked by a human editor.