QUICK REVIEW

[Paper Review] Large Language Models for Generative Recommendation: A Survey and Visionary Discussions

Lei Li, Yongfeng Zhang|arXiv (Cornell University)|Sep 3, 2023

Topic Modeling | Cited by 11

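One-sentence summary

A survey of how large language models can serve as single-stage generative recommender systems, covering the definition, ID construction, task formulation, and future directions.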

ABSTRACT

Large language models (LLM) not only have revolutionized the field of natural language processing (NLP) but also have the potential to reshape many other fields, e.g., recommender systems (RS). However, most of the related work treats an LLM as a component of the conventional recommendation pipeline (e.g., as a feature extractor), which may not be able to fully leverage the generative power of LLM. Instead of separating the recommendation process into multiple stages, such as score computation and re-ranking, this process can be simplified to one stage with LLM: directly generating recommendations from the complete pool of items. This survey reviews the progress, methods, and future directions of LLM-based generative recommendation by examining three questions: 1) What generative recommendation is, 2) Why RS should advance to generative recommendation, and 3) How to implement LLM-based generative recommendation for various RS tasks. We hope that this survey can provide the context and guidance needed to explore this interesting and emerging topic.

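Research motivation and goals

Explain why recommender systems (RS) should use LLMs for generative, single-stage recommendation.
Generalize user and item IDs into token sequences compatible with LLMs, so that generation works over large item pools.
Systematically categorize and formulate common RS tasks (rating prediction, Top-N, sequential, explainable, review-related, and conversational recommendation) within an LLM framework.
Provide guidance on ID creation methods and on practical considerations for implementing LLM-based generative recommendation.
Discuss challenges and opportunities such as hallucination, bias, transparency, controllability, efficiency, and multimodal extension.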

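Proposed method

Define generative recommendation as directly generating recommendations from the complete item pool, rather than scoring items across multiple stages.
Propose a general definition of a user/item ID as a token sequence that uniquely identifies an entity.
Review three ID creation methods: IDs based on singular value decomposition (SVD), IDs based on product quantization (PQ), and collaborative indexing via a hierarchical item graph.
Give a general formulation for applying LLMs to common RS tasks through task prompts and ID representations.
Outline evaluation protocols and practical considerations for the different generation tasks (rating, Top-N, sequential, explanation, review, summarization, and conversational settings).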
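
To make the idea of an ID as a token sequence more concrete, here is a minimal sketch of the quantization step behind a product-quantization-style ID. It is an illustration under stated assumptions, not the procedure of any method the survey reviews: the codebooks are assumed to be already learned, the embeddings are toy values, the token format (`<c0_1>`, `<u_0>`) is hypothetical, and the trailing counter token is just one possible way to keep IDs unique when two items receive the same codes.

```python
from collections import defaultdict

def nearest(sub, centroids):
    """Return the index of the centroid closest to `sub` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in centroids]
    return dists.index(min(dists))

def pq_codes(vec, codebooks):
    """Quantize `vec`: one codeword index per sub-vector.

    Assumes len(vec) is divisible by the number of codebooks.
    """
    m = len(codebooks)
    step = len(vec) // m
    return tuple(
        nearest(vec[i * step:(i + 1) * step], codebooks[i]) for i in range(m)
    )

def build_item_ids(embeddings, codebooks):
    """Map each item to a token-sequence ID.

    A trailing counter token (an assumption of this sketch) breaks ties
    when two items are quantized to the same codes.
    """
    seen = defaultdict(int)
    ids = {}
    for item, vec in embeddings.items():
        codes = pq_codes(vec, codebooks)
        tokens = [f"<c{i}_{code}>" for i, code in enumerate(codes)]
        tokens.append(f"<u_{seen[codes]}>")
        seen[codes] += 1
        ids[item] = tokens
    return ids

# Hypothetical toy data: 4-dimensional embeddings, two sub-spaces,
# two centroids per sub-space.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for dimensions 0-1
    [[0.0, 1.0], [1.0, 0.0]],  # codebook for dimensions 2-3
]
embeddings = {
    "item_a": [0.1, 0.0, 0.1, 0.9],
    "item_b": [0.9, 1.0, 0.8, 0.1],
    "item_c": [0.2, 0.1, 0.0, 1.0],
}
item_ids = build_item_ids(embeddings, codebooks)
```

In this toy setup `item_a` and `item_c` fall into the same cells, so they share the code tokens and differ only in the final counter token. Items with similar embeddings therefore share a token prefix, while every ID still identifies exactly one item.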

Figure 1: Pipeline comparison between traditional recommender systems and LLM-based generative recommendation.

Experimental results

Research questions

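RQ1: What is generative recommendation, and why should RS adopt it instead of the traditional discriminative pipeline?
RQ2: How can IDs be created in an LLM-friendly way that preserves collaborative information and scales to real-world item pools?
RQ3: How can typical RS tasks be formulated and solved within an LLM-based generative framework?
RQ4: What are the key challenges and directions for advancing LLM-based generative recommendation?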
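
As a rough illustration of the task formulation in RQ3, the sketch below turns a few RS tasks into text prompts that embed user and item IDs. Every template string, the `build_prompt` helper, and the ID tokens such as `<user_42>` are hypothetical examples written for this review. None of them is taken from the survey.

```python
# Hypothetical prompt templates, one per recommendation task.
TEMPLATES = {
    "rating": "How would user {user} rate item {item} on a scale of 1 to 5?",
    "top_n": "Recommend {n} items for user {user}.",
    "sequential": "User {user} has interacted with items {history}. "
                  "Which item comes next?",
    "explanation": "Explain why item {item} is recommended to user {user}.",
}

def build_prompt(task, **fields):
    """Fill the template for `task` with ID tokens and other fields."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    return TEMPLATES[task].format(**fields)

# Usage: a sequential-recommendation prompt over made-up ID tokens.
history = ", ".join(["<item_7>", "<item_3>", "<item_19>"])
prompt = build_prompt("sequential", user="<user_42>", history=history)
print(prompt)
```

The LLM would then be expected to answer with an item ID (for recommendation tasks) or with free text (for explanation tasks). How the answer is decoded and checked against the item pool is outside this sketch.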

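Key findings

LLMs have the potential to replace multi-stage filtering with a single-stage generative process that directly outputs item IDs.
Three ID creation strategies yield compact, unique item/user representations suitable for LLMs: SVD-based IDs, PQ-based IDs, and hierarchical collaborative indexing.
A range of RS tasks (rating, Top-N, sequential, explainable, review-related, and conversational) can be formulated as prompts that instruct the LLM to generate IDs or content.
Evaluation combines traditional metrics for ranking with generation metrics for natural-language output (BLEU/ROUGE/BERTScore), while acknowledging their limitations and the need for better standards.
The paper discusses practical issues such as hallucination, bias, transparency, controllability, and efficiency as core challenges for deploying LLM-based RS.
Multimodal extension is identified as a promising area for future work, integrating non-textual data into LLM-based recommendation.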
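
The summary above does not name the traditional ranking metrics. As an assumed example, the sketch below computes two that are widely used in recommendation, Hit Ratio@K and NDCG@K, for the common case of a single held-out target item per user. The item IDs are made up. A generated ID that does not exist in the item pool (a hallucination) simply never matches the target and counts as a miss.

```python
import math

def hit_ratio_at_k(ranked, target, k):
    """Return 1.0 if the target item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """NDCG@K with one relevant item.

    The ideal DCG is 1 (target at rank 1), so NDCG reduces to
    1 / log2(rank + 1), where rank is the 1-based position of the target.
    """
    for pos, item in enumerate(ranked[:k]):
        if item == target:
            return 1.0 / math.log2(pos + 2)
    return 0.0

# Usage: IDs generated by a model for one user, in ranked order (toy data).
generated = ["<item_3>", "<item_7>", "<item_1>"]
hr = hit_ratio_at_k(generated, "<item_7>", k=2)
ndcg = ndcg_at_k(generated, "<item_7>", k=2)
```

Averaging these per-user values over all test users gives the reported HR@K and NDCG@K. Generation metrics such as BLEU or ROUGE apply to the free-text outputs instead and are not covered by this sketch.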


This review was generated by AI and reviewed by human editors.