QUICK REVIEW

[Paper Review] M6-Rec: Generative Pretrained Language Models are Open-Ended Recommender Systems

Zeyu Cui, Jianxin Ma|arXiv (Cornell University)|May 17, 2022

Topic Modeling | Citations: 24

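One-line summary

This paper proposes M6-Rec, a foundation model for open-ended recommender systems. It converts recommendation tasks into language understanding or generation, and introduces efficient adaptation and deployment techniques for both cloud servers and edge devices.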

ABSTRACT

Industrial recommender systems have been growing increasingly complex, may involve diverse domains such as e-commerce products and user-generated contents, and can comprise a myriad of tasks such as retrieval, ranking, explanation generation, and even AI-assisted content production. The mainstream approach so far is to develop individual algorithms for each domain and each task. In this paper, we explore the possibility of developing a unified foundation model to support open-ended domains and tasks in an industrial recommender system, which may reduce the demand on downstream settings' data and can minimize the carbon footprint by avoiding training a separate model from scratch for every task. Deriving a unified foundation is challenging due to (i) the potentially unlimited set of downstream domains and tasks, and (ii) the real-world systems' emphasis on computational efficiency. We thus build our foundation upon M6, an existing large-scale industrial pretrained language model similar to GPT-3 and T5, and leverage M6's pretrained ability for sample-efficient downstream adaptation, by representing user behavior data as plain texts and converting the tasks to either language understanding or generation. To deal with a tight hardware budget, we propose an improved version of prompt tuning that outperforms fine-tuning with negligible 1% task-specific parameters, and employ techniques such as late interaction, early exiting, parameter sharing, and pruning to further reduce the inference time and the model size. We demonstrate the foundation model's versatility on a wide range of tasks such as retrieval, ranking, zero-shot recommendation, explanation generation, personalized content creation, and conversational recommendation, and manage to deploy it on both cloud servers and mobile devices.
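
The abstract's central idea, representing user behavior as plain text and casting a task as language understanding, can be illustrated with a minimal sketch. The `Event` schema, the prompt wording, and the function names below are assumptions made for illustration; they are not the templates used in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """One logged user interaction (hypothetical schema, for illustration only)."""
    action: str    # e.g. "clicked" or "purchased"
    title: str     # item title as plain text, used instead of an item ID
    category: str


def behavior_to_text(profile: str, events: List[Event], max_events: int = 3) -> str:
    """Serialize the most recent events into a single plain-text description."""
    recent = events[-max_events:]
    history = " ".join(
        f"{i}. {e.action} an item named '{e.title}' in category '{e.category}'."
        for i, e in enumerate(recent, start=1)
    )
    return f"User: {profile}. Recent behavior: {history}"


def ranking_prompt(profile: str, events: List[Event], candidate_title: str) -> str:
    """Cast click prediction as language understanding: a language model
    would be asked to score this text (the model itself is not included here)."""
    context = behavior_to_text(profile, events)
    return f"{context} Candidate: '{candidate_title}'. Will the user click?"


if __name__ == "__main__":
    log = [
        Event("clicked", "running shoes", "sports"),
        Event("purchased", "yoga mat", "sports"),
    ]
    print(ranking_prompt("a user in Hangzhou", log, "foam roller"))
```

Because items appear by title and category rather than by ID, the same serialization works for items or domains that were never seen during training, which is what makes zero-shot use plausible.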

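Motivation and objectives

Motivate a unified foundation model for industrial recommender systems that span multiple domains and tasks.
Show that behavior data expressed as text for a language model supports sample-efficient downstream adaptation without item IDs.
Demonstrate practical efficiency techniques suited to deployment on cloud servers and mobile devices (option tuning, late interaction, pruning, quantization, early exiting).
Evaluate zero-shot, few-shot, and generative tasks, including retrieval, ranking, explanation generation, personalized content creation, and conversational recommendation.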

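Proposed method

Extend the pretrained M6 language model into M6-Rec by representing user behavior data as plain text and converting tasks into language understanding or generation.
Introduce option tuning, an efficient variant of prompt tuning that adjusts only a small number of soft prompts while keeping most parameters frozen.
Adopt multi-segment late interaction for low-latency inference: the first layers are precomputed per segment, and interaction between segments takes place only in the last layers.
Incorporate adapters containing FFN layers (option-adapter tuning) to further bridge the gap between prompt tuning and fine-tuning.
Compress the model into M6-Edge for edge devices through distillation, then apply pruning, quantization, and early exiting to accommodate varying hardware budgets.
Use contrastive learning to align user and item representations, with 128-dimensional vectors and kNN search for retrieval.
Define generation formats for explainable recommendation, personalized design, search query generation, and conversational settings.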
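
The caching idea behind multi-segment late interaction can be sketched in a few lines. This is a toy illustration under stated assumptions: `_early_layers` and `_late_layers` are hypothetical stand-ins (simple text features and a dot product) for the first and last transformer layers, and the class name is invented here. It shows only the control flow, not the model from the paper.

```python
from typing import Dict, List


class LateInteractionScorer:
    """Toy scorer: per-segment work is cached, cross-segment work runs per request."""

    def __init__(self) -> None:
        self.cache: Dict[str, List[float]] = {}
        self.early_calls = 0  # counts how often the expensive early stage runs

    def _early_layers(self, segment: str) -> List[float]:
        """Stand-in for the first layers, which see one segment in isolation."""
        self.early_calls += 1
        vowels = sum(ch in "aeiou" for ch in segment.lower())
        return [float(len(segment)), float(len(segment.split())), float(vowels)]

    def encode_segment(self, segment: str) -> List[float]:
        """Return the cached encoding, computing it only on first use."""
        if segment not in self.cache:
            self.cache[segment] = self._early_layers(segment)
        return self.cache[segment]

    def _late_layers(self, user_vecs: List[List[float]], item_vec: List[float]) -> float:
        """Stand-in for the last layers, where segments finally interact:
        mean-pool the user segments, then take a dot product with the item."""
        dim = len(item_vec)
        pooled = [sum(v[d] for v in user_vecs) / len(user_vecs) for d in range(dim)]
        return sum(p * q for p, q in zip(pooled, item_vec))

    def score(self, user_segments: List[str], item_segment: str) -> float:
        user_vecs = [self.encode_segment(s) for s in user_segments]
        item_vec = self.encode_segment(item_segment)
        return self._late_layers(user_vecs, item_vec)


if __name__ == "__main__":
    scorer = LateInteractionScorer()
    history = ["clicked running shoes", "purchased yoga mat"]
    scorer.score(history, "trail running socks")
    scorer.score(history, "foam roller")
    # The two history segments were encoded once and reused for the second candidate.
    print("early-stage calls:", scorer.early_calls)
```

The design point is that the per-segment stage depends on one segment only, so its output can be stored and reused across requests and candidates, leaving just the cheaper interaction stage on the latency-critical path.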

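Experimental results

Research questions

RQ1: Can a single foundation model support open-ended domains and tasks in industrial recommender systems?
RQ2: Does a textual representation of user behavior enable effective zero-shot and few-shot learning across tasks such as retrieval, ranking, generation, and conversation?
RQ3: Can prompt-based and adapter-based parameter-efficient tuning achieve competitive performance with minimal task-specific parameters?
RQ4: Which deployment strategies (late interaction, pruning, quantization, early exiting) enable cloud-to-edge deployment without sacrificing latency or accuracy?
RQ5: How well does the unified model handle generative tasks such as explanation, personalized design, and conversational recommendation?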

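Key findings

M6-Rec supports zero-shot and few-shot learning and performs well across tasks ranging from retrieval and ranking to generation and conversational recommendation.
Option tuning outperforms standard fine-tuning while adjusting only a small fraction of the parameters (about 1% task-specific parameters, per the abstract), most notably when combined with adapters (option-adapter tuning).
Multi-segment late interaction achieves low-latency inference by precomputing the earlier transformer layers and caching the per-segment results.
Edge deployment with M6-Edge applies pruning and 8-bit quantization after distillation, cutting model size substantially, from about 300M to roughly 2M, while keeping performance competitive.
Text-based behavior modeling enables open-domain recommendation, reduces reliance on item IDs, and makes generative tasks such as explainable recommendation and personalized content creation possible.
The approach is deployable on both cloud servers and edge devices and shows measurable gains on retrieval, CTR prediction, and generation tasks.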
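
To make the 8-bit quantization step concrete, here is a minimal sketch of symmetric per-tensor int8 quantization. This is a generic textbook scheme chosen for illustration; the review does not state which quantization scheme M6-Edge uses, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    if not weights:
        return [], 1.0
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize_int8(codes, scale)
    worst = max(abs(r - w) for r, w in zip(restored, weights))
    print("codes:", codes)
    print("worst-case reconstruction error:", worst)
```

Each weight is stored in one byte instead of the four bytes of a float32, so quantization alone gives roughly a 4x size reduction; the larger reduction reported above also relies on distillation and pruning.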

This review was generated by AI and checked by a human editor.