QUICK REVIEW

[Paper Review] AIGQ: An End-to-End Hybrid Generative Architecture for E-commerce Query Recommendation

Jingcao Xu, Jianyun Zou | arXiv (Cornell University) | Mar 20, 2026

Recommender Systems and Techniques | Cited by: 0

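One-line summary

AIGQ presents an end-to-end generative framework for pre-search hint query recommendation, introducing IL-SFT, IL-GRPO, and a hybrid deployment of AIGQ-Direct and AIGQ-Think, and evaluates offline and online gains on Taobao.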

ABSTRACT

Pre-search query recommendation, widely known as HintQ on Taobao's homepage, plays a vital role in intent capture and demand discovery, yet traditional methods suffer from shallow semantics, poor cold-start performance and low serendipity due to reliance on ID-based matching and co-click heuristics. To overcome these challenges, we propose AIGQ (AI-Generated Query architecture), the first end-to-end generative framework for HintQ scenario. AIGQ is built upon three core innovations spanning training paradigm, policy optimization and deployment architecture. First, we propose Interest-Aware List Supervised Fine-Tuning (IL-SFT), a list-level supervised learning approach that constructs training samples through session-aware behavior aggregation and interest-guided re-ranking strategy to faithfully model nuanced user intent. Accordingly, we design Interest-aware List Group Relative Policy Optimization (IL-GRPO), a novel policy gradient algorithm with a dual-component reward mechanism that jointly optimizes individual query relevance and global list properties, enhanced by a model-based reward from the online click-through rate (CTR) ranking model. To deploy under strict real-time and low-latency requirements, we further develop a hybrid offline-online architecture comprising AIGQ-Direct for nearline personalized user-to-query generation and AIGQ-Think, a reasoning-enhanced variant that produces trigger-to-query mappings to enrich interest diversity. Extensive offline evaluations and large-scale online A/B experiments on Taobao demonstrate that AIGQ consistently delivers substantial improvements in key business metrics across platform effectiveness and user engagement.

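Motivation and objectives

Motivates the pre-search hint query recommendation (HintQ) problem in e-commerce: cold start, shallow semantics, and latency.
Proposes an end-to-end generative framework that optimizes relevance and diversity over the whole list.
Develops training and policy optimization methods that align generation with real user preferences.
Demonstrates a hybrid offline-online deployment that achieves low latency and strong business metrics.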

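Proposed method

Introduces two LLM variants, AIGQ-Direct and AIGQ-Think, aiming at high precision and broad interest coverage.
Develops Interest-Aware List SFT (IL-SFT), trained on session-aware behavior and interest-guided labels.
Proposes Interest-aware List GRPO (IL-GRPO), a reinforcement learning method with a two-component reward at the query level and the sequence level.
Designs a hybrid offline-online deployment: AIGQ-Direct for nearline u2q (user-to-query) hints and AIGQ-Think for x2q (trigger-to-query) mappings.
Incorporates prompt compression and item-to-text proxies to encode user history efficiently for generation.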
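
The two-component reward in IL-GRPO can be illustrated with a minimal sketch. This review does not reproduce the paper's reward formulas, so the per-query relevance scores, the category-based list-level term, the weight `alpha`, and both function names below are assumptions made for illustration. Only the group-relative normalization (reward minus group mean, divided by group standard deviation) follows the usual GRPO recipe.

```python
from statistics import mean, pstdev


def list_reward(query_scores, categories, alpha=0.5):
    """Hypothetical two-component reward for one generated query list.

    query_scores: per-query relevance in [0, 1] (assumed, e.g. from a CTR model).
    categories:   category label of each query, used for a list-level term.
    alpha:        assumed weight between the two components.
    """
    query_level = mean(query_scores)                      # individual relevance
    list_level = len(set(categories)) / len(categories)   # share of distinct categories
    return alpha * query_level + (1 - alpha) * list_level


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style normalization of rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three candidate lists sampled for the same user context (toy data).
group = [
    ([0.9, 0.8, 0.7], ["shoes", "shoes", "shoes"]),  # relevant but redundant
    ([0.7, 0.6, 0.8], ["shoes", "bags", "hats"]),    # relevant and diverse
    ([0.2, 0.3, 0.1], ["shoes", "bags", "bags"]),    # weakly relevant
]
rewards = [list_reward(scores, cats) for scores, cats in group]
advantages = group_relative_advantages(rewards)
```

In this toy group the relevant and diverse list receives the only positive advantage, which is the behavior a joint query-level and list-level reward is meant to encourage.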

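Experimental results

Research questions

RQ1: How can a ranked list of personalized hint queries be generated end to end, without offline candidate retrieval?
RQ2: Can a two-level reinforcement learning objective balance individual query relevance with the diversity and consistency of the whole list?
RQ3: Does integrating reasoning (CoT) in AIGQ-Think improve diversity and user engagement without hurting latency?
RQ4: What deployment architecture can meet strict online latency while maintaining personalization and coverage?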

Key findings

Model	Category HR@30	Semantic Sim.	Query HR@30	Unique Categories
EBR	0.1998	0.5198	0.0100	8.4
Qwen3-30B-A3B	0.3054	0.5554	0.0022	5.4
Gemini 3 Pro	0.3449	0.5475	0.0017	7.0
GPT-5.1	0.3353	0.5606	0.0021	4.6
AIGQ-Direct_SFT	0.4181	0.5926	0.0428	7.1
AIGQ-Think_SFT	0.4437	0.6358	0.0549	7.7
AIGQ-Direct_IL-SFT	0.4305	0.5946	0.0442	7.5
AIGQ-Think_IL-SFT	0.4653	0.6478	0.0559	10.3
AIGQ-Direct_IL-SFT+GRPO	0.3906	0.6116	0.0634	4.7
AIGQ-Think_IL-SFT+GRPO	0.4438	0.6421	0.0614	8.0
AIGQ-Direct_IL-SFT+IL-GRPO	0.4695	0.6341	0.0679	6.7
AIGQ-Think_IL-SFT+IL-GRPO	0.4704	0.6624	0.0745	9.8
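
To make the gaps in the table concrete, relative improvements can be computed directly from the reported numbers. The snippet below uses only first-column values copied from the table above; the helper name `relative_gain` is illustrative and not from the paper.

```python
def relative_gain(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# First metric column (category-level HR@30), copied from the table.
cate_hr30 = {
    "EBR": 0.1998,
    "Qwen3-30B-A3B": 0.3054,
    "Gemini 3 Pro": 0.3449,
    "GPT-5.1": 0.3353,
    "AIGQ-Think_IL-SFT+IL-GRPO": 0.4704,
}

best = cate_hr30["AIGQ-Think_IL-SFT+IL-GRPO"]
gains = {
    name: relative_gain(best, value)
    for name, value in cate_hr30.items()
    if name != "AIGQ-Think_IL-SFT+IL-GRPO"
}

for name, gain in gains.items():
    print(f"vs {name}: +{gain:.1f}%")
```

On this metric the full AIGQ-Think model is roughly 135% above EBR and roughly 36% above the strongest general-purpose LLM baseline in the table (Gemini 3 Pro).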

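AIGQ models with IL-SFT and IL-GRPO outperform the baselines on Taobao data (on metrics such as HR@K and semantic similarity).
AIGQ-Think, with CoT reasoning and hierarchical rewards, achieves higher diversity and coverage than AIGQ-Direct.
The hybrid deployment (nearline u2q + real-time x2q) delivers strong business metrics while meeting the latency budget.
Online A/B tests show substantial improvements in platform effectiveness and user engagement.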
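
The hybrid deployment can be pictured as a lookup-and-merge step at request time. The sketch below is an assumption-heavy illustration, not the paper's serving code: it assumes the u2q hints and the x2q trigger-to-query mappings are both available as precomputed key-value stores, and the names `serve_hints`, `u2q_cache`, and `x2q_table`, as well as the merge order and de-duplication rule, are hypothetical.

```python
def serve_hints(user_id, recent_triggers, u2q_cache, x2q_table, k=5):
    """Merge precomputed hint queries for one request (hypothetical logic).

    u2q_cache: user id -> personalized queries (assumed nearline output).
    x2q_table: trigger -> queries (assumed trigger-to-query mapping).
    """
    # Personalized hints first, then hints derived from recent triggers.
    candidates = list(u2q_cache.get(user_id, []))
    for trigger in recent_triggers:
        candidates.extend(x2q_table.get(trigger, []))

    # De-duplicate while preserving order, then truncate to k hints.
    seen, merged = set(), []
    for query in candidates:
        if query not in seen:
            seen.add(query)
            merged.append(query)
    return merged[:k]


# Toy stores standing in for the two precomputed sources.
u2q_cache = {"u1": ["running shoes", "trail socks"]}
x2q_table = {
    "tent": ["camping stove", "sleeping bag"],
    "yoga mat": ["trail socks", "resistance bands"],
}

known_user = serve_hints("u1", ["tent", "yoga mat"], u2q_cache, x2q_table)
new_user = serve_hints("u2", ["tent"], u2q_cache, x2q_table)
```

Under these assumptions a user with no cached u2q hints still receives trigger-based suggestions, which is one way a trigger-to-query path could help with cold start and coverage.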

This review was generated by AI and checked by a human editor.