QUICK REVIEW

[Paper Digest] Query Rewriting for Retrieval-Augmented Large Language Models

Xinbei Ma, Yeyun Gong|arXiv (Cornell University)|May 23, 2023

Topic Modeling|Cited by 10

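ONE-SENTENCE SUMMARY

This paper proposes the Rewrite-Retrieve-Read framework, which places a trainable query rewriter in front of a frozen retriever and an LLM reader to improve retrieval-augmented LLMs, and fine-tunes the rewriter with reinforcement learning.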

ABSTRACT

Large Language Models (LLMs) play powerful, black-box readers in the retrieve-then-read pipeline, making remarkable progress in knowledge-intensive tasks. This work introduces a new framework, Rewrite-Retrieve-Read instead of the previous retrieve-then-read for the retrieval-augmented LLMs from the perspective of the query rewriting. Unlike prior studies focusing on adapting either the retriever or the reader, our approach pays attention to the adaptation of the search query itself, for there is inevitably a gap between the input text and the needed knowledge in retrieval. We first prompt an LLM to generate the query, then use a web search engine to retrieve contexts. Furthermore, to better align the query to the frozen modules, we propose a trainable scheme for our pipeline. A small language model is adopted as a trainable rewriter to cater to the black-box LLM reader. The rewriter is trained using the feedback of the LLM reader by reinforcement learning. Evaluation is conducted on downstream tasks, open-domain QA and multiple-choice QA. Experiments results show consistent performance improvement, indicating that our framework is proven effective and scalable, and brings a new framework for retrieval-augmented LLM.

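MOTIVATION AND OBJECTIVES

Identify and address the gap between the input text and the knowledge that retrieval needs to surface in retrieval-augmented LLMs.
Propose a Rewrite-Retrieve-Read pipeline that adds a query rewriting step before retrieval.
Introduce a trainable rewriter, built on a small language model and trained with reinforcement learning, so that its queries align with the frozen reader and retriever.
Demonstrate the effectiveness and scalability of the approach on knowledge-intensive tasks.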

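PROPOSED METHOD

Define the Rewrite-Retrieve-Read pipeline in three steps: rewrite the input into a query, retrieve relevant context with a web search engine, and read the context to predict the answer.
Implement a trainable rewriter G_theta, initialized from T5-large, that is warmed up on pseudo data and then fine-tuned with reinforcement learning, using rewards derived from the LLM reader's performance.
Obtain queries from an LLM with few-shot prompting as a baseline rewriter, and compare it against the trainable rewriter.
Train the rewriter with PPO-based policy optimization, including a value network and KL regularization that keeps the policy close to its initialization.
Evaluate on open-domain QA datasets (HotPotQA, AmbigNQ, PopQA) and multiple-choice QA (MMLU), with ChatGPT and Vicuna-13B as readers.
Use Bing as the retriever and BM25 for document filtering, with snippet-based and URL-based retrieval variants.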
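
The three pipeline steps and the BM25 filtering stage can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the regex tokenizer, the BM25 parameters (k1 = 1.5, b = 0.75), and the toy rewriter, search, and reader stubs are all assumptions made for the example. In the paper's setting the rewriter would be an LLM or the trained T5 model, the search call would go to a web search engine, and the reader would be a black-box LLM.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase word tokenizer (an assumption; any tokenizer could be used)."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / max(n_docs, 1)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rewrite_retrieve_read(
    question: str,
    rewriter: Callable[[str], str],
    search: Callable[[str], List[str]],
    reader: Callable[[str, List[str]], str],
    top_k: int = 2,
) -> str:
    """Run the three steps: rewrite, retrieve (with BM25 filtering), read."""
    query = rewriter(question)                  # step 1: rewrite
    candidates = search(query)                  # step 2: retrieve
    scores = bm25_scores(query, candidates)     # keep the best-matching documents
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    context = [doc for _, doc in ranked[:top_k]]
    return reader(question, context)            # step 3: read


# Hypothetical stand-ins for the real modules, for illustration only.
def toy_rewriter(question: str) -> str:
    stop = {"what", "is", "the", "of", "who", "which", "a", "an", "in"}
    return " ".join(w for w in tokenize(question) if w not in stop)


def toy_search(query: str) -> List[str]:
    return [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "France is known for its cheese and wine.",
    ]


def toy_reader(question: str, context: List[str]) -> str:
    return context[0] if context else ""


if __name__ == "__main__":
    print(rewrite_retrieve_read(
        "What is the capital of France?", toy_rewriter, toy_search, toy_reader
    ))
```

Passing the three modules in as callables mirrors the framing in the summary: the retriever and reader stay fixed, and only the rewriter is swapped between the few-shot LLM baseline and the trained model.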

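EXPERIMENTAL RESULTS

Research Questions

RQ1: Does adding a query rewriting step improve retrieval-augmented LLM performance over the standard retrieve-then-read pipeline?
RQ2: How does a trainable rewriter optimized with reinforcement learning compare with a frozen LLM rewriter and with direct prompt-based rewriting?
RQ3: How does query rewriting affect accuracy on open-domain QA and multiple-choice QA when the reader is a black-box LLM?

Key Findings

Query rewriting consistently improves retrieval-augmented LLM performance on the open-domain QA datasets relative to the direct and standard retrieve-then-read baselines.
The trainable rewriter generally outperforms the standard retrieve-then-read setup and, on several tasks, approaches or matches the LLM-based rewriter.
Across datasets, gains vary by task and reader: on some datasets the LLM rewriter remains stronger, while the trainable rewriter delivers competitive gains at lower resource cost.
On multiple-choice QA (MMLU), rewriting brings gains in most categories, and the gains are larger with Vicuna-13B as the reader than with ChatGPT.
The rewriter trained with reinforcement learning tailors queries to the frozen retriever and reader better than the prompt-based baseline.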
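
The reinforcement learning signal behind the trained rewriter can be sketched as a reader-based reward combined with a KL penalty toward the initial policy. This is a hedged illustration under stated assumptions, not the paper's exact reward: exact match is used here as a stand-in for the reader-performance score, the penalty coefficient `beta` is a hypothetical value, and adding the reader reward at the final token of the rewritten query is a common convention in PPO fine-tuning of language models rather than something this digest specifies.

```python
from typing import List


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 when the normalized strings match, else 0.0 (assumed metric)."""
    return float(prediction.strip().lower() == gold.strip().lower())


def shaped_rewards(
    policy_logprobs: List[float],
    ref_logprobs: List[float],
    reader_reward: float,
    beta: float = 0.1,
) -> List[float]:
    """Per-token rewards for one rewritten query.

    policy_logprobs: log-probabilities of the sampled tokens under the current rewriter.
    ref_logprobs: log-probabilities of the same tokens under the initial (warmed-up) rewriter.
    reader_reward: scalar score of the reader's answer, e.g. exact_match(...).
    beta: weight of the KL penalty (hypothetical default).
    """
    if len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("log-probability sequences must have the same length")
    if not policy_logprobs:
        return []
    # Sample-based KL estimate per token: log pi_theta(a|s) - log pi_0(a|s).
    rewards = [-beta * (p - r) for p, r in zip(policy_logprobs, ref_logprobs)]
    # The reader's feedback arrives once the whole query has been generated.
    rewards[-1] += reader_reward
    return rewards
```

A training loop would call `shaped_rewards` once per sampled query, with `reader_reward` computed from the frozen reader's answer to the original question, and pass the result to the PPO advantage estimator together with the value network's predictions.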

This digest was generated by AI and reviewed by human editors.