QUICK REVIEW

[Paper Review] Corrective Retrieval Augmented Generation

Shi-Qi Yan, Jia-Chen Gu | arXiv (Cornell University) | Jan 29, 2024

Speech and dialogue systems | Citations: 23

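One-line summary

CRAG improves the robustness of Retrieval-Augmented Generation by introducing a lightweight retrieval evaluator and corrective actions that handle erroneous or ambiguous retrieval, supplemented by web search and a decompose-then-recompose refinement of the retrieved text.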

ABSTRACT

Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate. Although retrieval-augmented generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong. To this end, we propose the Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation. Specifically, a lightweight retrieval evaluator is designed to assess the overall quality of retrieved documents for a query, returning a confidence degree based on which different knowledge retrieval actions can be triggered. Since retrieval from static and limited corpora can only return sub-optimal documents, large-scale web searches are utilized as an extension for augmenting the retrieval results. Besides, a decompose-then-recompose algorithm is designed for retrieved documents to selectively focus on key information and filter out irrelevant information in them. CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches. Experiments on four datasets covering short- and long-form generation tasks show that CRAG can significantly improve the performance of RAG-based approaches.

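Motivation and objectives

Address errors and hallucinations in LLM output and make RAG more robust when retrieval goes wrong.
Propose a lightweight retrieval evaluator that assesses the quality of the documents retrieved for a query.
Introduce corrective actions (Correct, Incorrect, Ambiguous) that trigger decompose-then-recompose refinement or web search to augment knowledge.
Develop a plug-and-play CRAG module that can be integrated with standard RAG and Self-RAG approaches across various domains.
Demonstrate that CRAG generalizes across short- and long-form generation tasks.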

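Proposed method

Design a lightweight retrieval evaluator (based on T5-large) that scores the relevance of each retrieved document to the given query.
Define a confidence-based action policy with three actions (Correct, Incorrect, Ambiguous), triggered by upper and lower thresholds.
When Correct is triggered, refine the knowledge by decomposing documents into knowledge strips, filtering them, and recombining the relevant parts.
When Incorrect is triggered, discard the retrieved results and run a web search to obtain external knowledge.
When Ambiguous is triggered, combine internal refinement with external web search.
Integrate a web search module that uses queries rewritten into keywords, then apply the same refinement procedure to the external knowledge to extract relevant content.
Keep CRAG plug-and-play and compatible with both RAG and Self-RAG frameworks.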
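
The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several things are assumptions: the threshold values (0.5 and -0.5), the aggregation rule (Correct when at least one document scores above the upper threshold, Incorrect when all fall below the lower one), the sentence-level strip splitting, and every function name. `toy_score` is a word-overlap stand-in for the learned T5-large evaluator, and keyword query rewriting is omitted.

```python
import re
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]    # (query, text) -> relevance in [-1, 1]
WebSearch = Callable[[str], List[str]]  # query -> list of page texts


def toy_score(query: str, text: str) -> float:
    """Toy stand-in for the learned evaluator (NOT the T5-large model):
    fraction of query words present in the text, rescaled to [-1, 1]."""
    q = set(re.findall(r"\w+", query.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    if not q:
        return -1.0
    return 2.0 * len(q & t) / len(q) - 1.0


def choose_action(scores: List[float], upper: float = 0.5,
                  lower: float = -0.5) -> str:
    """Map per-document scores to an action (thresholds are hypothetical)."""
    if any(s > upper for s in scores):
        return "Correct"
    # An empty score list also lands here, so "no documents" means web search.
    if all(s < lower for s in scores):
        return "Incorrect"
    return "Ambiguous"


def refine(query: str, document: str, score: Scorer,
           keep_above: float = 0.0) -> str:
    """Decompose into strips (here: sentences), filter, then recompose."""
    strips = re.split(r"(?<=[.!?])\s+", document.strip())
    kept = [s for s in strips if s and score(query, s) > keep_above]
    return " ".join(kept)


def build_context(query: str, docs: List[str], score: Scorer,
                  web_search: WebSearch) -> Tuple[str, List[str]]:
    """Pick an action, then assemble the knowledge passed to the generator."""
    action = choose_action([score(query, d) for d in docs])
    context: List[str] = []
    if action in ("Correct", "Ambiguous"):    # internal knowledge, refined
        context += [refine(query, d, score) for d in docs]
    if action in ("Incorrect", "Ambiguous"):  # external knowledge, refined
        context += [refine(query, p, score) for p in web_search(query)]
    return action, [c for c in context if c]  # drop fully filtered texts
```

To use the sketch, pass any callable as `web_search`, for example `lambda q: ["France's capital is Paris."]`. A real system would put a search API and a trained evaluator behind the same two interfaces, which is one way to read the plug-and-play claim.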

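Experimental results

Research questions

RQ1: How can retrieved documents be evaluated for relevance and reliability in a RAG setting?
RQ2: Can a lightweight retrieval evaluator effectively trigger corrective actions that improve generation quality?
RQ3: Does incorporating web-scale external knowledge through web search improve robustness when a static corpus falls short?
RQ4: How do the effects of knowledge refinement and action triggering differ between short-form and long-form generation tasks?
RQ5: Can CRAG be ported to different RAG-based approaches without additional instruction tuning?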

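Key findings

When integrated with standard RAG and Self-RAG, CRAG significantly improves performance across all four datasets, which cover short- and long-form generation.
CRAG requires no additional annotation for the evaluator and shows its adaptability as a plug-and-play module that strengthens both RAG and Self-RAG.
The lightweight T5-based retrieval evaluator outperforms a ChatGPT-based alternative at assessing the quality of retrieved documents for a given query.
Ablation studies show that removing any single action or any core knowledge-utilization operation degrades performance, underscoring the contributions of refinement, rewriting, and external knowledge selection.
CRAG improves robustness to changes in retrieval quality: as retrieval quality degrades, Self-CRAG proves more resilient than Self-RAG.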


This review was generated by AI and checked by a human editor.