QUICK REVIEW

[Paper Review] A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

Wenqi Fan, Yujuan Ding|arXiv (Cornell University)|May 10, 2024

Natural Language Processing Techniques|Citations: 14

One-Sentence Summary

This survey comprehensively reviews Retrieval-Augmented Generation (RAG) techniques in the context of Large Language Models (LLMs), focusing on RA-LLM architectures, training strategies, and applications.

ABSTRACT

As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge, providing huge convenience for numerous tasks. Particularly in the era of AI-Generated Content (AIGC), the powerful capacity of retrieval in providing additional knowledge enables RAG to assist existing generative AI in producing high-quality outputs. Recently, Large Language Models (LLMs) have demonstrated revolutionary abilities in language understanding and generation, while still facing inherent limitations, such as hallucinations and out-of-date internal knowledge. Given the powerful abilities of RAG in providing the latest and helpful auxiliary information, Retrieval-Augmented Large Language Models (RA-LLMs) have emerged to harness external and authoritative knowledge bases, rather than solely relying on the model's internal knowledge, to augment the generation quality of LLMs. In this survey, we comprehensively review existing research studies in RA-LLMs, covering three primary technical perspectives: architectures, training strategies, and applications. As the preliminary knowledge, we briefly introduce the foundations and recent advances of LLMs. Then, to illustrate the practical significance of RAG for LLMs, we systematically review mainstream relevant work by their architectures, training strategies, and application areas, detailing specifically the challenges of each and the corresponding capabilities of RA-LLMs. Finally, to deliver deeper insights, we discuss current limitations and several promising directions for future research. Updated information about this survey can be found at https://advanced-recommender-systems.github.io/RAG-Meets-LLMs/

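Motivation and Objectives

Summarize the foundations and recent advances of LLMs and prompting to set the context for RA-LLMs.
Systematically categorize RA-LLMs by architecture, training paradigm, and application, highlighting the design trade-offs.
Discuss the retrieval, generation, and augmentation components of RA-LLMs and their interdependencies.
Identify current limitations and propose future research directions for RA-LLMs.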

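Proposed Method

Reviews the main RA-LLM components (retrieval, generation, and augmentation) and the decision points in pre-retrieval and post-retrieval processing.
Compares sparse and dense retrieval, and discusses encoder-decoder versus decoder-only generators in the RAG setting.
Analyzes input-layer, output-layer, and intermediate-layer augmentation strategies and whether each applies to white-box and black-box generators.
Surveys retrieval data sources (open vs. closed, Wikipedia, internet search) and their effect on RA-LLM performance.
Synthesizes the pre-training, fine-tuning, in-context learning, and prompting techniques relevant to RA-LLMs.
Summarizes the application areas where RA-LLMs are considered effective, along with their challenges (OpenQA, knowledge-intensive tasks, AIGC).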
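
The sparse versus dense comparison above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the survey: the three-sentence corpus `DOCS`, the TF-IDF-weighted term overlap used as a simplified stand-in for BM25, and `toy_embed`, a hashed character-trigram vector standing in for a learned neural encoder.

```python
import math
import zlib
from collections import Counter

# Hypothetical toy corpus, not from the survey.
DOCS = [
    "RAG retrieves external documents to ground LLM generation.",
    "Dense retrievers embed queries and passages into vectors.",
    "Sparse retrievers such as BM25 match exact query terms.",
]

def tokenize(text):
    """Lowercase whitespace tokenizer that strips trailing punctuation."""
    return [t.strip(".,").lower() for t in text.split()]

def sparse_scores(query, docs):
    """Sparse retrieval: TF-IDF-weighted exact term overlap (simplified, not full BM25)."""
    doc_counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    scores = []
    for counts in doc_counts:
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in doc_counts if term in c)  # document frequency
            if df:
                score += counts[term] * math.log(1 + n / df)
        scores.append(score)
    return scores

def toy_embed(text, dim=64):
    """Assumed stand-in for a learned encoder: hashed character trigrams, L2-normalized."""
    vec = [0.0] * dim
    s = " ".join(tokenize(text))
    for i in range(len(s) - 2):
        vec[zlib.crc32(s[i:i + 3].encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def dense_scores(query, docs):
    """Dense retrieval: inner product between query and document vectors."""
    q = toy_embed(query)
    return [sum(a * b for a, b in zip(q, toy_embed(d))) for d in docs]

def top1(scores):
    """Index of the highest-scoring document."""
    return max(range(len(scores)), key=scores.__getitem__)

query = "retrieval"
print("sparse:", sparse_scores(query, DOCS))
print("dense: ", dense_scores(query, DOCS))
```

With the query "retrieval", no document contains that exact token, so every sparse score is zero, while the vector-based scores stay positive because "retrieves" and "retrievers" share character trigrams with the query. A real dense retriever would get this robustness from a trained encoder rather than from hashing, which is where the training cost in the trade-off comes from.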

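Experimental Results

Research Questions

RQ1: What architectures and training paradigms define current RA-LLMs?
RQ2: How do retrieval type, granularity, and augmentation strategy affect RA-LLM performance?
RQ3: What are the main application areas and limitations of RA-LLMs, and what are the future directions?
RQ4: How does the choice of data source (open vs. closed, internet search) affect RA-LLM capabilities?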

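Key Findings

RA-LLMs combine retrieval with generation to supply LLMs with up-to-date, reliable, domain-relevant knowledge.
Dense and sparse retrieval trade off flexibility, training requirements, and performance; task-specific fine-tuning often yields the best results.
Different retrieval granularities (document, passage, token, entity) serve different needs for efficiency and knowledge fidelity.
Pre-retrieval and post-retrieval enhancements (query expansion, HyDE, R2G, PRCA, blend filtering) improve retrieval quality and robustness.
Input-layer, output-layer, and intermediate-layer augmentation strategies allow retrieved knowledge to be integrated into both white-box and black-box generators.
Internet-search-based retrieval extends RA-LLMs beyond static corpora, giving them access to up-to-date knowledge and broader coverage.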
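
The findings on pre-retrieval and post-retrieval enhancement, passage-level granularity, and input-layer augmentation can be tied together in one small sketch. It is only an illustration under stated assumptions: the `SYNONYMS` table is a hypothetical stand-in for an LLM-based query expander, the substring-match scorer and the `min_score` threshold are arbitrary simplifications, and none of HyDE, R2G, or PRCA is implemented here. The generator is passed in as a plain callable to show that input-layer augmentation only changes the prompt, which is why it also works with black-box generators.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical synonym table standing in for a learned or LLM-based query expander.
SYNONYMS: Dict[str, List[str]] = {"rag": ["retrieval-augmented", "retrieval"]}

def expand_query(query: str) -> List[str]:
    """Pre-retrieval step: add known synonyms to the query terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term, []))
    return expanded

def split_passages(document: str, size: int = 12) -> List[str]:
    """Passage-level granularity: cut a document into fixed-size word windows."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(terms: List[str], passages: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score each passage by the fraction of query terms it contains (crude substring match)."""
    scored = []
    for passage in passages:
        text = passage.lower()
        hits = sum(1 for term in terms if term in text)
        scored.append((hits / max(len(terms), 1), passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def filter_passages(scored: List[Tuple[float, str]], min_score: float = 0.2) -> List[str]:
    """Post-retrieval step: drop weakly matching passages before they reach the prompt."""
    return [passage for score, passage in scored if score >= min_score]

def build_prompt(question: str, passages: List[str]) -> str:
    """Input-layer augmentation: prepend the retrieved passages to the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str, corpus: List[str], generate: Callable[[str], str]) -> str:
    """Run expand -> retrieve -> filter -> augment, then call the (black-box) generator."""
    passages = [p for doc in corpus for p in split_passages(doc)]
    kept = filter_passages(retrieve(expand_query(question), passages))
    return generate(build_prompt(question, kept))
```

Passing an identity function as `generate` (for example `lambda prompt: prompt`) returns the assembled prompt, which is a convenient way to inspect what a real model would receive. Output-layer and intermediate-layer augmentation are not shown; the intermediate-layer variant in particular needs access to model internals and so does not fit this black-box interface.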

This review was generated by AI and checked by a human editor.