QUICK REVIEW

[Paper Review] REPLUG: Retrieval-Augmented Black-Box Language Models

Weijia Shi, Sewon Min|arXiv (Cornell University)|2023. 01. 30.

Topic Modeling | Citations: 68

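One-Line Summary

RePlug treats the language model as a black box and augments it with a tunable retriever, ensembling predictions over the retrieved documents; the LM itself can supervise the retriever to improve retrieval quality. This improves large LMs on both language modeling and downstream tasks.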

ABSTRACT

We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model. Unlike prior retrieval-augmented LMs that train language models with special cross attention mechanisms to encode the retrieved text, REPLUG simply prepends retrieved documents to the input for the frozen black-box LM. This simple design can be easily applied to any existing retrieval and language models. Furthermore, we show that the LM can be used to supervise the retrieval model, which can then find documents that help the LM make better predictions. Our experiments demonstrate that REPLUG with the tuned retriever significantly improves the performance of GPT-3 (175B) on language modeling by 6.3%, as well as the performance of Codex on five-shot MMLU by 5.1%.

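Motivation and Objectives

Motivates retrieval augmentation for very large black-box LMs that cannot be fine-tuned and whose internal representations are inaccessible.
Proposes a plug-and-play retrieval module that prepends retrieved documents to the input and ensembles the resulting predictions, improving them without modifying the LM.
Introduces RePlug LSR, which tunes the retriever using supervision signals from the LM, with the goal of reducing perplexity.
Demonstrates improvements across a range of black-box LMs and tasks, including language modeling, MMLU, and open-domain QA.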

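Proposed Method

Given an input x, retrieves the top-k documents from a corpus with a dual-encoder dense retriever that scores documents by cosine similarity.
Prepends each retrieved document to x, passes each (d, x) pair independently through the frozen black-box LM, and ensembles the output probabilities.
Proposes an ensemble scheme that combines predictions from multiple retrieved documents without updating any LM parameters.
Introduces RePlug LSR, which keeps the LM frozen and uses LM perplexity as the supervision signal, training the retriever to minimize the KL divergence between the retrieval likelihood and the LM likelihood.
Uses asynchronous datastore updates: document embeddings are recomputed and the FAISS index is rebuilt every T steps so that the index stays aligned with the updated retriever.
Demonstrates applicability to a variety of LMs and retrieval models, including GPT-3, Codex, OPT, and BLOOM.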
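The ensembling and LSR steps above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names, the toy vectors, and the temperature parameters `gamma` and `beta` are assumptions made for illustration, as are the choice of a softmax over retrieval scores for the ensemble weights and the KL direction (retrieval distribution against LM-derived distribution), which reflect one reading of the method. In a real setup the per-document next-token distributions would come from calls to the frozen LM and the scores from a dense retriever backed by a FAISS index.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ensemble_next_token_probs(per_doc_probs, retrieval_scores):
    """Mix per-document next-token distributions.

    per_doc_probs[i] stands in for the frozen LM's next-token distribution
    when document i is prepended to the input (hypothetical toy data here).
    The mixture weights are a softmax over the retrieval scores.
    """
    weights = softmax(retrieval_scores)
    vocab_size = len(per_doc_probs[0])
    return [
        sum(w * probs[t] for w, probs in zip(weights, per_doc_probs))
        for t in range(vocab_size)
    ]


def lsr_kl_loss(retrieval_scores, lm_scores, gamma=1.0, beta=1.0):
    """KL(P_R || Q_LM) for a single query.

    P_R is a softmax over retrieval scores; Q_LM is a softmax over how well
    the frozen LM predicts the ground-truth continuation given each document.
    gamma and beta are assumed temperature hyperparameters. Only the
    retriever would receive gradients; the LM stays frozen.
    """
    p_r = softmax(retrieval_scores, gamma)
    q_lm = softmax(lm_scores, beta)
    return sum(p * math.log(p / q) for p, q in zip(p_r, q_lm) if p > 0)


# Toy usage: two documents, a two-token vocabulary.
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
scores = [cosine(d, query_vec) for d in doc_vecs]
per_doc_probs = [[0.9, 0.1], [0.2, 0.8]]  # made-up LM outputs
mixed = ensemble_next_token_probs(per_doc_probs, scores)
loss = lsr_kl_loss(scores, lm_scores=[0.9, 0.2])
```

Because each (d, x) pair is scored separately, the LM's context window only has to fit one document at a time, and the loss reaches zero when the retriever ranks documents exactly as the LM-derived distribution does.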

Figure 1: Different from previous retrieval-augmented approaches (Borgeaud et al., 2022) that enhance a language model with retrieval by updating the LM’s parameters, RePlug treats the language model as a black box and augments it with a frozen or tunable retriever. This black-box assumption makes

Experimental Results

Research Questions

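RQ1: Can retrieval augmentation improve truly black-box LMs without fine-tuning or internal access?
RQ2: Does prepending each retrieved document separately and ensembling improve next-token prediction more efficiently than concatenating all retrieved documents?
RQ3: Can LM supervision (LSR) effectively adapt the retrieval model to the LM and further improve retrieval quality?
RQ4: Does retrieval augmentation benefit very large LMs on both language modeling and downstream tasks such as MMLU and open-domain QA?
RQ5: How does RePlug behave across different model families and scales?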

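Key Results

RePlug consistently improves a variety of black-box LMs on language modeling and downstream tasks.
GPT-3 175B language modeling improves by up to 6.3% with the tuned retriever (RePlug LSR); on five-shot MMLU, Codex improves by 4.5% (RePlug) and 5.1% (RePlug LSR).
RePlug LSR yields larger gains than RePlug alone (e.g., up to 6.3% on GPT-3 175B, and an average improvement across the models in Table 1 of 7.7% versus 4.7%).
On MMLU, Codex + RePlug improves the humanities, social sciences, STEM, and other categories, for overall gains over Codex of 4.5% (RePlug) and 5.1% (RePlug LSR).
On open-domain QA, Codex + RePlug LSR improves by 12.0% on Natural Questions in the few-shot setting, ahead of Atlas and other baselines.
RePlug improves models of different sizes across model families (GPT-2, OPT, BLOOM), reducing perplexity (e.g., a 6.9% improvement for OPT-125M) and showing broad applicability.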

Figure 2: RePlug at inference (§3). Given an input context, RePlug first retrieves a small set of relevant documents from an external corpus using a retriever (§3.1 Document Retrieval). Then it prepends each document separately to the input context and ensembles output probabilities from differe


This review was generated by AI and reviewed by a human editor.