QUICK REVIEW

[Paper Review] RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation

Fang-Yuan Xu, Weijia Shi | ArXiv.org | 2023. 10. 06.

Topic Modeling | Citations: 10

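One-Line Summary

RECOMP compresses retrieved documents into a concise textual summary (or an empty string when they are irrelevant) and prepends it to the LM input, cutting inference cost while preserving performance on language modeling and open-domain QA. It includes extractive and abstractive compressors trained with end-task signals, which enables selective augmentation.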

ABSTRACT

Retrieving documents and prepending them in-context at inference time improves performance of language models (LMs) on a wide range of tasks. However, these documents, often spanning hundreds of words, make inference substantially more expensive. We propose compressing the retrieved documents into textual summaries prior to in-context integration. This not only reduces the computational costs but also relieves the burden of LMs to identify relevant information in long retrieved documents. We present two compressors -- an extractive compressor which selects useful sentences from retrieved documents and an abstractive compressor which generates summaries by synthesizing information from multiple documents. Both compressors are trained to improve LMs' performance on end tasks when the generated summaries are prepended to the LMs' input, while keeping the summary concise. If the retrieved documents are irrelevant to the input or offer no additional information to LM, our compressor can return an empty string, implementing selective augmentation. We evaluate our approach on language modeling task and open domain question answering task. We achieve a compression rate of as low as 6% with minimal loss in performance for both tasks, significantly outperforming the off-the-shelf summarization models. We show that our compressors trained for one LM can transfer to other LMs on the language modeling task and provide summaries largely faithful to the retrieved documents.

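Motivation and Objectives

Identifies the efficiency problem that retrieval-augmented LMs face when long retrieved documents are prepended to the input.
Proposes two compressors (extractive and abstractive) that produce concise, faithful summaries tailored to the input.
Develops a training scheme that optimizes end-task performance using a black-box LM.
Enables selective augmentation by allowing an empty summary when retrieval adds no value.
Demonstrates that compressors transfer across different LMs, and analyzes their faithfulness and reliance on evidence.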

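Proposed Method

Introduces the RECOMP architecture, which consists of a compressor c_theta and a black-box LM M.
Develops a dual-encoder extractive compressor that selects the top-scoring sentences by their inner product with the input to form a concise summary.
Develops an encoder-decoder abstractive compressor, distilled from an extreme-scale LM, that generates query-focused summaries.
Trains the extractive compressor with a contrastive loss so that prepending the selected sentences maximizes LM performance.
Trains the abstractive compressor by distillation from the extreme-scale teacher LM, applying selective augmentation guided by end-task performance.
For QA, uses a summary of the top 5 sentences, or chooses the number per task, to balance efficiency and effectiveness.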
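
The extractive selection step and its contrastive training objective described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the embeddings are assumed to come from some dual encoder (here they are hand-written toy vectors), the function names are hypothetical, the `min_score` cutoff that yields an empty summary is an assumption made only to illustrate selective augmentation, and the loss is the standard InfoNCE form, which may differ in detail from the paper's objective.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_sentences(
    query_vec: Sequence[float],
    sentence_vecs: List[Sequence[float]],
    sentences: List[str],
    top_k: int = 1,
    min_score: float = 0.0,  # hypothetical cutoff, not taken from the paper
) -> str:
    """Join the top_k highest-scoring sentences, or return "" if none pass the cutoff."""
    scored = [(dot(query_vec, vec), idx) for idx, vec in enumerate(sentence_vecs)]
    kept = [pair for pair in scored if pair[0] >= min_score]
    best = sorted(kept, key=lambda pair: pair[0], reverse=True)[:top_k]
    # Restoring document order is a readability choice of this sketch.
    in_order = sorted(best, key=lambda pair: pair[1])
    return " ".join(sentences[idx] for _, idx in in_order)


def contrastive_loss(
    query_vec: Sequence[float],
    positive_vec: Sequence[float],
    negative_vecs: List[Sequence[float]],
) -> float:
    """InfoNCE-style loss: negative log-softmax of the positive sentence's score."""
    pos = dot(query_vec, positive_vec)
    scores = [pos] + [dot(query_vec, neg) for neg in negative_vecs]
    peak = max(scores)  # subtract the max for numerical stability
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_denominator - pos


# Toy example with made-up 2-dimensional embeddings.
query = [1.0, 0.0]
vecs = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
texts = ["Sentence A.", "Sentence B.", "Sentence C."]

summary = select_sentences(query, vecs, texts, top_k=2)
empty_summary = select_sentences(query, vecs, texts, top_k=2, min_score=1.0)
loss = contrastive_loss(query, vecs[0], [vecs[1]])
```

In this toy setup the summary keeps the two sentences closest to the query, while a cutoff that no sentence reaches produces an empty string, so nothing would be prepended to the LM input. The loss shrinks as the positive sentence's score rises relative to the negatives, which is the behavior a contrastively trained selector relies on.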

Figure 1: An illustration of RECOMP, which compresses retrieved documents into a textual summary before prepending it as input to a language model at inference time. The compressed summary guides the LM to generate the correct answer, while significantly reducing the computation costs required to en

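Experimental Results

Research Questions

RQ1: Can retrieved documents be compressed into concise summaries that, when prepended, preserve or improve end-task performance?
RQ2: Do extractive and abstractive compression strategies offer different efficiency/effectiveness trade-offs on language modeling and open-domain QA?
RQ3: Can a compressor trained for one LM transfer to other LMs without retraining?
RQ4: Does selective augmentation (including empty summaries) mitigate the performance degradation caused by irrelevant retrieved information?
RQ5: How faithful and comprehensive are the abstractive summaries across tasks such as NQ, TriviaQA, and HotpotQA?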

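Key Results

Compared with prepending the full documents, both the extractive and abstractive compressors are reported to improve performance, and in the oracle setting compressing to about 6% of the tokens incurs only a minimal performance loss.
The extractive compressor trained with a contrastive loss substantially outperforms BM25- and Contriever-based baselines, achieving roughly 25% compression with a moderate loss.
The abstractive compressor distilled from an extreme-scale LM achieves the highest compression with generally strong performance; with selective augmentation guided by end-task performance, it prepends a summary for about one third of the examples.
On open-domain QA, results improve over using no retrieval and show the benefit of selective augmentation; on multi-hop HotpotQA, the extractive method often outperforms the abstractive one.
Compressors transfer across LMs (from GPT2 to GPT2-XL and GPT-J, and to some extent to LLaMA-13B), suggesting cross-model applicability.
In the manual faithfulness/comprehensiveness analysis, GPT-3.5 summaries are generally more faithful, while the paper's abstractive compressor is, depending on the dataset, more comprehensive but can be less faithful.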
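
To make the compression figures above concrete, here is a small worked example. It assumes the compression rate is the number of tokens in the prepended summary divided by the number of tokens in the retrieved documents, which is consistent with the "as low as 6%" phrasing in the abstract but is not spelled out in this review. The token counts are invented for illustration and the function name is hypothetical.

```python
def compression_rate(summary_tokens: int, document_tokens: int) -> float:
    """Fraction of the retrieved tokens that remains after compression."""
    if document_tokens <= 0:
        raise ValueError("document_tokens must be positive")
    if summary_tokens < 0:
        raise ValueError("summary_tokens must be non-negative")
    return summary_tokens / document_tokens


# Invented numbers: 500 retrieved tokens compressed into a 30-token summary.
rate = compression_rate(30, 500)

# An empty summary (selective augmentation) prepends nothing at all.
empty_rate = compression_rate(0, 500)

print(f"compression rate: {rate:.0%}")
print(f"empty summary:    {empty_rate:.0%}")
```

Under this definition a lower rate means a shorter prompt, and an empty summary gives a rate of zero for that example, which is why selective augmentation pulls the average rate down.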

Figure 4: Histogram of abstractive summary length (# tokens) distribution.


This review was generated by AI and reviewed by a human editor.