QUICK REVIEW

[Paper Review] KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering

Donghan Yu, Chenguang Zhu|arXiv (Cornell University)|Oct 8, 2021

Topic Modeling | Citations: 26

One-Line Summary

KG-FiD uses a knowledge graph and two-stage graph neural network reranking to filter retrieved passages, matching or exceeding FiD's QA accuracy at about 40% of FiD's computation cost.

ABSTRACT

Current Open-Domain Question Answering (ODQA) model paradigm often contains a retrieving module and a reading module. Given an input question, the reading module predicts the answer from the relevant passages which are retrieved by the retriever. The recent proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves the state-of-the-art performance in the reading module. Although being effective, it remains constrained by inefficient attention on all retrieved passages which contain a lot of noise. In this work, we propose a novel method KG-FiD, which filters noisy passages by leveraging the structural relationship among the retrieved passages with a knowledge graph. We initiate the passage node embedding from the FiD encoder and then use graph neural network (GNN) to update the representation for reranking. To improve the efficiency, we build the GNN on top of the intermediate layer output of the FiD encoder and only pass a few top reranked passages into the higher layers of encoder and decoder for answer generation. We also apply the proposed GNN based reranking method to enhance the passage retrieval results in the retrieving module. Extensive experiments on common ODQA benchmark datasets (Natural Question and TriviaQA) demonstrate that KG-FiD can improve vanilla FiD by up to 1.5% on answer exact match score and achieve comparable performance with FiD with only 40% of computation cost.

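Motivation and Objectives

Improve open-domain QA by exploiting the structural relationships among retrieved passages.
Use an external knowledge graph to model the relationships between passages and inform reranking.
Reduce the computation cost of FiD without sacrificing QA performance.
Make passage selection in the reading module of the ODQA pipeline more efficient.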

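Proposed Method

Map each passage to entities aligned with the KG, then build a passage graph by linking passages whose entities are connected in the KG.
Stage-1 reranking: rerank the large set of retrieved passages with a GNN over the passage graph, with node embeddings initialized from DPR.
Stage-2 reranking and answer generation: jointly rerank a smaller set of passages using embeddings from the reader encoder, then generate the answer with the FiD reader.
In Stage 2, reranking uses intermediate encoder representations, so the remaining encoder layers process only the top-N2 passages, which reduces computation.
Train the reranking stage jointly with the answer-generation loss to balance retrieval quality and answer accuracy.
Quantify the relative efficiency gains over FiD in FLOPs and latency on NQ and TriviaQA.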
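
The graph construction and reranking steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy entities and embeddings, the mean-aggregation update (a simple stand-in for the paper's GNN), and the dot-product scoring against a question vector are all assumptions made for illustration.

```python
from itertools import combinations


def build_passage_graph(passage_entities, kg_edges):
    """Link two passages when any pair of their entities is adjacent in the KG."""
    kg = {frozenset(edge) for edge in kg_edges}
    neighbors = {p: set() for p in passage_entities}
    for p, q in combinations(passage_entities, 2):
        connected = any(
            frozenset((a, b)) in kg
            for a in passage_entities[p]
            for b in passage_entities[q]
        )
        if connected:
            neighbors[p].add(q)
            neighbors[q].add(p)
    return neighbors


def mean_aggregate(embeddings, neighbors):
    """One message-passing step: average each node with its neighbors.

    Assumption: a plain mean stands in for a learned GNN layer.
    """
    updated = {}
    for p, vec in embeddings.items():
        group = [vec] + [embeddings[q] for q in neighbors[p]]
        updated[p] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


def rerank(question_vec, embeddings, neighbors, top_n):
    """Score graph-updated passage vectors against the question and keep top_n.

    Assumption: dot-product scoring is a placeholder for a learned scorer.
    """
    updated = mean_aggregate(embeddings, neighbors)
    scores = {
        p: sum(a * b for a, b in zip(question_vec, vec))
        for p, vec in updated.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy data: three passages, one KG edge.
passage_entities = {"p1": {"Paris"}, "p2": {"France"}, "p3": {"Tokyo"}}
kg_edges = [("Paris", "France")]
embeddings = {"p1": [1.0, 0.0], "p2": [0.8, 0.2], "p3": [0.0, 1.0]}
question = [1.0, 0.0]

graph = build_passage_graph(passage_entities, kg_edges)
top = rerank(question, embeddings, graph, top_n=2)
```

In this toy setup the two passages joined by the KG edge reinforce each other through aggregation, while the isolated passage keeps its original embedding. In the two-stage design described above, the same pattern would run once on retriever embeddings over the large candidate set and again on intermediate reader-encoder embeddings over the smaller set.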

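Experimental Results

Research Questions

RQ1 How can a knowledge graph capture the structural relationships among retrieved passages in ODQA?
RQ2 Can a graph neural network rerank passages effectively, improving answer accuracy while reducing computation?
RQ3 Can QA performance be maintained when Stage-2 reranking uses intermediate encoder representations to lower cost?
RQ4 How does KG-FiD compare with FiD in the trade-off among number of passages, number of encoder layers, and overall efficiency?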

Key Findings

Model	#Parameters	NQ exact match	TriviaQA exact match
T5	11B	36.6	-
GPT-3 (few-shot)	175B	29.9	-
RIDER	626M	48.3	-
RECONSIDER	670M	45.5	61.7
Graph-Retriever	110M	34.7	55.8
Path-Retriever	445M	31.7	-
KAQA	110M	-	66.6
UniK-QA ⋆	990M	54.0 ⋆	64.1 ⋆
REALM	330M	40.4	-
RAG	626M	44.5	56.1
Joint Top-K	440M	49.2	64.8
FiD (base)	440M	48.2	65.0
FiD (large)	990M	51.4	67.6
KG-FiD (base)	443M	49.6	66.7
KG-FiD (large)	994M	53.4	69.8

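KG-FiD matches or exceeds FiD's exact-match scores on Natural Questions and TriviaQA.
KG-based two-stage reranking cuts computation to about 38-40% of FiD's in the large-model setting while maintaining or improving performance.
Stage-1 reranking improves gold-passage recall when DPR embeddings are used as the initial node features.
Stage-2 reranking uses reader-derived embeddings and provides additional gains; combined, the two stages yield state-of-the-art or competitive results.
The efficiency gain is tunable through L1, the number of encoder layers run before reranking, and N1/N2, the numbers of passages in the reader, giving a favorable accuracy-speed trade-off.
Ablation studies show that both Stage-1 and Stage-2 reranking contribute to the performance gains.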
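
To make the L1 and N1/N2 trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes encoder cost scales linearly with (number of passages) x (number of layers) and ignores the decoder, the GNN, and any other overhead, so it is only an approximation. The function name and the example settings are hypothetical and are not taken from the paper's reported configurations.

```python
def relative_encoder_cost(n1, n2, total_layers, l1):
    """Estimate two-stage encoder cost relative to a full pass.

    Full pass: all n1 passages go through all total_layers layers.
    Two-stage: n1 passages go through the first l1 layers, then only
    n2 passages go through the remaining layers.
    Assumption: cost is proportional to passages * layers.
    """
    if n1 <= 0 or total_layers <= 0:
        raise ValueError("n1 and total_layers must be positive")
    if not (0 <= l1 <= total_layers and 0 <= n2 <= n1):
        raise ValueError("need 0 <= l1 <= total_layers and 0 <= n2 <= n1")
    full = n1 * total_layers
    reduced = n1 * l1 + n2 * (total_layers - l1)
    return reduced / full


# Illustrative settings only (not the paper's reported configuration).
ratio = relative_encoder_cost(n1=100, n2=20, total_layers=24, l1=6)
no_pruning = relative_encoder_cost(n1=100, n2=100, total_layers=24, l1=6)
```

Under this simplified model, lowering L1 or N2 shrinks the ratio, while keeping all passages (N2 equal to N1) recovers the full cost. The measured FLOPs and latency depend on terms this estimate leaves out.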

This review was generated by AI and checked by a human editor.