QUICK REVIEW

[Paper Review] UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph

Jinhao Jiang, Kun Zhou | arXiv (Cornell University) | 2022-12-02

Topic Modeling · Citations: 28

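One-Line Summary

UniKGQA is a unified model that handles both retrieval and reasoning for multi-hop KGQA. It combines a PLM-based semantic matching module with a matching information propagation module, and adds abstract subgraphs for retrieval and a shared pre-training task.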

ABSTRACT

Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG). To cope with the vast search space, existing work usually adopts a two-stage approach: it first retrieves a relatively small subgraph related to the question and then performs the reasoning on the subgraph to find the answer entities accurately. Although these two stages are highly related, previous work employs very different technical solutions for developing the retrieval and reasoning models, neglecting their relatedness in task essence. In this paper, we propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning. For model architecture, UniKGQA consists of a semantic matching module based on a pre-trained language model (PLM) for question-relation semantic matching, and a matching information propagation module to propagate the matching information along the directed edges on KGs. For parameter learning, we design a shared pre-training task based on question-relation matching for both retrieval and reasoning models, and then propose retrieval- and reasoning-oriented fine-tuning strategies. Compared with previous studies, our approach is more unified, tightly relating the retrieval and reasoning stages. Extensive experiments on three benchmark datasets have demonstrated the effectiveness of our method on the multi-hop KGQA task. Our codes and data are publicly available at https://github.com/RUCAIBox/UniKGQA.

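Motivation and Objectives

Identify and address the inefficiency of treating retrieval and reasoning as separate stages in multi-hop KGQA.
Propose a unified architecture that shares parameters and signals between retrieval and reasoning.
Introduce abstract subgraphs to reduce the difference in graph scale between the two stages.
Design pre-training and fine-tuning strategies that transfer knowledge across the stages.
Demonstrate effectiveness on benchmark datasets and analyze retrieval quality and the impact of the training strategies.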

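Proposed Method

A two-module architecture: semantic matching (SM), which uses a PLM to score question-relation relevance, and matching information propagation (MIP), which propagates the SM signals along KG edges.
Abstract subgraph for retrieval: the tail entities of triples that share the same head-relation prefix are merged into a single node, which reduces the number of nodes.
A shared pre-training task, question-relation matching, trained with contrastive learning to align questions with their relevant relations; relations on the shortest paths between topic and answer entities are treated as positives.
Two-stage fine-tuning: retrieval on abstract subgraphs (RAS) minimizes the KL divergence against the ground-truth signal over abstract nodes; reasoning on retrieved subgraphs (RRS) is initialized from the retrieval model and fine-tuned with the KL divergence against the ground-truth signal over tail entities.
Unified optimization in which the PLM parameters are shared and can either be frozen or updated at each stage (including the QU and QU,RU variants).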
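
The abstract-subgraph and RAS items above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `abs::head::relation` node naming, the uniform ground-truth distribution over abstract nodes that contain an answer, and the KL direction (target against prediction) are all assumptions made for illustration, and the example covers only one hop from a single topic entity.

```python
import math
from collections import defaultdict


def build_abstract_subgraph(triples):
    """Merge tails that share a (head, relation) prefix into one abstract node."""
    groups = defaultdict(set)
    for head, relation, tail in triples:
        groups[(head, relation)].add(tail)

    abstract_triples = []
    members = {}  # abstract node id -> set of original tail entities
    for (head, relation), tails in groups.items():
        node = f"abs::{head}::{relation}"  # hypothetical naming scheme
        members[node] = tails
        abstract_triples.append((head, relation, node))
    return abstract_triples, members


def target_distribution(members, answers):
    """Assumed ground truth: uniform mass on abstract nodes containing an answer."""
    hits = [node for node, tails in members.items() if tails & answers]
    return {node: (1.0 / len(hits) if node in hits else 0.0) for node in members}


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) over the same set of nodes."""
    return sum(
        p * math.log(p / max(predicted[node], eps))
        for node, p in target.items()
        if p > 0.0
    )


triples = [
    ("Obama", "child", "Malia"),
    ("Obama", "child", "Sasha"),
    ("Obama", "spouse", "Michelle"),
]
abstract_triples, members = build_abstract_subgraph(triples)
target = target_distribution(members, answers={"Malia"})
predicted = {node: 1.0 / len(members) for node in members}  # untrained, uniform scores
loss = kl_divergence(target, predicted)
```

In this toy graph the three original tail nodes collapse into two abstract nodes, so the retriever scores fewer candidates; in a trained model the `predicted` scores would come from the SM and MIP modules rather than a uniform placeholder.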

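Experimental Results

Research Questions

RQ1: Can a single model architecture improve both retrieval and reasoning in multi-hop KGQA compared with approaches that use separate stages?
RQ2: Does sharing parameters and passing relevance information between retrieval and reasoning improve overall QA performance?
RQ3: Do abstract subgraphs effectively bridge the scale gap between the retrieval and reasoning stages without sacrificing accuracy?
RQ4: Do pre-training on question-relation matching and subsequent fine-tuning lead to efficient and effective learning in both stages?

Main Results

Model	WebQSP Hits@1	WebQSP F1	CWQ Hits@1	CWQ F1	MetaQA-1 Hits@1	MetaQA-2 Hits@1	MetaQA-3 Hits@1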
KV-Mem	46.7	34.5	18.4	15.7	96.2	82.7	48.9
GraftNet	66.4	60.4	36.8	32.7	97.0	94.8	77.7
PullNet	68.1	-	45.9	-	97.0	99.9	91.4
EmbedKGQA	66.6	-	-	-	97.5	98.8	94.8
NSM	68.7	62.8	47.6	42.4	97.1	99.9	98.9
TransferNet	71.4	-	48.6	-	97.5	100	100
SR+NSM	68.9	64.1	50.2	47.1	-	-	-
SR+NSM+E2E	69.5	64.1	49.3	46.3	-	-	-
UniKGQA	75.1	70.2	50.7	48.0	97.5	99.0	99.1
w QU	77.0	71.0	50.9	49.4	97.6	99.9	99.5
w QU,RU	77.2	72.2	51.2	49.0	98.0	99.9	99.9

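UniKGQA outperforms the baselines on WebQSP and CWQ. The Hits@1 gain is largest on WebQSP (75.1 vs. 71.4 for TransferNet) and smaller on CWQ (50.7 vs. 50.2 for SR+NSM).
With subgraphs of comparable size, the learned retriever achieves higher answer coverage than heuristic methods.
Ablation studies confirm that both training strategies (pre-training and initialization transfer) are beneficial.
Updating the PLM encoder for questions only performs comparably to, or better than, updating it for both questions and relations, while being more efficient.
The unified architecture enables effective transfer of relevance information from retrieval to reasoning, which improves the final QA metrics.
The two variants (w QU and w QU,RU) both perform strongly, with different computational trade-offs.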
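
For readers unfamiliar with the two metrics reported in the table, the following is a minimal sketch of how Hits@1 and F1 are commonly computed for KGQA. The function names are hypothetical, and the exact evaluation script used in the paper may differ (for example in how the predicted answer set is thresholded before computing F1).

```python
def hits_at_1(ranked_predictions, gold_answers):
    """1.0 if the top-ranked predicted entity is a gold answer, else 0.0."""
    if ranked_predictions and ranked_predictions[0] in gold_answers:
        return 1.0
    return 0.0


def f1(predicted_answers, gold_answers):
    """Set-based F1 between predicted and gold answer entities."""
    predicted, gold = set(predicted_answers), set(gold_answers)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def evaluate(examples):
    """Average both metrics over (ranked, predicted_set, gold) triples, in percent."""
    n = len(examples)
    mean_hits = sum(hits_at_1(ranked, gold) for ranked, _, gold in examples) / n
    mean_f1 = sum(f1(pred, gold) for _, pred, gold in examples) / n
    return 100 * mean_hits, 100 * mean_f1


examples = [
    (["Malia", "Sasha"], {"Malia", "Sasha"}, {"Malia", "Sasha"}),
    (["Paris", "Lyon"], {"Paris", "Lyon"}, {"Lyon"}),
]
hits_pct, f1_pct = evaluate(examples)
```

The second toy example shows why the two columns can diverge: the top-ranked entity is wrong, so Hits@1 is 0, yet the predicted set still contains the gold answer and earns partial F1.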


This review was generated by AI and reviewed by a human editor.