QUICK REVIEW

[Paper Review] Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Yixin Nie, Haonan Chen|arXiv (Cornell University)|Nov 16, 2018

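Topic: Topic Modeling|References: 23|Cited by: 29

One-Sentence Summary

This paper proposes a unified neural semantic matching network (NSMN) framework that jointly performs document retrieval, sentence selection, and claim verification on the FEVER benchmark. By relying on deep semantic matching with no intermediate term representation, and by incorporating pageview frequency, WordNet features, and relatedness scores passed between modules, the model reaches a FEVER score of 64.23 on the blind test set, clearly outperforming prior methods and setting a new state of the art.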

ABSTRACT

The increasing concern with misinformation has stimulated research efforts on automatic fact checking. The recently-released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. In this paper, we present a connected system consisting of three homogeneous neural semantic matching models that conduct document retrieval, sentence selection, and claim verification jointly for fact extraction and verification. For evidence retrieval (document retrieval and sentence selection), unlike traditional vector space IR models in which queries and sources are matched in some pre-designed term vector space, we develop neural models to perform deep semantic matching from raw textual input, assuming no intermediate term representation and no access to structured external knowledge bases. We also show that Pageview frequency can also help improve the performance of evidence retrieval results, that later can be matched by using our neural semantic matching network. For claim verification, unlike previous approaches that simply feed upstream retrieved evidence and the claim to a natural language inference (NLI) model, we further enhance the NLI model by providing it with internal semantic relatedness scores (hence integrating it with the evidence retrieval modules) and ontological WordNet features. Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models, by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve the state-of-the-art results on the FEVER test set.

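Motivation and Objectives

Develop an end-to-end system for automatic fact checking in response to the growing challenge of misinformation.
Improve evidence retrieval and claim verification by replacing traditional TF-IDF and vector space models with deep neural semantic matching networks.
Improve claim verification by incorporating semantic relatedness scores from the upstream retrieval modules and ontological features from WordNet.
Cast the three stages (document retrieval, sentence selection, and claim verification) as semantic matching problems handled by one consistent neural architecture.
Reach state-of-the-art performance on the FEVER benchmark without relying on structured knowledge bases such as Freebase or DBpedia.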

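Proposed Method

The system uses three homogeneous neural semantic matching networks (dNSMN, sNSMN, and vNSMN) for document retrieval, sentence selection, and claim verification, respectively.
dNSMN retrieves documents from raw text input, learning deep semantic representations without term vectorization, and outperforms TF-IDF and encoder models.
Pageview frequency serves as a supplementary signal for ranking documents, which improves retrieval performance.
sNSMN is trained with annealed sampling and matches sentences to the claim by semantic similarity; its relatedness scores are fed to the verifier.
vNSMN integrates the sentence selector's semantic relatedness scores and WordNet features (such as antonyms, hypernyms, and hyponyms) into a neural NLI model to better detect entailment and contradiction.
The entire pipeline is trained end to end, with the stages sharing architecture and components for consistency and joint optimization.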
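
The three-stage flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the token-overlap scorer stands in for a trained NSMN, the verifier is a placeholder rule rather than an NLI model, and every name here (`token_overlap`, `retrieve_documents`, `select_sentences`, `verify`, `run_pipeline`), the 0.5 threshold, and the use of pageviews as a tie-breaker are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer signature: (claim, candidate text) -> relatedness in [0, 1].
Scorer = Callable[[str, str], float]


@dataclass
class Evidence:
    doc_id: str
    sentence: str
    relatedness: float  # selector score, passed downstream to the verifier


def token_overlap(claim: str, text: str) -> float:
    """Toy stand-in for a trained matching network: share of claim tokens found in text."""
    claim_tokens = set(claim.lower().split())
    text_tokens = set(text.lower().split())
    return len(claim_tokens & text_tokens) / max(len(claim_tokens), 1)


def retrieve_documents(claim: str, docs: Dict[str, List[str]],
                       pageviews: Dict[str, int], score: Scorer,
                       k: int = 1) -> List[str]:
    """Stage 1: rank documents by match score; pageviews break ties (an assumption)."""
    def key(doc_id: str) -> Tuple[float, int]:
        return (score(claim, " ".join(docs[doc_id])), pageviews.get(doc_id, 0))
    return sorted(docs, key=key, reverse=True)[:k]


def select_sentences(claim: str, doc_ids: List[str], docs: Dict[str, List[str]],
                     score: Scorer, threshold: float = 0.5,
                     max_sentences: int = 5) -> List[Evidence]:
    """Stage 2: keep sentences whose relatedness clears the filtering threshold."""
    scored = [Evidence(d, s, score(claim, s)) for d in doc_ids for s in docs[d]]
    kept = [e for e in scored if e.relatedness >= threshold]
    return sorted(kept, key=lambda e: e.relatedness, reverse=True)[:max_sentences]


def verify(claim: str, evidence: List[Evidence]) -> str:
    """Stage 3: placeholder verifier. A real one would be an NLI model
    that also reads each evidence item's relatedness score."""
    if not evidence:
        return "NOT ENOUGH INFO"
    return "SUPPORTS"  # this toy rule cannot detect refutation


def run_pipeline(claim: str, docs: Dict[str, List[str]],
                 pageviews: Dict[str, int],
                 score: Scorer = token_overlap) -> Tuple[str, List[Evidence]]:
    doc_ids = retrieve_documents(claim, docs, pageviews, score)
    evidence = select_sentences(claim, doc_ids, docs, score)
    return verify(claim, evidence), evidence


if __name__ == "__main__":
    toy_docs = {
        "Paris": ["Paris is the capital of France .", "Paris hosts the Louvre ."],
        "Berlin": ["Berlin is the capital of Germany ."],
    }
    toy_pageviews = {"Paris": 1000, "Berlin": 800}
    print(run_pipeline("Paris is the capital of France .", toy_docs, toy_pageviews))
```

The point of the sketch is the data flow: each stage reuses the same scoring interface, and the sentence-level relatedness score travels with the evidence into the verification stage instead of being discarded after selection.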

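Experimental Results

Research Questions

RQ1: Can neural semantic matching networks outperform traditional TF-IDF and encoder-based IR models at evidence retrieval without any intermediate term representation?
RQ2: Does incorporating pageview frequency improve document retrieval for fact verification?
RQ3: Does integrating semantic relatedness scores from the upstream retrieval modules improve downstream claim verification?
RQ4: To what extent do ontological WordNet features improve the robustness and accuracy of natural language inference in fact verification?
RQ5: Can a unified neural semantic matching framework jointly optimize document retrieval, sentence selection, and claim verification to achieve state-of-the-art results?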

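Key Findings

The neural semantic matching network (dNSMN) significantly outperforms TF-IDF and encoder models on all evidence retrieval metrics, demonstrating the advantage of end-to-end deep semantic matching.
Incorporating pageview frequency supplies comparable and complementary discriminative information, improving document retrieval.
Adding WordNet features raises F1 on 'Supports' and 'Refutes' examples by about 1 point each, thanks to fine-grained semantic relations such as antonymy and hypernymy/hyponymy.
Feeding the sentence selector's relatedness scores into the verifier raises F1 on 'Not Enough Info' examples by nearly 3 points, making the model more reliable in ambiguous cases.
The final model reaches a FEVER score of 64.23 on the blind test set, roughly twice the baseline model's score, setting a new state of the art on the FEVER benchmark.
The model tolerates noise: the FEVER score rises slightly as the evidence filtering threshold is lowered, supporting the effectiveness of annealed sampling for high-recall evidence selection.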
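
The FEVER score quoted in these findings is stricter than plain label accuracy: a prediction counts only when the label is right and, for 'Supports' or 'Refutes' claims, the predicted evidence contains at least one complete gold evidence set. A minimal sketch of that metric follows, assuming evidence is identified by (page title, sentence index) pairs and that only the first five predicted sentences are considered. The `Example` class and function names are hypothetical and do not mirror the official scorer's interface.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

EvidenceItem = Tuple[str, int]  # (Wikipedia page title, sentence index)


@dataclass
class Example:
    pred_label: str
    pred_evidence: List[EvidenceItem]
    gold_label: str
    # Each inner list is one complete gold evidence set; any one of them suffices.
    gold_evidence_sets: List[List[EvidenceItem]] = field(default_factory=list)


def is_strictly_correct(ex: Example, max_evidence: int = 5) -> bool:
    """True when the label matches and the evidence requirement is met."""
    if ex.pred_label != ex.gold_label:
        return False
    if ex.gold_label == "NOT ENOUGH INFO":
        return True  # no evidence is required for this class
    predicted = set(ex.pred_evidence[:max_evidence])
    return any(gold and set(gold) <= predicted for gold in ex.gold_evidence_sets)


def fever_score(examples: List[Example]) -> float:
    """Percentage of examples that are strictly correct."""
    if not examples:
        return 0.0
    return 100.0 * sum(is_strictly_correct(ex) for ex in examples) / len(examples)


def label_accuracy(examples: List[Example]) -> float:
    """Percentage of examples with the right label, ignoring evidence."""
    if not examples:
        return 0.0
    return 100.0 * sum(ex.pred_label == ex.gold_label for ex in examples) / len(examples)


if __name__ == "__main__":
    toy = [
        # Right label, gold set fully covered: counts.
        Example("SUPPORTS", [("A", 0), ("B", 1)], "SUPPORTS", [[("A", 0)]]),
        # Right label, but only half of the two-sentence gold set: does not count.
        Example("REFUTES", [("C", 2)], "REFUTES", [[("C", 2), ("C", 3)]]),
        # Not Enough Info with the right label: counts without evidence.
        Example("NOT ENOUGH INFO", [], "NOT ENOUGH INFO"),
        # Wrong label: does not count even with matching evidence.
        Example("SUPPORTS", [("D", 0)], "REFUTES", [[("D", 0)]]),
    ]
    print(fever_score(toy), label_accuracy(toy))
```

Because missing evidence invalidates an otherwise correct label, keeping more candidate sentences (a lower filtering threshold) can raise this score as long as the verifier tolerates the extra noise, which is consistent with the last finding above.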

This review was generated by AI and reviewed by human editors.