QUICK REVIEW

[Paper Review] Is Retriever Merely an Approximator of Reader?

Sohee Yang, Minjoon Seo|arXiv (Cornell University)|Oct 21, 2020

Topic Modeling | References: 28 | Citations: 26

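One-Sentence Summary

This paper challenges the assumption that the retriever in open-domain question answering is merely a fast but weaker approximation of the reader. By distilling the reader's knowledge into the retriever, it improves both retrieval recall and end-to-end QA accuracy, with the largest gains at top-1, while preserving the retriever's efficiency.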

ABSTRACT

The state of the art in open-domain question answering (QA) relies on an efficient retriever that drastically reduces the search space for the expensive reader. A rather overlooked question in the community is the relationship between the retriever and the reader, and in particular, if the whole purpose of the retriever is just a fast approximation for the reader. Our empirical evidence indicates that the answer is no, and that the reader and the retriever are complementary to each other even in terms of accuracy only. We make a careful conjecture that the architectural constraint of the retriever, which has been originally intended for enabling approximate search, seems to also make the model more robust in large-scale search. We then propose to distill the reader into the retriever so that the retriever absorbs the strength of the reader while keeping its own benefit. Experimental results show that our method can enhance the document recall rate as well as the end-to-end QA accuracy of off-the-shelf retrievers in open-domain QA tasks.

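Research Motivation and Objectives

Investigate whether the retriever in open-domain QA is merely an approximation of the reader or makes its own contribution to accuracy.
Empirically evaluate how the two-tower retriever and the one-tower reader complement each other, testing the common assumption that the two-tower retriever only trades accuracy for efficiency.
Propose a knowledge distillation method that transfers the reader's knowledge to the retriever, improving its performance while preserving speed and scalability.
Show that better retrieval quality translates into higher end-to-end QA accuracy, especially at top-1 retrieval.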

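Proposed Method

Propose a knowledge distillation framework that transfers knowledge from the one-tower reader model to the two-tower retriever model.
Use temperature-based soft-label distillation so that the retriever learns the reader's confidence scores over candidate passages.
Apply distillation while fine-tuning the retriever; a temperature of T=3 gave the best experimental results.
Fine-tune the reader on passages returned by the enhanced retriever to close the gap between the input distributions seen in training and at inference.
Use approximate nearest neighbor (ANN) search for efficient inference, preserving the speed advantage of the two-tower architecture.
Evaluate retrieval with recall@k, and end-to-end QA accuracy with exact match (EM), on the NaturalQuestions and TriviaQA datasets.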
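
The temperature-based soft-label distillation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the loss is a KL divergence between temperature-softened passage distributions of the reader (teacher) and the retriever (student), which is the standard knowledge-distillation formulation. The function names and the example scores are hypothetical; only the temperature T=3 comes from the summary.

```python
import math

def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over a list of raw passage scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(retriever_scores, reader_scores, temperature=3.0):
    """KL(reader || retriever) over temperature-softened distributions.

    Assumption: a plain KL objective; the paper's exact loss may differ.
    """
    teacher = softmax(reader_scores, temperature)
    student = softmax(retriever_scores, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

# Hypothetical scores for four candidate passages of one question.
reader_scores = [4.0, 1.0, 0.5, -2.0]     # teacher: one-tower reader
retriever_scores = [2.0, 2.5, 0.0, -1.0]  # student: two-tower retriever

loss = distillation_loss(retriever_scores, reader_scores)
```

A higher temperature flattens both distributions, so the retriever is also trained on how the reader ranks the lower-scored passages rather than only on its top choice. In actual training the student scores would come from the retriever's question-passage inner products and the loss would be minimized by gradient descent.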

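Experimental Results

Research Questions

RQ1: Is the retriever merely an approximate, efficiency-oriented version of the reader, or does it make its own contribution to accuracy in open-domain QA?
RQ2: Does the two-tower retriever architecture, originally designed for speed, also improve robustness in large-scale retrieval?
RQ3: How much can distilling the reader's knowledge into an off-the-shelf retriever improve its performance without harming efficiency?
RQ4: Does higher retrieval recall translate directly into better end-to-end QA accuracy, especially for top-1 passage retrieval?
RQ5: How does overall QA performance change after the reader is fine-tuned with the enhanced retriever, and what is the effect of the input distribution shift?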

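Main Findings

The retriever is not a simple approximation of the reader; it provides complementary accuracy gains, possibly because it is more robust to negative examples.
Distilling the reader into the retriever raises top-1 recall on NaturalQuestions by 1.8 percentage points with DPR-Single (from 52.4% to 54.2%).
With the enhanced retriever and DPR-Single, end-to-end QA accuracy (EM) at top-1 rises by 5.0 percentage points, from 32.3% to 37.3%.
With the enhanced retriever and RAG-Token, EM on TriviaQA rises by 4.6 percentage points (from 44.5% to 49.1%).
Without fine-tuning the reader, performance drops because of distribution shift, indicating that aligning the retriever and the reader is essential.
Ablations confirm that distillation is essential: without it, recall is consistently lower, most visibly at top-1.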
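
For readers unfamiliar with the two metrics behind these numbers, the sketch below shows one common way to compute recall@k and exact match. It is illustrative only: the normalization follows the widely used SQuAD-style convention (lowercasing, stripping punctuation and articles), which is an assumption here and not confirmed by this summary, and the function names and toy data are hypothetical.

```python
import re
import string

def recall_at_k(ranked_hits, k):
    """Fraction of questions with an answer-bearing passage in the top k.

    ranked_hits: one list per question, holding booleans in retrieval order
    (True means the passage at that rank contains a gold answer).
    """
    return sum(any(hits[:k]) for hits in ranked_hits) / len(ranked_hits)

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    return any(normalize(prediction) == normalize(g) for g in gold_answers)

# Toy retrieval results for three questions, two passages each.
ranked_hits = [[True, False], [False, True], [False, False]]
top1 = recall_at_k(ranked_hits, 1)
top2 = recall_at_k(ranked_hits, 2)

is_correct = exact_match("The Eiffel Tower.", ["Eiffel Tower"])
```

Because top-1 recall counts only the first retrieved passage, it is the setting where a better-ranked retriever has the most direct effect on what the reader sees.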


This review was generated by AI and reviewed by a human editor.