QUICK REVIEW

[Paper Review] A Dual Embedding Space Model for Document Ranking

Bhaskar Mitra, Eric Nalisnick|arXiv (Cornell University)|Feb 2, 2016

Topic Modeling|References: 52|Cited by: 103

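One-Sentence Summary

DESM uses the IN and OUT embedding spaces of a word2vec model to score query-document relevance by aggregating cosine similarities. It improves the top of the ranking when re-ranking a small candidate set, but on larger candidate sets it is prone to false positives, which a linear BM25-DESM mixture mitigates.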

ABSTRACT

A fundamental goal of search engines is to identify, given a query, documents that have relevant text. This is intrinsically difficult because the query and the document may use different vocabulary, or the document may contain query words without being relevant. We investigate neural word embeddings as a source of evidence in document ranking. We train a word2vec embedding model on a large unlabelled query corpus, but in contrast to how the model is commonly used, we retain both the input and the output projections, allowing us to leverage both the embedding spaces to derive richer distributional relationships. During ranking we map the query words into the input space and the document words into the output space, and compute a query-document relevance score by aggregating the cosine similarities across all the query-document word pairs. We postulate that the proposed Dual Embedding Space Model (DESM) captures evidence on whether a document is about a query term in addition to what is modelled by traditional term-frequency based approaches. Our experiments show that the DESM can re-rank top documents returned by a commercial Web search engine, like Bing, better than a term-matching based signal like TF-IDF. However, when ranking a larger set of candidate documents, we find the embeddings-based approach is prone to false positives, retrieving documents that are only loosely related to the query. We demonstrate that this problem can be solved effectively by ranking based on a linear mixture of the DESM and the word counting features.

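Motivation and Objectives

Improve the modeling of query-document aboutness beyond exact term matching.
Propose a dual embedding space representation (IN for queries, OUT for documents) that captures distributional evidence of aboutness.
Define a DESM ranking feature as the average cosine similarity between each query word and the document centroid vector.
Evaluate DESM on large-scale Web retrieval data, both as a re-ranking signal and in combination with BM25.
Analyze the strengths and limitations of DESM, including its tendency to produce false positives and the difference in its behavior between the telescoping and non-telescoping settings.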

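Proposed Method

Train a word2vec CBOW model on a large unlabelled query corpus and keep both sets of embeddings, IN and OUT.
Define the DESM score (DESM_IN-OUT) as the average cosine similarity between each query word (IN) and the document centroid, where the centroid is the mean of the unit-normalized word vectors (OUT) in the document.
Define a variant, DESM_IN-IN, that uses the IN embeddings for both query words and document words.
Precompute the document centroids so that ranking at query time is efficient.
Combine DESM with BM25 through a linear mixture, MM(Q,D) = alpha*DESM(Q,D) + (1-alpha)*BM25(Q,D), with alpha tuned on a held-out set.
Compare the DESM variants and the mixture against BM25 and LSA baselines under both telescoping and non-telescoping evaluation settings.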
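
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`unit`, `doc_centroid`, `desm_score`) and the tiny 3-dimensional `IN_EMB` / `OUT_EMB` tables are hypothetical stand-ins for real word2vec projections, and all vectors are assumed to be non-zero.

```python
import math

def unit(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def doc_centroid(doc_vecs):
    """Mean of the unit-normalized document word vectors.
    It depends only on the document, so it can be precomputed offline."""
    normed = [unit(v) for v in doc_vecs]
    return [sum(col) / len(normed) for col in zip(*normed)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return sum(x * y for x, y in zip(unit(a), unit(b)))

def desm_score(query_vecs, centroid):
    """Average cosine similarity between each query word vector and the centroid."""
    return sum(cosine(q, centroid) for q in query_vecs) / len(query_vecs)

# Hypothetical toy embeddings (not taken from the paper).
IN_EMB = {"albuquerque": [0.9, 0.1, 0.0]}
OUT_EMB = {
    "city": [0.8, 0.2, 0.1],
    "population": [0.7, 0.0, 0.3],
    "giraffe": [0.0, 0.1, 0.9],
}

query_vecs = [IN_EMB[w] for w in ["albuquerque"]]  # query words use IN
score_about = desm_score(
    query_vecs, doc_centroid([OUT_EMB[w] for w in ["city", "population"]])
)
score_off = desm_score(query_vecs, doc_centroid([OUT_EMB[w] for w in ["giraffe"]]))
```

With these toy vectors, the document whose words sit near the query word in the OUT space receives the higher score. Looking document words up in `IN_EMB` instead of `OUT_EMB` would give the DESM_IN-IN variant.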

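Experimental Results

Research Questions

RQ1: Do the dual embedding spaces (IN for queries, OUT or IN for documents) capture query-document aboutness better than traditional term matching?
RQ2: How does DESM perform as a standalone ranking signal compared with its combination with BM25?
RQ3: Is the DESM feature robust to irrelevant or deceptive term stuffing, and under which evaluation settings does it excel or fail?
RQ4: Does training the embeddings on a query corpus rather than a document corpus affect the effectiveness of DESM?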

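Key Findings

In the telescoping evaluation, where DESM re-ranks the top documents returned by a commercial search engine, it can outperform term-matching signals such as TF-IDF.
DESM_IN-OUT (trained on the query corpus) provides a stronger aboutness signal than DESM_IN-IN and BM25 on both the explicitly and implicitly judged test sets.
As a standalone signal, DESM degrades on larger candidate sets because of false positives; mixing it with BM25 mitigates this and achieves the best non-telescoping NDCG.
Within DESM, embeddings trained on query data outperform embeddings trained on document body text.
DESM is most discriminative among the top-ranked documents, but it needs to be combined with traditional features to be robust when ranking at scale.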
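
The false-positive effect and its mitigation can be illustrated with a toy version of the linear mixture MM(Q,D) = alpha*DESM(Q,D) + (1-alpha)*BM25(Q,D). The sketch below is an illustration only: the helper names (`mixture_score`, `rank`), the document ids, and the scores are made up, and BM25 is assumed to have been rescaled to a range comparable with DESM.

```python
def mixture_score(desm, bm25, alpha):
    """MM(Q, D) = alpha * DESM(Q, D) + (1 - alpha) * BM25(Q, D)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * desm + (1.0 - alpha) * bm25

def rank(candidates, alpha):
    """Order (doc_id, desm, bm25) triples by mixture score, best first."""
    ordered = sorted(
        candidates,
        key=lambda c: mixture_score(c[1], c[2], alpha),
        reverse=True,
    )
    return [doc_id for doc_id, _, _ in ordered]

# Hypothetical scores: the second document is only loosely related to the
# query (high embedding similarity, no exact term matches).
candidates = [
    ("on_topic", 0.80, 0.60),
    ("loosely_related", 0.85, 0.00),
]

desm_only = rank(candidates, alpha=1.0)  # DESM alone
mixed = rank(candidates, alpha=0.5)      # equal-weight mixture
```

With these made-up numbers, DESM alone places the loosely related document first, while the mixture restores the on-topic document to the top. In practice alpha would be tuned on held-out queries, as the method section states, rather than fixed at 0.5.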

This review was generated by AI and reviewed by human editors.