QUICK REVIEW

[Paper Review] Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

Hyo Sik Chang, ChangSun Lee|arXiv (Cornell University)|Mar 19, 2026

Topic Modeling | Cited by 0

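One-Sentence Summary

HCQR is a training-free pre-retrieval method that uses a working hypothesis to generate three targeted queries, steering retrieval toward decision-useful evidence and improving medical QA accuracy over the baselines on MedQA and MMLU-Med.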

ABSTRACT

Retrieval-Augmented Generation (RAG) improves Large Language Models (LLMs) by grounding generation in external, non-parametric knowledge. However, when a task requires choosing among competing options, simply grounding generation in broadly relevant context is often insufficient to drive the final decision. Existing RAG methods typically rely on a single initial query, which often favors topical relevance over decision-relevant evidence, and therefore retrieves background information that can fail to discriminate among answer options. To address this issue, here we propose Hypothesis-Conditioned Query Rewriting (HCQR), a training-free pre-retrieval framework that reorients RAG from topic-oriented retrieval to evidence-oriented retrieval. HCQR first derives a lightweight working hypothesis from the input question and candidate options, and then rewrites retrieval into three targeted queries that seek evidence to: (1) support the hypothesis, (2) distinguish it from competing alternatives, and (3) verify salient clues in the question. This approach enables context retrieval that is more directly aligned with answer selection, allowing the generator to confirm or overturn the initial hypothesis based on the retrieved evidence. Experiments on MedQA and MMLU-Med show that HCQR consistently outperforms single-query RAG and re-rank/filter baselines, improving average accuracy over Simple RAG by 5.9 and 3.6 points, respectively. Code is available at https://anonymous.4open.science/r/HCQR-1C2E.

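Motivation and Objectives

Shift retrieval for question answering toward decision-relevant evidence rather than topical relevance alone.
Propose HCQR, which converts a working hypothesis into verification-oriented retrieval queries.
Show that hypothesis-conditioned retrieval improves decision usefulness and accuracy in medical QA.
Quantify how the quality of the retrieved context affects downstream decisions.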

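Proposed Method

Form a concise working hypothesis from the question and the candidate answers.
Rewrite the hypothesis into three targeted queries: SUPPORT (evidence for the hypothesis), DISTINCTION (contrast with competing options), and KEY FEATURES (verification of salient clues in the question).
Retrieve evidence for all three queries with a shared retriever and fuse the results within the final context budget.
Hide the working hypothesis from the final generator, so that it steers retrieval only.
Evaluate under a fixed MIRAGE prompt setting to isolate the effect of the retrieval strategy.
Compare HCQR against No-RAG, Simple RAG, Rewriting, HyDE, Rerank-RAG, and MAIN-RAG on MedQA and MMLU-Med across multiple models.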
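
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper uses an LLM to formulate the hypothesis and rewrite the queries, whereas here the hypothesis is passed in and the queries come from fixed templates. The round-robin fusion rule and the word-overlap retriever are also assumptions, since the summary only says that results are fused within a context budget. All function names (`build_queries`, `fuse`, `hcqr_context`, `generator_prompt`, `make_overlap_retriever`) are hypothetical.

```python
from itertools import zip_longest
from typing import Callable, Dict, List

Retriever = Callable[[str, int], List[str]]


def build_queries(question: str, hypothesis: str, alternatives: List[str]) -> Dict[str, str]:
    """Template stand-in for the LLM query rewriter (wording is hypothetical)."""
    rivals = ", ".join(alternatives)
    return {
        "support": f"evidence supporting {hypothesis} given: {question}",
        "distinction": f"findings that distinguish {hypothesis} from {rivals}",
        "key_features": f"verify the salient clues in: {question}",
    }


def fuse(ranked_lists: List[List[str]], budget: int) -> List[str]:
    """Round-robin interleave with de-duplication, cut to the context budget.

    Assumption: the fusion rule is not specified in the summary.
    """
    fused: List[str] = []
    for tier in zip_longest(*ranked_lists):
        for doc in tier:
            if doc is not None and doc not in fused:
                fused.append(doc)
    return fused[:budget]


def hcqr_context(question: str, options: List[str], hypothesis: str,
                 retriever: Retriever, k: int = 3, budget: int = 5) -> List[str]:
    """Run the three queries through one shared retriever and fuse the results."""
    alternatives = [o for o in options if o != hypothesis]
    queries = build_queries(question, hypothesis, alternatives)
    ranked = [retriever(query, k) for query in queries.values()]
    return fuse(ranked, budget)


def generator_prompt(question: str, options: List[str], context: List[str]) -> str:
    """Prompt for the final generator; it never states which option was the working guess."""
    lines = ["Context:"] + [f"- {doc}" for doc in context]
    lines += ["", f"Question: {question}", "Options: " + "; ".join(options)]
    return "\n".join(lines)


def make_overlap_retriever(corpus: List[str]) -> Retriever:
    """Toy word-overlap retriever so the sketch runs without external services."""
    def retrieve(query: str, k: int) -> List[str]:
        terms = set(query.lower().split())
        ranked = sorted(corpus, key=lambda d: len(terms & set(d.lower().split())), reverse=True)
        return ranked[:k]
    return retrieve
```

In this sketch `generator_prompt` takes only the question, the options, and the fused context, so the generator is free to confirm or overturn the initial guess based on the evidence.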

Figure 1: HCQR pipeline. A hypothesis formulator proposes a lightweight working hypothesis together with the evidence expected under that hypothesis. The query rewriter turns this state into three verification-oriented queries. The working hypothesis steers retrieval but is not shown to the generator.

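Experimental Results

Research Questions

RQ1: Does a hypothesis-conditioned pre-retrieval scheme increase the quantity and quality of decision-useful evidence relative to single-query retrieval?
RQ2: Does three-query retrieval (support, distinction, key features) better align the retrieved context with answer selection in medical QA?
RQ3: How does HCQR affect the distribution of retrieved evidence across the Entailed, Useful, and Not Useful categories, and how does it affect downstream accuracy?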

Key Findings

Method	MedQA (Llama3.2 3B)	MedQA (Llama3.1 8B)	MedQA (Qwen3 4B)	MedQA (Qwen3 30B)	MedQA Avg.	MMLU-Med (Llama3.2 3B)	MMLU-Med (Llama3.1 8B)	MMLU-Med (Qwen3 4B)	MMLU-Med (Qwen3 30B)	MMLU-Med Avg.
CoT	58.0	69.0	72.8	84.1	71.0	66.3	75.0	82.9	90.2	78.6
Simple RAG	55.6	66.8	71.3	82.7	69.1	64.5	74.7	81.8	88.9	77.5
Rerank-RAG	56.8	67.3	72.1	83.3	69.9	67.1	75.8	82.3	90.1	78.8
Rewriting	58.9	67.5	73.1	83.3	70.7	67.0	74.7	82.5	88.8	78.2
HyDE	61.0	69.3	71.4	82.9	71.1	67.8	76.5	82.6	89.4	79.1
MAIN-RAG	55.9	65.8	70.9	83.6	69.0	67.0	75.7	81.2	88.2	78.0
HCQR (ours)	63.4	73.0	77.9	85.8	75.0	68.6	79.7	85.4	90.6	81.1
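
As a sanity check on the table, the averages and the gains over Simple RAG can be recomputed from the per-model accuracies. The snippet below is a small sketch that copies the values from the table; the helper names (`mean`, `gain_over`, `best_in_every_column`) are illustrative.

```python
# Per-model accuracies copied from the table, in the order
# Llama3.2 3B, Llama3.1 8B, Qwen3 4B, Qwen3 30B.
RESULTS = {
    "CoT":        {"MedQA": [58.0, 69.0, 72.8, 84.1], "MMLU-Med": [66.3, 75.0, 82.9, 90.2]},
    "Simple RAG": {"MedQA": [55.6, 66.8, 71.3, 82.7], "MMLU-Med": [64.5, 74.7, 81.8, 88.9]},
    "Rerank-RAG": {"MedQA": [56.8, 67.3, 72.1, 83.3], "MMLU-Med": [67.1, 75.8, 82.3, 90.1]},
    "Rewriting":  {"MedQA": [58.9, 67.5, 73.1, 83.3], "MMLU-Med": [67.0, 74.7, 82.5, 88.8]},
    "HyDE":       {"MedQA": [61.0, 69.3, 71.4, 82.9], "MMLU-Med": [67.8, 76.5, 82.6, 89.4]},
    "MAIN-RAG":   {"MedQA": [55.9, 65.8, 70.9, 83.6], "MMLU-Med": [67.0, 75.7, 81.2, 88.2]},
    "HCQR":       {"MedQA": [63.4, 73.0, 77.9, 85.8], "MMLU-Med": [68.6, 79.7, 85.4, 90.6]},
}


def mean(values):
    """Arithmetic mean of a non-empty list of accuracies."""
    return sum(values) / len(values)


def gain_over(method, baseline, benchmark):
    """Difference in average accuracy (points), rounded to one decimal."""
    diff = mean(RESULTS[method][benchmark]) - mean(RESULTS[baseline][benchmark])
    return round(diff, 1)


def best_in_every_column(method):
    """True if `method` has the strictly highest accuracy for every model and benchmark."""
    for benchmark, scores in RESULTS[method].items():
        for i, score in enumerate(scores):
            rivals = [RESULTS[m][benchmark][i] for m in RESULTS if m != method]
            if score <= max(rivals):
                return False
    return True


print(gain_over("HCQR", "Simple RAG", "MedQA"))     # 5.9
print(gain_over("HCQR", "Simple RAG", "MMLU-Med"))  # 3.6
print(best_in_every_column("HCQR"))                 # True
```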

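HCQR achieves the highest accuracy among all compared methods for every model on both MedQA and MMLU-Med.
HCQR improves average accuracy over Simple RAG by 5.9 points on MedQA and by 3.6 points on MMLU-Med.
HCQR yields the highest decision-useful context ratio (DUR) among the compared methods.
The context-usefulness analysis shows that HCQR increases Entailed and Useful contexts while reducing Not Useful contexts.
Ablations show that all three rewritten queries contribute to the gains, with the answer-support query being especially influential.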
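
To make the usefulness analysis concrete, the sketch below computes a label distribution and a decision-useful ratio from per-context labels. The summary does not define DUR, so treating it as the share of retrieved contexts labeled Entailed or Useful is an assumption made here for illustration, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("Entailed", "Useful", "Not Useful")


def label_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Fraction of retrieved contexts in each of the three categories."""
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {name: counts[name] / total for name in LABELS}


def decision_useful_ratio(labels: Iterable[str]) -> float:
    """Assumed definition: share of contexts labeled Entailed or Useful."""
    dist = label_distribution(labels)
    return dist["Entailed"] + dist["Useful"]
```

For example, four retrieved contexts labeled Entailed, Useful, Not Useful, and Not Useful give a ratio of 0.5 under this assumed definition.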


This review was generated by AI and reviewed by human editors.