QUICK REVIEW

[Paper Review] Training-Induced Bias Toward LLM-Generated Content in Dense Retrieval

William Xion, Nejdl Wolfgang|arXiv (Cornell University)|Feb 11, 2026

Topic Modeling | Citations: 0

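One-Line Summary

Summary: This paper shows that source bias is not intrinsic to dense retrievers but arises from supervised fine-tuning, particularly on MS MARCO or LLM-generated data, and that perplexity is insufficient to explain the bias.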

ABSTRACT

Dense retrieval is a promising approach for acquiring relevant context or world knowledge in open-domain natural language processing tasks and is now widely used in information retrieval applications. However, recent reports claim a broad preference for text generated by large language models (LLMs). This bias is called "source bias", and it has been hypothesized that lower perplexity contributes to this effect. In this study, we revisit this claim by conducting a controlled evaluation to trace the emergence of such preferences across training stages and data sources. Using parallel human- and LLM-generated counterparts of the SciFact and Natural Questions (NQ320K) datasets, we compare unsupervised checkpoints with models fine-tuned using in-domain human text, in-domain LLM-generated text, and MS MARCO. Our results show the following: 1) Unsupervised retrievers do not exhibit a uniform pro-LLM preference. The direction and magnitude depend on the dataset. 2) Across the settings tested, supervised fine-tuning on MS MARCO consistently shifts the rankings toward LLM-generated text. 3) In-domain fine-tuning produces dataset-specific and inconsistent shifts in preference. 4) Fine-tuning on LLM-generated corpora induces a pronounced pro-LLM bias. Finally, a retriever-centric perplexity probe involving the reattachment of a language modeling head to the fine-tuned dense retriever encoder indicates agreement with relevance near chance, thereby weakening the explanatory power of perplexity. Our study demonstrates that source bias is a training-induced phenomenon rather than an inherent property of dense retrievers.

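Motivation and Objectives

Assess whether unsupervised dense retrievers inherently prefer LLM-generated content, or whether the bias emerges during training.
Examine how different fine-tuning corpora (MS MARCO, in-domain human-written data, and in-domain LLM-generated data) affect retrieval preferences.
Test the perplexity-based explanation by measuring retriever-centric perplexity and its agreement with relevance.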

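Proposed Method

Evaluate several dense retriever families (E5, Contriever, AugTriever) at each training stage: unsupervised, MS MARCO fine-tuned, and in-domain fine-tuned (human-written and LLM-generated).
Quantify source bias with the Relative Delta metric, using human-written passages and Llama2-generated passages from SciFact and NQ320K.
For comparability, fine-tune on 4 GPUs with the standard InfoNCE contrastive loss and fixed hyperparameters.
Reattach a language modeling head to the retriever encoder, measure Perplexity-Relevance Agreement (PRA), and compare it with the relevance signal.
Analyze how the bias evolves across training stages and corpora, and reconsider perplexity as an explanatory factor.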
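
To make the two quantitative pieces of the method concrete, here is a minimal Python sketch. It rests on assumptions: this review does not spell out either formula, so `relative_delta` uses the form common in earlier source-bias work (the human-minus-LLM score difference divided by the mean of the two, as a percentage), and `info_nce_loss` is the textbook single-query InfoNCE with dot-product similarity. The function names, the `temperature` argument, and the example values are illustrative and are not taken from the paper.

```python
import math


def relative_delta(metric_human: float, metric_llm: float) -> float:
    """Relative difference, in percent, between human and LLM retrieval scores.

    Assumed form: (human - llm) / mean(human, llm) * 100.
    Negative values mean the retriever scores LLM-generated text higher.
    """
    mean = (metric_human + metric_llm) / 2.0
    if mean == 0:
        return 0.0  # both scores are zero, so there is no preference to report
    return (metric_human - metric_llm) / mean * 100.0


def info_nce_loss(query, positive, negatives, temperature):
    """Textbook InfoNCE loss for one query with dot-product similarity.

    query, positive: embeddings as lists of floats.
    negatives: list of embeddings.
    temperature: softmax temperature (illustrative; the paper's value is not given here).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # The positive passage sits at index 0, followed by the negatives.
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]

    # log-sum-exp with max subtraction for numerical stability
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))

    # -log softmax probability assigned to the positive passage
    return log_denominator - logits[0]
```

Under this assumed sign convention, `relative_delta(0.4, 0.6)` is negative, which would be read as a pro-LLM preference, while a positive value would indicate a pro-human preference. In practice the loss would be computed over batches of encoder outputs in a deep learning framework; the pure-Python version only shows the arithmetic.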

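Experimental Results

Research Questions

RQ1: Do unsupervised dense retrievers show a consistent pro-LLM bias, or does the bias arise mainly during supervised fine-tuning?
RQ2: How does fine-tuning on MS MARCO, in-domain human-written data, or in-domain LLM-generated data affect the direction and magnitude of retrieval preference?
RQ3: Can perplexity-based explanations (including retriever-centric perplexity) account for the observed source bias?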

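Key Findings

Unsupervised retrievers show dataset-dependent, inconsistent bias rather than a universal pro-LLM preference.
Supervised fine-tuning on MS MARCO consistently shifts rankings toward LLM-generated text across the settings tested.
In-domain fine-tuning produces dataset- and model-dependent shifts, which can be pro-human, pro-LLM, or mixed.
Fine-tuning on LLM-generated corpora induces a strong pro-LLM bias across datasets.
The retriever-centric perplexity probe agrees with relevance at near-chance levels, so perplexity is hard to treat as a robust predictor of the bias.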
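
To illustrate what "agreement near chance" could mean, the sketch below computes a pairwise agreement rate between perplexity and relevance. This is an assumption: the review does not state how PRA is defined, so the function is one plausible reading (the fraction of document pairs in which the lower-perplexity document is also the more relevant one), not the paper's formula. The function name and input layout are hypothetical.

```python
def perplexity_relevance_agreement(pairs):
    """Fraction of document pairs where the lower-perplexity document
    is also the one judged more relevant.

    pairs: list of ((ppl_a, rel_a), (ppl_b, rel_b)) tuples.
    Pairs with tied perplexity or tied relevance are skipped.
    Returns None when no comparable pair exists.
    """
    agree = 0
    total = 0
    for (ppl_a, rel_a), (ppl_b, rel_b) in pairs:
        if ppl_a == ppl_b or rel_a == rel_b:
            continue  # a tie carries no preference signal
        total += 1
        lower_ppl_is_a = ppl_a < ppl_b
        more_relevant_is_a = rel_a > rel_b
        if lower_ppl_is_a == more_relevant_is_a:
            agree += 1
    return agree / total if total else None
```

Under this reading, a score close to 0.5 means that knowing which document has lower perplexity says almost nothing about which one is relevant, which is the pattern the finding above describes.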


This review was generated by AI and checked by a human editor.