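[Paper Explained] SGPT: GPT Sentence Embeddings for Semantic Search
SGPT shows how decoder-only GPT models can produce high-quality sentence embeddings for semantic search. It achieves state-of-the-art results on BEIR with far fewer parameters than competing very large models, using bias-only fine-tuning (BitFit) for Bi-Encoders and prompt-based log-probability scoring for Cross-Encoders.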
Decoder transformers have continued increasing in scale reaching hundreds of billions of parameters. Due to their scale the same decoder sets state-of-the-art results on various language tasks via prompting or fine-tuning. Yet, these large foundation models remain unusable for the related fields of semantic search and sentence embeddings. This prevents possibly new state-of-the-art results and forces organizations to train and maintain separate models. To this end, we propose SGPT to use decoders for sentence embeddings and semantic search via prompting or fine-tuning. At 5.8 billion parameters SGPT improves on the previously best sentence embeddings by a margin of 7% and outperforms a concurrent method with 175 billion parameters as measured on the BEIR search benchmark. Code, models and result files are freely available at https://github.com/Muennighoff/sgpt.
Motivation and Objectives
- Motivate the use of decoder-only transformers for semantic search and sentence embeddings.
- Develop SGPT-BE (Bi-Encoder) with position-weighted pooling and BitFit bias-only fine-tuning.
- Develop SGPT-CE (Cross-Encoder) using log-probability extraction from pre-trained GPT models.
- Evaluate SGPT variants on BEIR and USEB benchmarks across asymmetric and symmetric search tasks.
- Release open-source code and models for practitioners.
Proposed Method
- Use decoder-only transformers to generate sentence embeddings for semantic search.
- In SGPT-BE, apply position-weighted mean pooling to hidden states.
- Fine-tune only bias parameters (BitFit) and freeze the rest of the model.
- In SGPT-CE, extract log probabilities from pre-trained GPT models via prompting for unsupervised cross-encoder scoring.
- Evaluate across asymmetric and symmetric search benchmarks (BEIR, USEB) and compare to encoder-based baselines and OpenAI endpoints.
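The position-weighted mean pooling step in SGPT-BE can be illustrated with a small sketch. It assumes linearly increasing weights, w_i = i / (1 + 2 + ... + S) for token positions i = 1..S, so that later tokens, which have attended to more of the sequence in a causal decoder, contribute more to the embedding. The function name and the toy numbers are illustrative only and are not taken from the SGPT codebase.

```python
from typing import List, Sequence


def position_weighted_mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Pool per-token vectors into one embedding, weighting later tokens more.

    Assumption: weights are w_i = i / (1 + 2 + ... + S) for positions i = 1..S.
    """
    seq_len = len(hidden_states)
    if seq_len == 0:
        raise ValueError("need at least one token")
    dim = len(hidden_states[0])
    total = seq_len * (seq_len + 1) / 2  # closed form of 1 + 2 + ... + S
    pooled = [0.0] * dim
    for position, vector in enumerate(hidden_states, start=1):
        weight = position / total
        for d in range(dim):
            pooled[d] += weight * vector[d]
    return pooled


# Toy input: three tokens with two hidden dimensions (made-up values).
toy_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
embedding = position_weighted_mean_pool(toy_states)
```

With three tokens the weights are 1/6, 2/6 and 3/6, so the last token accounts for half of the pooled vector. A plain mean would instead give each token a weight of 1/3.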
Experimental Results
Research Questions
- RQ1: Can decoder-only GPT models produce competitive sentence embeddings for semantic search when fine-tuned selectively?
- RQ2: Which pooling strategy yields the best embeddings for GPT-based Bi-Encoders in semantic search?
- RQ3: How does bias-only fine-tuning (BitFit) compare to full fine-tuning in SGPT-BE versus SBERT baselines?
- RQ4: How do SGPT-CE and SGPT-BE scale in performance with model size on the BEIR and USEB datasets?
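To make the contrast in RQ3 concrete, the sketch below shows what "bias-only" selection means in terms of trainable parameters. The parameter names and shapes are a hypothetical toy table, not the layout of any real GPT checkpoint, and the helper names are invented for this example.

```python
from typing import Dict, List, Tuple


def bitfit_trainable_names(param_shapes: Dict[str, Tuple[int, ...]]) -> List[str]:
    """Return the parameter names that stay trainable under bias-only tuning."""
    return [name for name in param_shapes if name.endswith("bias")]


def count_elements(shape: Tuple[int, ...]) -> int:
    """Number of scalar values in a tensor of the given shape."""
    n = 1
    for size in shape:
        n *= size
    return n


# Hypothetical parameter table for one tiny transformer block.
toy_params = {
    "attn.qkv.weight": (12, 4),
    "attn.qkv.bias": (12,),
    "mlp.fc.weight": (16, 4),
    "mlp.fc.bias": (16,),
    "ln.weight": (4,),
    "ln.bias": (4,),
}

trainable = bitfit_trainable_names(toy_params)
trainable_count = sum(count_elements(toy_params[name]) for name in trainable)
total_count = sum(count_elements(shape) for shape in toy_params.values())
```

In this toy table only 32 of 148 values would receive gradient updates. In a PyTorch model the same filter would typically be applied by setting `requires_grad` to `False` on every parameter whose name does not end in `bias`.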
Key Findings
- SGPT-BE-5.8B with position-weighted mean pooling and BitFit achieves state-of-the-art results on BEIR and USEB among sentence embedding methods of comparable size and training setting.
- SGPT-CE-6.1B, using log probabilities with prompts, achieves unsupervised state-of-the-art performance on BEIR, though higher parameter counts increase latency.
- At 5.8B parameters, SGPT-BE improves on the previous best sentence embeddings by a margin of about 7% on BEIR.
- SGPT-CE-6.1B reaches around 80% of the maximum possible performance for re-ranking Top-100, illustrating scale benefits under re-ranking bottlenecks.
- Compared to OpenAI endpoints, SGPT variants provide competitive or superior results in many BEIR and USEB tasks while offering open-source alternatives and full control over prompts and re-ranking strategy.
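The SGPT-CE re-ranking setup referred to above can be sketched as follows: each candidate document is placed in a prompt, the log probabilities of the query tokens are summed, and the top candidates are re-sorted by that score. The prompt wording, the `LogProbFn` interface, and `toy_log_prob_fn` are all assumptions made for this sketch; in practice the per-token log probabilities would come from a pre-trained GPT model rather than the word-overlap stand-in used here.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: (prompt prefix, continuation) -> per-token log probabilities.
LogProbFn = Callable[[str, str], List[float]]


def score_pair(document: str, query: str, log_prob_fn: LogProbFn) -> float:
    """Sum the log probabilities of the query tokens given a document prompt."""
    # Illustrative prompt template; the exact wording is an assumption.
    prefix = f'Document: "{document}"\nA relevant search query for it is: '
    return sum(log_prob_fn(prefix, query))


def rerank(query: str, candidates: Sequence[str], log_prob_fn: LogProbFn,
           top_k: int = 100) -> List[Tuple[str, float]]:
    """Re-score the first top_k candidates and sort them by descending score."""
    head = list(candidates[:top_k])
    scored = [(doc, score_pair(doc, query, log_prob_fn)) for doc in head]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


def toy_log_prob_fn(prefix: str, continuation: str) -> List[float]:
    """Stand-in for a language model: words already in the prefix count as likely."""
    seen = set(prefix.lower().replace('"', " ").split())
    return [math.log(0.5) if word in seen else math.log(0.01)
            for word in continuation.lower().split()]


candidates = ["stock markets fell today", "cats purr when happy"]
ranked = rerank("why do cats purr", candidates, toy_log_prob_fn)
```

Because the model is only prompted and never fine-tuned, this kind of scoring is unsupervised. It also bounds the final quality by what the first-stage retriever placed in the top `top_k`, which is the re-ranking bottleneck mentioned in the findings.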
This walkthrough was generated by AI and reviewed by a human editor.