QUICK REVIEW

[Paper Review] Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research

Sreyoshi Bhaduri, Satya Kapoor|arXiv (Cornell University)|Aug 20, 2024

Human Resource and Talent Management|Cited by 5

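One-Sentence Summary

The paper shows how retrieval-augmented generation (RAG) with a large language model can act as a novice qualitative research assistant for topic modeling of semi-structured interview data in talent management. It reports that RAG outperforms the other prompting methods tested and offers guidelines for preserving the rigor of traditional qualitative research.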

ABSTRACT

Qualitative data collection and analysis approaches, such as those employing interviews and focus groups, provide rich insights into customer attitudes, sentiment, and behavior. However, manually analyzing qualitative data requires extensive time and effort to identify relevant topics and thematic insights. This study proposes a novel approach to address this challenge by leveraging Retrieval Augmented Generation (RAG) based Large Language Models (LLMs) for analyzing interview transcripts. The novelty of this work lies in strategizing the research inquiry as one that is augmented by an LLM that serves as a novice research assistant. This research explores the mental model of LLMs to serve as novice qualitative research assistants for researchers in the talent management space. A RAG-based LLM approach is extended to enable topic modeling of semi-structured interview data, showcasing the versatility of these models beyond their traditional use in information retrieval and search. Our findings demonstrate that the LLM-augmented RAG approach can successfully extract topics of interest, with significant coverage compared to manually generated topics from the same dataset. This establishes the viability of employing LLMs as novice qualitative research assistants. Additionally, the study recommends that researchers leveraging such models lean heavily on quality criteria used in traditional qualitative research to ensure rigor and trustworthiness of their approach. Finally, the paper presents key recommendations for industry practitioners seeking to reconcile the use of LLMs with established qualitative research paradigms, providing a roadmap for the effective integration of these powerful, albeit novice, AI tools in the analysis of qualitative datasets within talent

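Research Motivation and Objectives

Advance the combination of qualitative insight with scalable AI tools in talent management research.
Demonstrate a practical RAG-based LLM workflow for analyzing semi-structured interview transcripts.
Benchmark LLM-based topic modeling against manual qualitative coding using standard metrics.
Highlight methodological considerations for maintaining rigor, trustworthiness, and transparency in AI-assisted qualitative analysis.
Give practitioners actionable recommendations for reconciling AI tools with qualitative paradigms.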

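Proposed Method

Uses LangChain to build dynamic prompts (few-shot, chain-of-thought) that guide an LLM (Anthropic Claude 2) to perform topic modeling on the transcripts.
Applies four prompting strategies: zero-shot, few-shot, chain-of-thought, and retrieval-augmented generation (RAG).
Treats the LLM as a novice researcher, constraining prompts with expert guidance and iterative topic extraction.
Uses the interview transcripts as a custom knowledge base for RAG to prevent information overload and reduce hallucination.
Evaluates the generated topics against a human-coded gold standard using precision, recall, and F1 with cosine-based word-level matching.
Compares the robustness of the embedding models (DistilBert, BERT, Roberta) across prompting strategies.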
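To make the four strategies concrete, here is a minimal sketch in plain Python. It is not the authors' LangChain pipeline: the role and task wording, the `build_prompt` and `retrieve` helpers, and the word-overlap retriever are all hypothetical stand-ins chosen so the example runs without external libraries or an LLM call. The point it illustrates is only the structural difference between the strategies: zero-shot, few-shot, and chain-of-thought pass the full transcript, while RAG passes only the retrieved excerpts.

```python
import re

# Hypothetical role and task wording; the paper's actual prompts are not shown here.
ROLE = "You are a novice qualitative research assistant guided by an expert researcher."
TASK = "List the main topics discussed in the interview excerpts below."
COT_HINT = "Think step by step: summarize each answer, then group the summaries into topics."


def tokenize(text):
    """Lowercase a string and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(chunks, query, k=2):
    """Toy retriever (an assumption): rank chunks by how many query words they share."""
    query_terms = set(tokenize(query))
    scored = [(len(query_terms & set(tokenize(c))), i) for i, c in enumerate(chunks)]
    # Highest overlap first; ties keep transcript order.
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:k]
    return [chunks[i] for _, i in top]


def build_prompt(strategy, chunks, query="", examples=(), k=2):
    """Assemble a prompt string for one of the four strategies."""
    parts = [ROLE, TASK]
    if strategy == "zero-shot":
        context = chunks
    elif strategy == "few-shot":
        # Each example is an (excerpt, topics) pair supplied by the researcher.
        parts += [f"Example excerpt: {e}\nExample topics: {t}" for e, t in examples]
        context = chunks
    elif strategy == "chain-of-thought":
        parts.append(COT_HINT)
        context = chunks
    elif strategy == "rag":
        # Only the retrieved chunks reach the model, which limits context size.
        context = retrieve(chunks, query, k)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    parts.append("Excerpts:\n" + "\n".join(context))
    return "\n\n".join(parts)
```

A real setup would likely replace `retrieve` with an embedding-based vector store and send the resulting string to the model; the sketch stops at prompt assembly.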

Figure 1. Comparison across prompting approaches

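Experimental Results

Research Questions

RQ1: Can an LLM with RAG reliably extract topics from semi-structured interview data in talent management?
RQ2: How do the prompting strategies (zero-shot, few-shot, chain-of-thought, RAG) compare on precision, recall, and F1 for topic identification?
RQ3: Does treating the LLM as a novice researcher help improve prompt design and topic interpretability?
RQ4: When combining LLMs with traditional qualitative analysis, which best practices ensure rigor and trustworthiness?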
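The precision, recall, and F1 comparison in RQ2 depends on deciding when a generated topic "matches" a human-coded one. The sketch below shows one plausible way to do that with cosine similarity. It is an assumption, not the paper's procedure: the `embed` function is a toy bag-of-words stand-in for a BERT-family embedding, and the `threshold` of 0.5 and the `score_topics` helper are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_topics(generated, gold, threshold=0.5):
    """Return (precision, recall, f1) for generated topics against gold topics.

    A topic counts as matched when its best cosine similarity to any topic
    on the other side reaches the (hypothetical) threshold.
    """
    gen_vecs = [embed(t) for t in generated]
    gold_vecs = [embed(t) for t in gold]
    matched_gen = sum(
        1 for g in gen_vecs if any(cosine(g, h) >= threshold for h in gold_vecs)
    )
    matched_gold = sum(
        1 for h in gold_vecs if any(cosine(g, h) >= threshold for g in gen_vecs)
    )
    precision = matched_gen / len(generated) if generated else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Swapping `embed` for a real embedding model changes the similarity scores but not the counting logic, which is one way to read the paper's comparison of embedding models under a fixed set of prompts.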

Key Findings

Embedding model	Prompting technique	Precision (%)	Recall (%)	F1 score (%)
Distilbert-base-uncased	Chain-of-thought	67	62	64
Distilbert-base-uncased	Few-shot	72	67	70
Distilbert-base-uncased	Zero-shot	68	66	67
Distilbert-base-uncased	RAG	79	80	79
Bert-base-uncased	Chain-of-thought	56	48	52
Bert-base-uncased	Few-shot	64	56	60
Bert-base-uncased	Zero-shot	59	55	57
Bert-base-uncased	RAG	70	70	70
Roberta-large	Chain-of-thought	89	85	87
Roberta-large	Few-shot	90	87	88
Roberta-large	Zero-shot	89	86	88
Roberta-large	RAG	92	91	91

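RAG consistently outperforms zero-shot, few-shot, and chain-of-thought prompting on precision, recall, and F1 across all embedding models.
Roberta-large with RAG achieves the highest scores: 92% precision, 91% recall, and 91% F1.
Across models, RAG delivers stronger topic identification and richer, more contextualized topics than LDA-based approaches.
Treating the LLM as a novice researcher helps in designing prompts that better guide topic extraction and interpretation.
A focused retrieval strategy mitigates information overload and hallucination, improving agreement with the gold-standard topics.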
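The first two findings can be checked directly against the reported table. The short script below is only a sanity check on the published, rounded figures: it confirms that RAG leads on every metric within each embedding model, and that each reported F1 is within one percentage point of the harmonic mean of the rounded precision and recall. The helper names are hypothetical, and small gaps are expected because the paper presumably computed F1 from unrounded values.

```python
# (embedding model, prompting technique, precision %, recall %, reported F1 %)
ROWS = [
    ("Distilbert-base-uncased", "Chain-of-thought", 67, 62, 64),
    ("Distilbert-base-uncased", "Few-shot", 72, 67, 70),
    ("Distilbert-base-uncased", "Zero-shot", 68, 66, 67),
    ("Distilbert-base-uncased", "RAG", 79, 80, 79),
    ("Bert-base-uncased", "Chain-of-thought", 56, 48, 52),
    ("Bert-base-uncased", "Few-shot", 64, 56, 60),
    ("Bert-base-uncased", "Zero-shot", 59, 55, 57),
    ("Bert-base-uncased", "RAG", 70, 70, 70),
    ("Roberta-large", "Chain-of-thought", 89, 85, 87),
    ("Roberta-large", "Few-shot", 90, 87, 88),
    ("Roberta-large", "Zero-shot", 89, 86, 88),
    ("Roberta-large", "RAG", 92, 91, 91),
]


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(rows):
    """Largest absolute gap between reported F1 and F1 recomputed from rounded P and R."""
    return max(abs(harmonic_f1(p, r) - f1) for _, _, p, r, f1 in rows)


def rag_leads_everywhere(rows):
    """True if RAG beats every other technique on all three metrics for every model."""
    for model in {row[0] for row in rows}:
        rag = next(row[2:] for row in rows if row[0] == model and row[1] == "RAG")
        others = [row[2:] for row in rows if row[0] == model and row[1] != "RAG"]
        if not all(rag[i] > other[i] for other in others for i in range(3)):
            return False
    return True


if __name__ == "__main__":
    print("RAG leads on every metric:", rag_leads_everywhere(ROWS))
    print("Max F1 gap (percentage points):", round(max_f1_gap(ROWS), 2))
```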

Figure 2. Sample of the interview transcript


This review was generated by AI and reviewed by human editors.