QUICK REVIEW

[Paper Review] GAMedX: Generative AI-based Medical Entity Data Extractor Using Large Language Models

Mohammed-Khalil Ghali, Abdelrahman Farrag | arXiv (Cornell University) | May 31, 2024

Topic Modeling | Cited by 5

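One-Sentence Summary

GAMedX uses open-source large language models (Mistral 7B, Gemma 7B) with a unified prompt and Pydantic schemas to extract medical entities from unstructured text, reaching ROUGE F1 scores of 97% to 98% on the Competition dataset and 57% to 63% on VAERS, where embedding-based semantic analysis supplements ROUGE.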

ABSTRACT

In the rapidly evolving field of healthcare and beyond, the integration of generative AI in Electronic Health Records (EHRs) represents a pivotal advancement, addressing a critical gap in current information extraction techniques. This paper introduces GAMedX, a Named Entity Recognition (NER) approach utilizing Large Language Models (LLMs) to efficiently extract entities from medical narratives and unstructured text generated throughout various phases of the patient hospital visit. By addressing the significant challenge of processing unstructured medical text, GAMedX leverages the capabilities of generative AI and LLMs for improved data extraction. Employing a unified approach, the methodology integrates open-source LLMs for NER, utilizing chained prompts and Pydantic schemas for structured output to navigate the complexities of specialized medical jargon. The findings reveal significant ROUGE F1 score on one of the evaluation datasets with an accuracy of 98%. This innovation enhances entity extraction, offering a scalable, cost-effective solution for automated forms filling from unstructured data. As a result, GAMedX streamlines the processing of unstructured narratives, and sets a new standard in NER applications, contributing significantly to theoretical and practical advancements beyond the medical technology sphere.

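Research Motivation and Objectives

Address the challenge of extracting structured information from unstructured medical narratives and records.
Build a named entity recognition (NER) system with a unified output structure using open-source LLMs.
Ensure cost-efficient integration suited to healthcare IT environments.
Evaluate the approach on real and synthetic medical datasets while accounting for privacy constraints.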

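Proposed Method

Wrap open-source LLMs with a unified prompt and Pydantic schemas so that they emit structured NER output.
Chunk documents with the LangChain recursive text splitter and translate non-English text into English.
Apply in-context learning (one-shot and few-shot) with Mistral 7B and Gemma 7B, optimizing the prompts for medical data extraction.
Evaluate with ROUGE-1 F1 and ROUGE-L F1; for VAERS, add semantic analysis using embeddings (BGE and Instruct Embeddings) with t-SNE visualization.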
Compare one-shot and few-shot settings to assess robustness across datasets.
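
The combination of a Pydantic schema with a one-shot or few-shot prompt can be pictured with the minimal sketch below. It is not the authors' code: the `PatientRecord` fields, the prompt wording, the helper names `build_prompt` and `parse_output`, and the sample narratives are all hypothetical, and no model is actually called (the `raw` string stands in for whatever text an LLM might return).

```python
import json
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class PatientRecord(BaseModel):
    """Hypothetical target schema; the field names are illustrative only."""
    patient_name: Optional[str] = None
    age: Optional[int] = None
    symptoms: List[str] = []
    medications: List[str] = []


def build_prompt(narrative: str, examples: List[dict]) -> str:
    """Assemble a one-shot (one example) or few-shot (several) prompt."""
    field_names = ", ".join(PatientRecord.__annotations__)
    parts = [
        "Extract the following fields from the medical narrative and "
        f"answer with a single JSON object: {field_names}."
    ]
    for ex in examples:
        parts.append(f"Narrative: {ex['narrative']}")
        parts.append(f"JSON: {json.dumps(ex['output'])}")
    parts.append(f"Narrative: {narrative}")
    parts.append("JSON:")
    return "\n".join(parts)


def parse_output(raw: str) -> Optional[PatientRecord]:
    """Validate raw model text against the schema; return None on failure."""
    try:
        return PatientRecord(**json.loads(raw))
    except (json.JSONDecodeError, TypeError, ValidationError):
        return None


# One worked example makes this a one-shot prompt; add more for few-shot.
examples = [{
    "narrative": "Jane Doe, 45, reports headache; takes ibuprofen.",
    "output": {"patient_name": "Jane Doe", "age": 45,
               "symptoms": ["headache"], "medications": ["ibuprofen"]},
}]
prompt = build_prompt("John Roe, 60, has a fever.", examples)

# Stand-in for LLM output; a real pipeline would send `prompt` to the model.
raw = ('{"patient_name": "John Roe", "age": 60, '
       '"symptoms": ["fever"], "medications": []}')
record = parse_output(raw)
```

Validating against the schema means malformed or incomplete model output is rejected at one point in the pipeline rather than propagating into downstream form filling.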

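Experimental Results

Research Questions

RQ1: Can open-source LLMs with a unified prompt achieve high-accuracy medical NER on unstructured narratives?
RQ2: How do one-shot and few-shot prompting affect extraction quality on medical records and VAERS reports?
RQ3: How do ROUGE-based analysis and semantic embedding analysis differ in assessing extraction performance on VAERS?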

Key Findings

Model	Strategy	Competition dataset ROUGE-1 F1	Competition dataset ROUGE-L F1	VAERS dataset ROUGE-1 F1	VAERS dataset ROUGE-L F1
Mistral	One Shot	97%	98%	58%	57%
Mistral	Few Shots	98%	98%	63%	62%
Gemma	One Shot	97%	97%	60%	59%
Gemma	Few Shots	98%	98%	63%	62%
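
For readers unfamiliar with the two metrics in the table, the sketch below computes ROUGE-1 F1 (unigram overlap) and ROUGE-L F1 (longest common subsequence) from scratch. It is a simplified illustration, assuming lowercase whitespace tokenization with no stemming; established ROUGE packages differ in such details, and the source does not say which implementation was used. The sample sentences are made up.

```python
from collections import Counter
from typing import List


def _f1(overlap: int, ref_len: int, cand_len: int) -> float:
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or ref_len == 0 or cand_len == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def _tokens(text: str) -> List[str]:
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between reference and candidate."""
    ref, cand = _tokens(reference), _tokens(candidate)
    overlap = sum((Counter(ref) & Counter(cand)).values())
    return _f1(overlap, len(ref), len(cand))


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: based on the longest in-order common token subsequence."""
    ref, cand = _tokens(reference), _tokens(candidate)
    return _f1(lcs_length(ref, cand), len(ref), len(cand))


reference = "patient reports fever and headache"
candidate = "patient reports headache and fever"
r1 = rouge1_f1(reference, candidate)   # all five unigrams match
rl = rouge_l_f1(reference, candidate)  # only three tokens match in order
```

In this example the candidate contains every reference word but in a different order, so ROUGE-1 F1 is 1.0 while ROUGE-L F1 is 0.6.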

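On the Competition dataset, ROUGE-1 F1 and ROUGE-L F1 range from 97% to 98% across both models and both one-shot and few-shot settings.
On the VAERS dataset, ROUGE-1 F1 and ROUGE-L F1 range from 57% to 63%, depending on the model and shot setting.
Few-shot prompting yields a modest gain on VAERS for both models: ROUGE-1 F1 rises from 58% to 63% for Mistral 7B and from 60% to 63% for Gemma 7B.
Semantic analysis using BGE and Instruct Embeddings with t-SNE helps interpret the VAERS outputs beyond what ROUGE metrics capture.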
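
Embedding-based analysis compares texts by the vectors an embedding model assigns to them, not by their shared words. The sketch below shows cosine similarity, a measure commonly used for such comparisons; whether the paper uses exactly this measure is an assumption here. The three vectors are made-up placeholders, not real BGE or Instruct embeddings, which have hundreds of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Placeholder vectors standing in for embeddings of three texts.
reference_vec = [0.20, 0.70, 0.10]   # ground-truth field value
extracted_vec = [0.25, 0.65, 0.10]   # model extraction with similar meaning
unrelated_vec = [0.90, 0.00, -0.40]  # unrelated text

close = cosine_similarity(reference_vec, extracted_vec)
far = cosine_similarity(reference_vec, unrelated_vec)
```

An extraction that paraphrases the reference can score low on ROUGE yet sit close to it in embedding space, which is why the two kinds of analysis can disagree on free-text reports.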

This review was generated by AI and reviewed by a human editor.