QUICK REVIEW

[Paper Review] ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models

Sheng Wang, Zihao Zhao|arXiv (Cornell University)|Feb 14, 2023
Radiomics and Machine Learning in Medical Imaging | Cited by 78
One-Sentence Summary

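ChatCAD combines large language models with medical-image CAD networks by translating their visual outputs into text, which enables LLM-driven summaries, interactive explanations, and treatment guidance for radiology reports.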

ABSTRACT

Large language models (LLMs) have recently demonstrated their potential in clinical applications, providing valuable medical knowledge and advice. For example, a large dialog LLM like ChatGPT has successfully passed part of the US medical licensing exam. However, LLMs currently have difficulty processing images, making it challenging to interpret information from medical images, which are rich in information that supports clinical decisions. On the other hand, computer-aided diagnosis (CAD) networks for medical images have seen significant success in the medical field by using advanced deep-learning algorithms to support clinical decision-making. This paper presents a method for integrating LLMs into medical-image CAD networks. The proposed framework uses LLMs to enhance the output of multiple CAD networks, such as diagnosis networks, lesion segmentation networks, and report generation networks, by summarizing and reorganizing the information presented in natural language text format. The goal is to merge the strengths of LLMs' medical domain knowledge and logical reasoning with the vision understanding capability of existing medical-image CAD models to create a more user-friendly and understandable system for patients compared to conventional CAD systems. In the future, LLM's medical knowledge can be also used to improve the performance of vision-based medical-image CAD models.

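Motivation and Objectives

  • Motivate combining LLMs with vision-based CAD systems so that medical knowledge and reasoning can improve radiology reports.
  • Translate CAD outputs into text to bridge vision and language and make LLM reasoning possible.
  • Improve report quality and give patients interactive explanations and medical advice.
  • Demonstrate improved report generation over recent state-of-the-art work on a chest X-ray dataset.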

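Proposed Method

  • Process chest X-ray images with several CAD networks (classification, lesion segmentation, and report generation).
  • Convert the CAD outputs (tensors and masks) into natural-language descriptions that form the prompt.
  • Use an LLM (GPT-3/ChatGPT) to summarize the results across networks and produce a polished radiology report.
  • Design prompts that convert scores into severity descriptions so that the wording matches clinical language.
  • Evaluate report quality against baselines on the MIMIC-CXR dataset with CheXpert labels, using precision, recall, and F1.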
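The score-to-severity step above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the thresholds, the phrase wording, the "Network A/B" labels, and the helper names `describe_score` and `build_prompt` are all assumptions made for the example, since this review does not state them.

```python
from typing import Dict

# Hypothetical cut-offs and phrases: the review only says that scores are
# converted into severity descriptions, not which bands are used.
SEVERITY_BANDS = [
    (0.2, "No sign of"),
    (0.5, "Small possibility of"),
    (0.9, "Likely to have"),
]
TOP_PHRASE = "Definitely has"


def describe_score(finding: str, score: float) -> str:
    """Map a classifier probability in [0, 1] to a severity sentence."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    for upper_bound, phrase in SEVERITY_BANDS:
        if score < upper_bound:
            return f"{phrase} {finding}."
    return f"{TOP_PHRASE} {finding}."


def build_prompt(scores: Dict[str, float], draft_report: str) -> str:
    """Assemble a text prompt from classifier scores and a draft report."""
    findings = "\n".join(describe_score(name, s) for name, s in scores.items())
    return (
        "Network A (classifier) findings:\n"
        f"{findings}\n"
        "Network B (report generator) draft:\n"
        f"{draft_report}\n"
        "Revise the draft report using the findings of Network A."
    )


if __name__ == "__main__":
    example_scores = {"cardiomegaly": 0.70, "edema": 0.95, "consolidation": 0.10}
    print(build_prompt(example_scores, "The heart size is mildly enlarged."))
```

The resulting string would then be sent to whichever LLM is in use; that call is left out here because it depends on the provider and model.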
Figure 2 : Interactive CAD with LLMs. This example uses the ChatGPT as LLM.

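Experimental Results

Research Questions

  • RQ1: Can LLMs improve radiology report quality when they receive structured outputs from multiple CAD networks?
  • RQ2: How does prompt design affect LLM-driven report quality and diagnostic accuracy?
  • RQ3: How do LLMs of different sizes (and ChatGPT) affect diagnostic performance metrics?
  • RQ4: Can interactive LLM dialogue grounded in image findings provide useful explanations and treatment advice?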

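Key Findings

  • ChatCAD outperforms two state-of-the-art report generation baselines on the diagnostic performance metric (F1) across five chest X-ray observations.
  • GPT-3-based prompting achieves higher average F1 and recall across the five observations than CvT2DistilGPT2 and R2GenCMN, with notable gains on Edema and Consolidation.
  • ChatGPT reaches an average F1 of 0.605, ahead of text-davinci-003 (0.591) and smaller models (for example, smaller GPT-3 sizes average 0.471–0.508).
  • Larger LLMs produce longer, more capable reports and better diagnostic performance, which highlights the role of model scale in medical reasoning tasks.
  • ChatCAD enables interactive explanations and medical-advice-style dialogue, potentially reducing the cost of clinic visits and improving the online healthcare experience.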
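For readers who want to see how per-observation precision, recall, and F1 figures like those above are typically computed, here is a minimal sketch. It assumes binary labels (1 = observation present) have already been extracted from both the reference and the generated reports, for instance by a CheXpert-style labeler. The function names `prf1` and `mean_f1` and the toy data are hypothetical and are not results from the paper.

```python
from typing import Dict, List, Tuple


def prf1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one observation with binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Convention used here: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_f1(reference: Dict[str, List[int]], generated: Dict[str, List[int]]) -> float:
    """Average the per-observation F1 over all observations in `reference`."""
    f1_values = [prf1(reference[name], generated[name])[2] for name in reference]
    return sum(f1_values) / len(f1_values)


if __name__ == "__main__":
    # Toy labels for three reports; illustrative only.
    reference = {"Edema": [1, 0, 1], "Consolidation": [0, 1, 0]}
    generated = {"Edema": [1, 0, 0], "Consolidation": [0, 1, 0]}
    for name in reference:
        p, r, f = prf1(reference[name], generated[name])
        print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
    print(f"mean F1 = {mean_f1(reference, generated):.3f}")
```

Averaging F1 per observation first (a macro average) gives each observation equal weight regardless of how often it occurs; whether the reported averages were computed this way is an assumption.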
Figure 3 : Prompts that bridge between tensor and text. We show three different prompt designs.

This review was generated by AI and reviewed by human editors.