QUICK REVIEW

[Paper Review] Feature Extraction for Machine Learning Based Crackle Detection in Lung Sounds from a Health Survey

Morten Grønnesby, Juan Carlos Aviles Solis|arXiv (Cornell University)|May 31, 2017
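Phonocardiography and Auscultation Techniques|References: 21|Cited by: 27
One-Sentence Summary

This paper proposes a low-dimensional, computationally efficient machine learning pipeline for automatically detecting crackles in lung sounds recorded with a stethoscope in a large health survey. It extracts a 5-dimensional feature vector that combines time-domain and frequency-domain features from each audio window and classifies it with an SVM using an RBF kernel. The method reaches 86% precision and 84% recall, which the authors report as more accurate than results found in studies of health personnel, and it is fast enough to run on standard hardware.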

ABSTRACT

In recent years, many innovative solutions for recording and viewing sounds from a stethoscope have become available. However, to fully utilize such devices, there is a need for an automated approach for detecting abnormal lung sounds, which is better than the existing methods that typically have been developed and evaluated using a small and non-diverse dataset. We propose a machine learning based approach for detecting crackles in lung sounds recorded using a stethoscope in a large health survey. Our method is trained and evaluated using 209 files with crackles classified by expert listeners. Our analysis pipeline is based on features extracted from small windows in audio files. We evaluated several feature extraction methods and classifiers. We evaluated the pipeline using a training set of 175 crackle windows and 208 normal windows. We did 100 cycles of cross validation where we shuffled training sets between cycles. In all cases, the division between training and evaluation was 70%-30%. We found and evaluated a 5-dimensional vector with four features from the time domain and one from the spectrum domain. We evaluated several classifiers and found SVM with a Radial Basis Function Kernel to perform best. Our approach had a precision of 86% and recall of 84% for classifying a crackle in a window, which is more accurate than found in studies of health personnel. The low-dimensional feature vector makes the SVM very fast. The model can be trained on a regular computer in 1.44 seconds, and 319 crackles can be classified in 1.08 seconds. Our approach detects and visualizes individual crackles in recorded audio files. It is accurate, fast, and has low resource requirements. It can be used to train health personnel or as part of a smartphone application for Bluetooth stethoscopes.

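Motivation and Objectives

  • Develop an automated, scalable method for detecting crackles in lung sounds recorded with a stethoscope in a large health survey.
  • Overcome the limitation of earlier methods, which were typically developed and evaluated on small, non-diverse datasets.
  • Design a fast, low-resource solution suited to training health personnel or to integration into a smartphone application for Bluetooth stethoscopes.
  • Evaluate feature extraction methods and classifiers through cross-validation to find the best-performing combination for crackle detection.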

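Proposed Method

  • Each lung sound audio file is divided into small, fixed-length windows, and each window is analyzed separately.
  • A 5-dimensional feature vector is extracted from each window: four time-domain features (such as zero-crossing rate, energy, entropy, and kurtosis) and one frequency-domain feature (such as spectral centroid).
  • Evaluation uses 100 cycles of cross-validation with a 70%-30% training-evaluation split, reshuffling the training sets between cycles.
  • An SVM with a Radial Basis Function (RBF) kernel was selected as the best classifier after comparison with several alternatives.
  • The pipeline was evaluated using a set of 175 crackle windows and 208 normal windows.
  • The system detects and visualizes individual crackles in recorded audio files and is fast enough to support real-time inference.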
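
The windowing and feature-extraction steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the authors' implementation: the five features are the examples named in this review (the abstract does not list them), and the window size, sample rate, histogram bin count, and all function names are assumptions chosen for the demo.

```python
import math

def split_windows(samples, window_size):
    """Split a signal into consecutive, non-overlapping fixed-length windows."""
    return [samples[i:i + window_size]
            for i in range(0, len(samples) - window_size + 1, window_size)]

def zero_crossing_rate(w):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(w, w[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(w) - 1)

def energy(w):
    """Mean squared amplitude of the window."""
    return sum(x * x for x in w) / len(w)

def amplitude_entropy(w, bins=8):
    """Shannon entropy (bits) of an amplitude histogram; bin count is an assumption."""
    lo, hi = min(w), max(w)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for x in w:
        idx = min(int((x - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    return -sum((c / len(w)) * math.log2(c / len(w)) for c in counts if c)

def kurtosis(w):
    """Fourth standardized moment (non-excess kurtosis)."""
    mean = sum(w) / len(w)
    var = sum((x - mean) ** 2 for x in w) / len(w)
    if var == 0:
        return 0.0
    return sum((x - mean) ** 4 for x in w) / (len(w) * var ** 2)

def spectral_centroid(w, sample_rate):
    """Magnitude-weighted mean frequency, using a naive O(n^2) DFT."""
    n = len(w)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(w))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(w))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0

def feature_vector(w, sample_rate):
    """Four time-domain features followed by one frequency-domain feature."""
    return [zero_crossing_rate(w), energy(w), amplitude_entropy(w),
            kurtosis(w), spectral_centroid(w, sample_rate)]

# Demo on a synthetic 500 Hz tone (hypothetical stand-in for a recording).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(1024)]
windows = split_windows(signal, 256)
features = [feature_vector(w, sample_rate) for w in windows]
```

Each entry of `features` is one 5-dimensional vector per window, which is the shape a classifier would consume. For real recordings, an FFT routine from a numerical library would replace the naive DFT shown here.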

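Experimental Results

Research Questions

  • RQ1: Which combination of time-domain and frequency-domain features works best for detecting crackles in lung sounds?
  • RQ2: How do different classifiers compare at detecting crackles on a large, diverse health survey dataset?
  • RQ3: Can a low-dimensional feature vector deliver accurate, low-resource crackle detection suitable for real-time deployment?
  • RQ4: How do the method's precision and recall on expert-labeled data compare with existing approaches?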

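Key Findings

  • The proposed 5-dimensional feature vector (four time-domain features and one frequency-domain feature) performed best for crackle detection.
  • The SVM with an RBF kernel performed best among the classifiers evaluated, reaching 86% precision and 84% recall for classifying a crackle in a window.
  • The model trains in 1.44 seconds on a regular computer.
  • The system classifies 319 crackles in 1.08 seconds, which supports its use in real-time applications.
  • Its accuracy exceeds that reported in studies of health personnel, suggesting clinical potential.
  • The low-dimensional feature space and fast classifier make deployment feasible on low-resource devices such as smartphones.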
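
To make the evaluation protocol behind these numbers concrete, the sketch below runs repeated shuffled 70%-30% splits and averages precision and recall over the cycles. It is an illustration under stated assumptions, not the paper's code: the data is synthetic, the function names are hypothetical, and a simple nearest-centroid classifier stands in for the RBF-kernel SVM so that the example needs only the standard library. In practice an RBF SVM such as scikit-learn's `SVC(kernel="rbf")` could be substituted for the stand-in.

```python
import random

def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def fit_nearest_centroid(X, y):
    """Stand-in classifier (NOT the paper's SVM): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
    return predict

def repeated_split_eval(X, y, cycles=100, train_fraction=0.7, seed=0):
    """Average precision and recall over repeated shuffled train/evaluation splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_train = int(round(train_fraction * len(X)))
    precisions, recalls = [], []
    for _ in range(cycles):
        rng.shuffle(indices)  # reshuffle before every cycle
        train, test = indices[:n_train], indices[n_train:]
        predict = fit_nearest_centroid([X[i] for i in train], [y[i] for i in train])
        p, r = precision_recall([y[i] for i in test], [predict(X[i]) for i in test])
        precisions.append(p)
        recalls.append(r)
    return sum(precisions) / cycles, sum(recalls) / cycles

# Synthetic 5-dimensional data with the window counts from the abstract:
# 175 "crackle" windows (label 1) and 208 "normal" windows (label 0).
data_rng = random.Random(42)
X = ([[data_rng.gauss(3.0, 1.0) for _ in range(5)] for _ in range(175)]
     + [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(208)])
y = [1] * 175 + [0] * 208
mean_precision, mean_recall = repeated_split_eval(X, y)
```

The scores this produces reflect only the synthetic, well-separated data and say nothing about the paper's results. The point is the protocol: every cycle reshuffles, trains on 70% of the windows, and scores the remaining 30%.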


This review was generated by AI and reviewed by human editors.