QUICK REVIEW

[Paper Review] Contrast Everything: A Hierarchical Contrastive Framework for Medical Time-Series

Yihe Wang, Yu Han|arXiv (Cornell University)|Oct 21, 2023
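Machine Learning in Healthcare | Cited by 16
One-Sentence Summary

COMET is a self-supervised hierarchical contrastive learning framework that exploits the observation, sample, trial, and patient levels of medical time series to learn robust representations, especially when labels are scarce.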

ABSTRACT

Contrastive representation learning is crucial in medical time series analysis as it alleviates dependency on labor-intensive, domain-specific, and scarce expert annotations. However, existing contrastive learning methods primarily focus on one single data level, which fails to fully exploit the intricate nature of medical time series. To address this issue, we present COMET, an innovative hierarchical framework that leverages data consistencies at all inherent levels in medical time series. Our meticulously designed model systematically captures data consistency from four potential levels: observation, sample, trial, and patient levels. By developing contrastive loss at multiple levels, we can learn effective representations that preserve comprehensive data consistency, maximizing information utilization in a self-supervised manner. We conduct experiments in the challenging patient-independent setting. We compare COMET against six baselines using three diverse datasets, which include ECG signals for myocardial infarction and EEG signals for Alzheimer's and Parkinson's diseases. The results demonstrate that COMET consistently outperforms all baselines, particularly in setup with 10% and 1% labeled data fractions across all datasets. These results underscore the significant impact of our framework in advancing contrastive representation learning techniques for medical time series. The source code is available at https://github.com/DL4mHealth/COMET.

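Motivation and Goals

  • Address label scarcity in medical time-series analysis.
  • Fully exploit the hierarchical structure of medical time series (observation, sample, trial, patient).
  • Develop multi-level contrastive losses and a single, flexible overall training objective.
  • Demonstrate downstream improvements on EEG/ECG datasets in the patient-independent setting.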

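Proposed Method

  • Define four data levels for contrastive learning: observation, sample, trial, and patient.
  • Design four corresponding contrastive blocks, each with level-specific positive/negative pairs.
  • Introduce level-specific losses: L_O (observation), L_S (sample), L_R (trial), L_P (patient).
  • Combine them into a single loss L = λ1 L_O + λ2 L_S + λ3 L_R + λ4 L_P with tunable weights λ.
  • Use a shared encoder G and augmentation strategies to generate positive pairs at the different levels.
  • Enable or disable any level by setting its λ to zero, so the framework adapts flexibly to different datasets.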
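The weighted combination L = λ1 L_O + λ2 L_S + λ3 L_R + λ4 L_P, with a level switched off by a zero weight, can be sketched as below. This is a minimal illustration and not the authors' implementation: the function name `combine_losses`, the level keys, and the use of plain floats in place of tensor losses are all assumptions made for the example.

```python
from typing import Mapping

# Level names are illustrative labels for L_O, L_S, L_R, L_P (assumed, not from the paper's code).
LEVELS = ("observation", "sample", "trial", "patient")


def combine_losses(level_losses: Mapping[str, float],
                   lambdas: Mapping[str, float]) -> float:
    """Return the weighted sum of the per-level losses.

    A level whose weight is missing or zero is treated as disabled,
    so its loss does not need to be computed or supplied.
    """
    total = 0.0
    for level in LEVELS:
        weight = lambdas.get(level, 0.0)
        if weight == 0.0:
            continue  # disabled level: its loss is never read
        total += weight * level_losses[level]
    return total


# Hypothetical loss values with equal weights on all four levels.
losses = {"observation": 0.8, "sample": 0.6, "trial": 0.4, "patient": 0.2}
equal_weights = {level: 0.25 for level in LEVELS}
print(combine_losses(losses, equal_weights))

# A dataset without a trial level: set that weight to zero.
no_trial = {"observation": 0.25, "sample": 0.25, "trial": 0.0, "patient": 0.25}
print(combine_losses(losses, no_trial))
```

In a real training loop the per-level values would be differentiable tensors produced by the four contrastive blocks, and skipping a disabled level also avoids the cost of building its positive and negative pairs.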
Figure 1 : Structure of medical time series. Medical time series commonly have four levels (coarse to fine): patient, trial, sample, and observation. An observation is a single value in univariate time series and a vector in multivariate time series.

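Experimental Results

Research Questions

  • RQ1: Can the hierarchical contrastive framework effectively exploit all levels of medical time series (observation, sample, trial, patient) for self-supervised pre-training?
  • RQ2: With limited labeled data, does COMET improve downstream performance across diseases and modalities (EEG/ECG)?
  • RQ3: How do the lower-level consistencies (observation and sample) interact with the higher-level consistencies (trial and patient) in representation learning?
  • RQ4: Is patient-independent evaluation feasible and useful for verifying cross-subject robustness?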
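The patient-independent evaluation raised in RQ4 means that no patient contributes samples to both the training and the test set. A minimal sketch of such a split follows; the helper `patient_independent_split`, its parameters, and the patient IDs are hypothetical and only illustrate the idea, not the paper's actual data pipeline.

```python
import random
from typing import List, Sequence, Tuple


def patient_independent_split(patient_ids: Sequence[str],
                              test_fraction: float = 0.2,
                              seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample indices so that each patient falls entirely in one set.

    patient_ids[i] is the patient that sample i belongs to.
    """
    unique_patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_patients)

    # Hold out at least one whole patient for testing.
    n_test = max(1, round(len(unique_patients) * test_fraction))
    test_patients = set(unique_patients[:n_test])

    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


# Hypothetical data: five patients with two samples each.
ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5"]
train_idx, test_idx = patient_independent_split(ids, test_fraction=0.2)

train_patients = {ids[i] for i in train_idx}
test_patients = {ids[i] for i in test_idx}
print(train_patients.isdisjoint(test_patients))  # True: no patient is shared
```

Splitting at the patient level rather than the sample level prevents samples from the same person leaking across the split, which would otherwise inflate reported scores.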

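Key Findings

  • COMET consistently outperforms six baselines on three datasets in the patient-independent setting.
  • For EEG-based Alzheimer's disease detection, COMET's F1 score exceeds that of the SOTA baseline by 14% and 13% with 10% and 1% labeled data, respectively.
  • For myocardial infarction detection (ECG), COMET's F1 exceeds SOTA by 0.17% and 2.66% with 10% and 1% labels, respectively.
  • For EEG-based Parkinson's disease diagnosis, COMET's F1 is higher by 2% and 8% with 10% and 1% labels, respectively.
  • Under full fine-tuning with limited labels (10% and 1%), COMET significantly outperforms the best baseline on multiple datasets.
  • COMET demonstrates the stability and effectiveness of its multi-level self-supervised pre-training strategy across diverse medical time-series tasks.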
Figure 2 : Overview of COMET approach. Our COMET model consists of four contrastive blocks, each illustrating the formulation of positive pairs and negative pairs at different data levels. In the observation-level contrastive, an observation $\bm{x}_{i,t}$ and its augmented view $\bm{\widetilde{x}}_


This review was generated by AI and checked by human editors.