[Paper Explained] Drug-Drug Interaction Extraction from Biomedical Text Using Long Short Term Memory Network
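This paper proposes three LSTM-based models (B-LSTM, AB-LSTM, and Joint AB-LSTM) for classifying drug-drug interactions (DDIs) in biomedical text, using word and position embeddings without handcrafted features. The Joint AB-LSTM model achieves state-of-the-art performance on the SemEval-2013 DDI dataset by combining a bidirectional LSTM with an attention mechanism to capture contextual and syntactic dependencies.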
Simultaneous administration of multiple drugs can have synergistic or antagonistic effects, as one drug can affect the activity of others. Synergistic effects lead to improved therapeutic outcomes, whereas antagonistic effects can be life-threatening, may lead to increased healthcare costs, or may even cause death. Thus, identifying unknown drug-drug interactions (DDIs) is an important concern for efficient and effective healthcare. Although multiple resources for DDIs exist, they are often unable to keep pace with the wealth of information available in fast-growing biomedical texts. Most existing methods model DDI extraction from text as a classification problem and rely mainly on handcrafted features, some of which further depend on domain-specific tools. Recently, neural network models using latent features have been shown to perform as well as or better than existing models that depend on handcrafted features. In this paper, we present three models, namely B-LSTM, AB-LSTM, and Joint AB-LSTM, based on the long short-term memory (LSTM) network. All three models use word and position embeddings as latent features and thus do not rely on explicit feature engineering. Furthermore, the use of bidirectional long short-term memory (Bi-LSTM) networks allows implicit feature extraction from the whole sentence. Two of the models, AB-LSTM and Joint AB-LSTM, also apply attentive pooling to the output of the Bi-LSTM layer to assign weights to features. Our experimental results on the SemEval-2013 DDI extraction dataset show that the Joint AB-LSTM model outperforms all existing methods, including those relying on handcrafted features. The other two proposed LSTM models also perform competitively with state-of-the-art methods.
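Motivation and Objectives
- Address the challenge of drug-drug interaction (DDI) extraction at a time when traditional knowledge bases cannot keep pace with the rapid growth of biomedical literature.
- Develop deep learning models that do not depend on handcrafted features or domain-specific tools.
- Improve DDI classification performance through end-to-end learning, using word and position embeddings as latent features.
- Investigate how attention mechanisms and bidirectional LSTMs affect the modeling of long-range dependencies within a sentence.
- Analyze the models' limitations with respect to sentence length, the distance separating the two entities, and repeated drug mentions.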
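Proposed Method
- Use word embeddings and position embeddings as input features, avoiding explicit feature engineering.
- Use a bidirectional LSTM (Bi-LSTM) network to capture contextual representations from both the forward and backward sentence context.
- Introduce an attention mechanism in AB-LSTM and Joint AB-LSTM that assigns dynamic weights to the hidden states, highlighting the words most important for classification.
- Apply a joint attention mechanism in the Joint AB-LSTM model to refine the feature representation between the two drug mentions.
- Train the models end-to-end on the SemEval-2013 DDI dataset with a cross-entropy loss and softmax classification.
- Evaluate the contributions of word embeddings, position embeddings, and pre-trained vectors through ablation experiments.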
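The two input-side ideas in this list, position features and attention over Bi-LSTM hidden states, can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the function names are hypothetical, the position feature is taken to be the signed token distance to each of the two drug mentions, and the attention score uses one common formulation (a learned vector dotted with tanh of the hidden state), which may differ from the paper's exact definition. The hidden states are toy numbers standing in for real Bi-LSTM outputs.

```python
import math


def relative_positions(sentence_length, drug1_idx, drug2_idx):
    """Signed distance of every token to each of the two drug mentions.

    These integer pairs would normally be looked up in a trainable
    position-embedding table (assumed here, not shown).
    """
    return [(i - drug1_idx, i - drug2_idx) for i in range(sentence_length)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pooling(hidden_states, attention_vector):
    """Pool a sequence of hidden states into one vector.

    score_t  = attention_vector . tanh(h_t)
    weights  = softmax(scores)
    pooled   = sum_t weights[t] * h_t
    """
    scores = [
        sum(w * math.tanh(h_k) for w, h_k in zip(attention_vector, h))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(weight * h[k] for weight, h in zip(weights, hidden_states))
        for k in range(dim)
    ]
    return pooled, weights


# Toy usage: 7 tokens, drug mentions at positions 0 and 6.
tokens = ["DRUG_A", "may", "enhance", "the", "effects", "of", "DRUG_B"]
positions = relative_positions(len(tokens), 0, 6)

# Hypothetical 2-dimensional hidden states for three time steps.
toy_hidden = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
pooled, weights = attentive_pooling(toy_hidden, attention_vector=[1.0, 1.0])
print(positions)
print(weights)
```

In a full model the pooled vector would feed a softmax classifier trained with cross-entropy; in this sketch the attention vector is fixed by hand rather than learned.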
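Experimental Results
Research Questions
- RQ1: Can end-to-end LSTM models with learned embeddings outperform traditional methods that rely on handcrafted features for DDI classification?
- RQ2: How much does adding an attention mechanism improve DDI classification performance over a standard Bi-LSTM model?
- RQ3: What are the relative contributions of word embeddings and position embeddings to model performance?
- RQ4: How do sentence length and the distance between drug entities affect prediction accuracy?
- RQ5: What are the models' main failure modes, particularly for noisy or long sentences?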
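A question like RQ4 is typically answered by grouping test examples by the token distance between the two drug mentions and comparing accuracy across groups. The sketch below shows one way to do that. The function name, the tuple layout, the bucket size, and the toy examples are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def accuracy_by_distance(examples, bucket_size=5):
    """Accuracy per entity-distance bucket.

    examples: iterable of (drug1_idx, drug2_idx, gold_label, predicted_label).
    Bucket b covers distances in [b * bucket_size, (b + 1) * bucket_size).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for drug1_idx, drug2_idx, gold, predicted in examples:
        bucket = abs(drug2_idx - drug1_idx) // bucket_size
        total[bucket] += 1
        correct[bucket] += int(gold == predicted)
    return {bucket: correct[bucket] / total[bucket] for bucket in sorted(total)}


# Toy, made-up predictions: two short-distance and two long-distance pairs.
toy_examples = [
    (0, 3, "advice", "advice"),
    (1, 4, "mechanism", "mechanism"),
    (2, 9, "mechanism", "int"),
    (0, 8, "int", "int"),
]
print(accuracy_by_distance(toy_examples))  # {0: 1.0, 1: 0.5}
```

The same grouping works for sentence length by replacing the distance with the token count of each sentence.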
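Key Findings
- The Joint AB-LSTM model achieves state-of-the-art performance on the SemEval-2013 DDI classification benchmark, outperforming all prior methods, including those that use handcrafted features.
- Removing position embeddings, or replacing pre-trained word vectors with random initialization, causes a relative performance drop of 4.6%, underscoring their importance.
- Compared with CNN-based models, the model performs markedly better on long sentences and when the drug entities are far apart.
- The attention mechanism successfully highlights semantically relevant phrases such as "may enhance the effects" and "increase the effects", confirming that the attention weights are learned effectively.
- Incorrect predictions are more frequent for long sentences and for sentences with repeated drug mentions, indicating sensitivity to noise and context length.
- Advice interactions are the easiest to classify, while the model performs worst on the Int (interaction) and Mechanism classes, mainly because of semantic ambiguity and the lack of explicit cues.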
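Because the ablation result above is stated as a relative drop, it can help to see how a relative drop differs from an absolute one. The helper below is a minimal illustration. The scores used are hypothetical and are not figures from the paper, and the summary does not state which metric the 4.6% refers to.

```python
def relative_drop(full_score, ablated_score):
    """Percentage drop of the ablated model relative to the full model."""
    return 100.0 * (full_score - ablated_score) / full_score


# Hypothetical scores: an absolute drop of 4 points (0.80 -> 0.76)
# corresponds to a relative drop of about 5%.
print(round(relative_drop(0.80, 0.76), 2))
```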
This explainer was generated by AI and reviewed by a human editor.