QUICK REVIEW

[Paper Review] Structure-based Drug Design with Equivariant Diffusion Models

Arne Schneuing, Charles B. Harris | arXiv (Cornell University) | Oct 24, 2022
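Computational Drug Discovery Methods | Cited by 98
One-sentence summary

DiffSBDD is an SE(3)-equivariant, 3D-conditional diffusion model that generates novel ligands conditioned on a protein pocket. It also supports inpainting and joint pocket-ligand generation, which makes it suitable for structure-based drug design.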

ABSTRACT

Structure-based drug design (SBDD) aims to design small-molecule ligands that bind with high affinity and specificity to pre-determined protein targets. Generative SBDD methods leverage structural data of drugs in complex with their protein targets to propose new drug candidates. These approaches typically place one atom at a time in an autoregressive fashion using the binding pocket as well as previously added ligand atoms as context in each step. Recently a surge of diffusion generative models has entered this domain which hold promise to capture the statistical properties of natural ligands more faithfully. However, most existing methods focus exclusively on bottom-up de novo design of compounds or tackle other drug development challenges with task-specific models. The latter requires curation of suitable datasets, careful engineering of the models and retraining from scratch for each task. Here we show how a single pre-trained diffusion model can be applied to a broader range of problems, such as off-the-shelf property optimization, explicit negative design, and partial molecular design with inpainting. We formulate SBDD as a 3D-conditional generation problem and present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets. Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively. Furthermore, we show how additional constraints can be used to improve the generated drug candidates according to a variety of computational metrics. These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods, and are able to incorporate additional design objectives and constraints changing nothing but the sampling strategy.

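Motivation and objectives

  • Advance structure-based drug design by generating drug-like ligands conditioned on 3D protein pockets.
  • Address the limitations of step-by-step (autoregressive) generation with a joint SE(3)-equivariant diffusion framework.
  • Enable flexible tasks such as inpainting, scaffold hopping, and property-based optimization without retraining.
  • Validate performance on benchmark datasets with realistic protein-ligand settings.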

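Proposed method

  • Formulates SBDD as a 3D-conditional diffusion task and trains an SE(3)-equivariant denoising diffusion probabilistic model (DDPM) to reconstruct ligand coordinates and atom types.
  • Represents the protein and the ligand as 3D graphs processed by SE(3)-equivariant graph neural networks (EGNNs).
  • Achieves pocket-conditional sampling by keeping the protein pocket nodes fixed during denoising, so that generation is conditioned on the actual 3D context.
  • Provides an alternative joint generative model that learns p(zL, zP), with conditional sampling obtained through a modified inference procedure.
  • Implements inpainting by substituting the fixed substructure during sampling, which enables scaffold hopping and fragment growing without retraining.
  • Introduces a symmetry-breaking modification to the coordinate update so that the model handles mirror images (reflections) and generates consistent 3D structures.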
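
The pocket-conditional sampling idea in the list above can be illustrated with a minimal NumPy sketch. It is not the DiffSBDD implementation: it uses a generic DDPM ancestral sampler, handles coordinates only (no atom types), and replaces the trained equivariant network with a hypothetical placeholder, `toy_noise_predictor`. The linear noise schedule and all function and parameter names are assumptions made for illustration. The point it shows is that the pocket array is read as context at every step but never updated.

```python
import numpy as np

def toy_noise_predictor(z_lig, pocket, t):
    """Hypothetical placeholder for a trained equivariant denoiser.

    It treats the ligand's offset from the pocket centroid as the
    predicted noise, which simply pulls atoms toward the pocket.
    """
    return z_lig - pocket.mean(axis=0)

def sample_ligand(pocket, n_atoms, n_steps=50, seed=0):
    """Generic DDPM ancestral sampling of ligand coordinates.

    The pocket coordinates are used as context only and stay fixed.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.05, n_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from Gaussian noise centred on the pocket.
    z = pocket.mean(axis=0) + rng.normal(size=(n_atoms, 3))
    for t in reversed(range(n_steps)):
        eps_hat = toy_noise_predictor(z, pocket, t)
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.normal(size=z.shape) if t > 0 else np.zeros_like(z)
        z = mean + np.sqrt(betas[t]) * noise  # only ligand nodes are updated
    return z

pocket = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
ligand = sample_ligand(pocket, n_atoms=5)
print(ligand.shape)  # (5, 3)
```

In a real model the placeholder would be an equivariant network that also predicts atom-type features; the fixed-context loop structure is the part this sketch is meant to convey.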

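Experimental results

Research questions

  • RQ1: Can an SE(3)-equivariant diffusion model generate high-affinity, drug-like ligands conditioned on a specific protein pocket?
  • RQ2: Does jointly generating pocket and ligand improve docking scores and diversity compared with the pocket-conditional approach?
  • RQ3: Can diffusion-based inpainting enable flexible scaffold hopping and fragment-based design without retraining on task-specific data?
  • RQ4: How does DiffSBDD perform at de novo ligand design and lead optimization on realistic datasets such as CrossDocked and Binding MOAD?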

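Key findings

  • DiffSBDD generates novel and diverse ligands for a given protein pocket, with competitive docking scores.
  • Joint pocket-ligand generation combined with inpainting tends to achieve better docking scores and improved ligand properties than the conditional approach.
  • Full-atom pocket context (as opposed to a Cα-only representation) improves docking scores and agreement with docked poses.
  • Inpainting-based design enables scaffold hopping, scaffold elaboration, fragment merging, and fragment growing without retraining on specialized datasets.
  • The method shows flexibility in lead optimization by navigating the local chemical space around a seed molecule and optimizing properties such as QED and SA.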
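
The inpainting mechanism behind these findings can be sketched as a single masked sampling step. The version below assumes a replacement-style scheme in which atoms of the fixed substructure are overwritten with a forward-noised copy of their known coordinates, while all other atoms take the model's reverse step. The function `inpaint_step`, its arguments, and the `reverse_step` callback are hypothetical names for illustration, not the paper's code.

```python
import numpy as np

def inpaint_step(z_t, x_known, fixed_mask, alpha_bar_prev, reverse_step, rng):
    """One masked denoising step (illustrative sketch).

    z_t            : (N, 3) current noisy coordinates
    x_known        : (N, 3) clean coordinates; only rows under fixed_mask are used
    fixed_mask     : (N,) bool, True for atoms of the fixed substructure
    alpha_bar_prev : cumulative signal level at the next (less noisy) step
    reverse_step   : callable mapping z_t to a denoised proposal (stand-in for the model)
    """
    # Fixed atoms: noise the known coordinates to the matching noise level.
    z_known = (np.sqrt(alpha_bar_prev) * x_known
               + np.sqrt(1.0 - alpha_bar_prev) * rng.normal(size=x_known.shape))
    # Free atoms: take the model's reverse-diffusion proposal.
    z_generated = reverse_step(z_t)
    return np.where(fixed_mask[:, None], z_known, z_generated)

rng = np.random.default_rng(0)
x_known = np.arange(12, dtype=float).reshape(4, 3)
fixed_mask = np.array([True, True, False, False])
z_t = rng.normal(size=(4, 3))
z_next = inpaint_step(z_t, x_known, fixed_mask, 1.0, lambda z: 0.9 * z, rng)
```

With `alpha_bar_prev = 1.0` (the final, noise-free step) the fixed rows of `z_next` equal the known coordinates exactly, so the scaffold or fragment is preserved while the remaining atoms are generated around it.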


This review was generated by AI and reviewed by a human editor.