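[Paper Review] 3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design
3DLinker is a conditional variational autoencoder that jointly models 2D molecular graphs and 3D coordinates to design linker molecules. It predicts the anchor atoms and generates equivariant 3D structures.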
Deep learning has achieved tremendous success in designing novel chemical compounds with desirable pharmaceutical properties. In this work, we focus on a new type of drug design problem: generating a small "linker" to physically attach two independent molecules with distinct functions. The main computational challenges are: 1) the generation of linkers is conditional on the two given molecules, in contrast to previous works that generate full molecules from scratch; 2) linkers depend heavily on the anchor atoms of the two molecules to be connected, which are not known beforehand; 3) the 3D structures and orientations of the molecules need to be considered to avoid atom clashes, for which equivariance to the E(3) group is necessary. To address these problems, we propose a conditional generative model, named 3DLinker, which predicts anchor atoms and jointly generates linker graphs and their 3D structures based on an E(3) equivariant graph variational autoencoder. To our knowledge, no previous model can accomplish this task. We compare our model with multiple conditional generative models adapted from other molecular design tasks and find that it recovers molecular graphs at a significantly higher rate and, more importantly, accurately predicts the 3D coordinates of all atoms.
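Research Motivation and Goals
- Solve the linker design problem conditioned on two fragments whose anchor atoms are unknown.
- Combine 3D spatial constraints with E(3) equivariance to generate more realistic linkers.
- Develop a conditional VAE that predicts the anchors while jointly generating the linker graph and its 3D coordinates.
- Learn an unsupervised latent representation that can support downstream tasks such as drug-likeness prediction.
Proposed Method
- Introduces a conditional VAE framework that outputs an invariant feature graph and equivariant 3D coordinates for the linker.
- Uses mixed-features message passing (MF-MP) in both the encoder and the decoder to update invariant and equivariant features together.
- Applies a vector ReLU (VN-MLP) inside MF-MP so that the coordinate computation stays equivariant to E(3).
- Predicts the anchor nodes and the linker node types, then generates edges and coordinates sequentially, updating the coordinates along the way.
- Trains with the ELBO objective, using teacher forcing for the anchor, node-type, and edge predictions.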
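The vector ReLU mentioned above can be illustrated with a small NumPy sketch in the style of vector-neuron layers. This is not the authors' implementation: the function name `vn_relu`, the weight matrix `u`, and the channel count are assumptions made for illustration. The sketch only shows why such a nonlinearity commutes with rotations and reflections: the direction `k` is a linear mix of the input vectors, so it transforms the same way the inputs do.

```python
import numpy as np

def vn_relu(v, u, eps=1e-8):
    """Vector-neuron style ReLU (illustrative sketch, not the paper's code).

    v: (C, 3) array holding C vector-valued channels.
    u: (C, C) weight matrix (hypothetical stand-in for a learned layer)
       that mixes channels to give one direction per channel.
    """
    k = u @ v                                        # (C, 3) directions
    k_hat = k / (np.linalg.norm(k, axis=1, keepdims=True) + eps)
    dot = np.sum(v * k_hat, axis=1, keepdims=True)   # signed projection on k_hat
    # Keep v where it points along k; otherwise remove its component along k.
    return np.where(dot >= 0, v, v - dot * k_hat)

def random_orthogonal(rng):
    """Random 3x3 orthogonal matrix (rotation or reflection) from a QR factorization."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q

rng = np.random.default_rng(0)
v = rng.normal(size=(4, 3))
u = rng.normal(size=(4, 4))
q = random_orthogonal(rng)

# Equivariance: transforming the input and then applying the layer
# matches applying the layer and then transforming the output.
out_then_transform = vn_relu(v, u) @ q
transform_then_out = vn_relu(v @ q, u)
equivariant = np.allclose(out_then_transform, transform_then_out, atol=1e-6)
print("equivariant:", equivariant)
```

The check covers the orthogonal part of E(3) only. Translation is usually handled by feeding such layers relative vectors (for example, coordinates minus a reference point), which is assumed here rather than taken from the paper.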
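Experimental Results
Research Questions
- RQ1: Can a single model predict both linker topology and 3D coordinates while remaining equivariant to rotations, translations, and reflections?
- RQ2: How does predicting the anchors on the fly, rather than assuming they are known, affect the recovery rate and 3D accuracy?
- RQ3: Compared with the baselines, do 3D coordinates and equivariant features improve both 2D graph recovery and 3D geometric accuracy?
- RQ4: How do the 3D constraints affect downstream tasks such as drug-likeness prediction?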
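The equivariance property behind RQ1 can be checked numerically. The sketch below uses a generic EGNN-style coordinate update, not the paper's MF-MP layer; `coordinate_update`, the toy linear `phi`, and the weight vector `w` are hypothetical. It tests that f(xQ + t) = f(x)Q + t for an orthogonal matrix Q with determinant -1 (a reflection) and a random translation t.

```python
import numpy as np

def coordinate_update(x, h, w):
    """Generic EGNN-style update: x_i' = x_i + mean_j (x_i - x_j) * phi_ij.

    x: (N, 3) coordinates.
    h: (N, F) invariant node features.
    w: (2F + 1,) weights of a toy linear phi (hypothetical).
    """
    n, f = h.shape
    diff = x[:, None, :] - x[None, :, :]              # (N, N, 3) relative vectors
    d2 = np.sum(diff ** 2, axis=-1, keepdims=True)    # (N, N, 1) squared distances
    hi = np.broadcast_to(h[:, None, :], (n, n, f))
    hj = np.broadcast_to(h[None, :, :], (n, n, f))
    feats = np.concatenate([d2, hi, hj], axis=-1)     # invariant edge inputs
    phi = np.tanh(feats @ w)[..., None]               # (N, N, 1) invariant scalar per edge
    return x + np.mean(diff * phi, axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
h = rng.normal(size=(5, 2))
w = rng.normal(size=(5,))

q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) > 0:
    q[:, 0] *= -1                                     # force a reflection (det = -1)
t = rng.normal(size=(1, 3))

lhs = coordinate_update(x @ q + t, h, w)              # transform, then update
rhs = coordinate_update(x, h, w) @ q + t              # update, then transform
max_err = np.max(np.abs(lhs - rhs))
print("max deviation:", max_err)
```

The update depends on coordinates only through relative vectors and squared distances, which is why translations cancel and orthogonal transforms pass through unchanged.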
Key Findings
| Model | Validity (%) | Recovery (%) | Pass 2D filters (%) | RMSD | Uniqueness (%) | Novelty (%) |
|---|---|---|---|---|---|---|
| 3DLinker (given anchors) | 99.20 | 94.69 | 90.35 | 0.079 | 29.24 | 32.21 |
| 3DLinker | 98.67 | 93.58 | 90.37 | 0.079 | 29.42 | 32.48 |
| DeLinker+ConfVAE | 98.38 | 81.56 | 89.92 | 1.356 | 44.67 | 39.51 |
| GraphAF+ConfVAE | 34.24 | 20.39 | 82.01 | 1.239 | 84.11 | 78.34 |
| GraphVAE+ConfVAE | 15.07 | 0.56 | 85.88 | 1.056 | 85.52 | 61.48 |
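- 3DLinker reaches high validity and recovery when the anchors are given (validity 99.20%, recovery 94.69%) and stays close when they are not (validity 98.67%, recovery 93.58%).
- 3DLinker achieves a low RMSD (0.079), indicating accurate 3D coordinate prediction. In the given-anchor setting, uniqueness is 29.24% and novelty is 32.21%.
- Compared with the baselines (DeLinker+ConfVAE, GraphAF+ConfVAE, GraphVAE+ConfVAE), 3DLinker performs better on 3D geometry prediction and recovery rate, although the 3D constraints may reduce novelty and uniqueness.
- The model is also competitive or better on shape and color similarity (SC_RDKit) and achieves the lowest QED RMSE (0.0833) among the evaluated methods, suggesting that its latent representation is meaningful for drug-likeness.
- Ablation studies show that both the equivariant features and the coordinate updates are essential to performance, most notably for RMSD and recovery rate.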
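For readers unfamiliar with the RMSD column, the sketch below shows two common ways to compute it between predicted and reference atom coordinates. This summary does not say whether the paper aligns structures before measuring, so both variants are given as assumptions: `rmsd` compares coordinates directly, and `kabsch_rmsd` first removes the best rigid rotation and translation. The function names and test data are illustrative.

```python
import numpy as np

def rmsd(pred, ref):
    """Root-mean-square deviation between two (N, 3) coordinate arrays, no alignment."""
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def kabsch_rmsd(pred, ref):
    """RMSD after optimal rigid alignment of pred onto ref (Kabsch algorithm)."""
    p = pred - pred.mean(axis=0)                 # remove translation
    r = ref - ref.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ r)            # SVD of the 3x3 covariance matrix
    d = np.sign(np.linalg.det(u @ vt))           # guard against improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt        # proper rotation with det = +1
    return rmsd(p @ rot, r)

rng = np.random.default_rng(0)
ref = rng.normal(size=(6, 3))

# A rigidly moved copy: rotate 40 degrees about z, then translate.
a = np.deg2rad(40.0)
rot_z = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])
moved = ref @ rot_z + np.array([1.0, -2.0, 0.5])

direct = rmsd(moved, ref)          # nonzero: the frames differ
aligned = kabsch_rmsd(moved, ref)  # near zero: same shape after alignment
print(direct, aligned)
```

When a model generates linker coordinates in the frame of the given fragments, the direct form may be the more relevant measure, since an error in placement relative to the fragments should count against the prediction.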
This review was generated by AI and reviewed by a human editor.