QUICK REVIEW

[Paper Digest] Learning Subpocket Prototypes for Generalizable Structure-based Drug Design

Zaixi Zhang, Qi Liu|arXiv (Cornell University)|May 22, 2023
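Computational Drug Discovery Methods | Cited by 8
One-Sentence Summary

DrugGPS learns subpocket prototypes and uses a hierarchical 3D graph transformer with prototype-augmented motif generation to generalize structure-based drug design to unseen pockets, outperforming baselines in out-of-distribution (OOD) settings.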

ABSTRACT

Generating molecules with high binding affinities to target proteins (a.k.a. structure-based drug design) is a fundamental and challenging task in drug discovery. Recently, deep generative models have achieved remarkable success in generating 3D molecules conditioned on the protein pocket. However, most existing methods consider molecular generation for protein pockets independently while neglecting the underlying connections such as subpocket-level similarities. Subpockets are the local protein environments of ligand fragments and pockets with similar subpockets may bind the same molecular fragment (motif) even though their overall structures are different. Therefore, the trained models can hardly generalize to unseen protein pockets in real-world applications. In this paper, we propose a novel method DrugGPS for generalizable structure-based drug design. With the biochemical priors, we propose to learn subpocket prototypes and construct a global interaction graph to model the interactions between subpocket prototypes and molecular motifs. Moreover, a hierarchical graph transformer encoder and motif-based 3D molecule generation scheme are used to improve the model's performance. The experimental results show that our model consistently outperforms baselines in generating realistic drug candidates with high affinities in challenging out-of-distribution settings.

Motivation and Objectives

  • Address the poor generalization of structure-based drug design (SBDD) models to unseen protein pockets.
  • Leverage subpocket-level biochemical priors to build a generalizable model.
  • Develop a hierarchical encoder and a global interaction graph to inform generation.
  • Generate molecules motif-by-motif with prototype-augmented context to improve realism and affinity.

Proposed Method

  • Construct atom- and residue-level graphs to encode pocket-ligand context using a hierarchical 3D graph transformer.
  • Define subpocket prototypes by clustering subpocket embeddings and build a global prototype–motif interaction graph with TF-IDF weighted edges.
  • Introduce a prototype-augmented motif generation pipeline where generation is guided by subpocket embeddings and global interactions.
  • Generate ligands motif-by-motif with focal motif selection, motif attachment prediction, and rotation angle prediction.
  • Train with multi-task losses including focal atom/motif prediction, motif type, attachment, torsion angle (von Mises), and distance initialization.
  • Use a mask-and-recover training strategy over molecular motifs to learn the conditional distribution p(G^mol|G^pro).
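
The TF-IDF edge weighting mentioned in the list above can be illustrated with a minimal sketch. It assumes each observation is a (prototype id, motif) pair, treats motifs as "terms" and prototypes as "documents", and uses the plain `log(N / df)` form of IDF. The function name `tfidf_edges`, the input format, and the exact normalization are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tfidf_edges(pairs):
    """Compute hypothetical TF-IDF weights for prototype-motif edges.

    pairs: iterable of (prototype_id, motif) co-occurrence observations.
    Returns a dict mapping (prototype_id, motif) -> weight.
    """
    # Count how often each motif co-occurs with each prototype.
    counts = defaultdict(Counter)
    for proto, motif in pairs:
        counts[proto][motif] += 1

    n_protos = len(counts)

    # Document frequency: number of prototypes each motif appears with.
    df = Counter()
    for motif_counts in counts.values():
        df.update(motif_counts.keys())

    edges = {}
    for proto, motif_counts in counts.items():
        total = sum(motif_counts.values())
        for motif, c in motif_counts.items():
            tf = c / total                        # term frequency within the prototype
            idf = math.log(n_protos / df[motif])  # rarer motifs get larger weight
            edges[(proto, motif)] = tf * idf
    return edges


# Toy observations (made up for illustration only).
demo_pairs = [
    (0, "benzene"), (0, "benzene"), (0, "amide"),
    (1, "benzene"), (1, "pyridine"),
]
demo_edges = tfidf_edges(demo_pairs)
```

With this IDF form, a motif that co-occurs with every prototype receives weight zero, so edges highlight motifs that are specific to a prototype. A smoothed IDF would keep such edges at a small positive weight instead.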

Experimental Results

Research Questions

  • RQ1: Can subpocket-level prototypes enable generalization to unseen protein pockets in structure-based drug design?
  • RQ2: Does a global interaction graph between subpocket prototypes and molecular motifs improve generation quality and affinity on out-of-distribution pockets?
  • RQ3: How does hierarchical encoding of atom- and residue-level pocket information affect generation realism and drug-likeness?
  • RQ4: Is motif-by-motif generation with focal-motif guidance more efficient and valid than atom-by-atom generation in SBDD?

Key Findings

| Methods | Vina Score (kcal/mol, ↓) | High Affinity (↑) | QED (↑) | SA (↑) | LogP | Lip. (↑) | Sim. Train (↓) | Div. (↑) | Time (↓) |
|---|---|---|---|---|---|---|---|---|---|
| Testset | -7.145 ± 2.24 | - | 0.465 ± 0.25 | 0.736 ± 0.12 | 0.941 ± 2.25 | 4.468 ± 1.54 | - | - | - |
| LiGAN | -6.032 ± 1.89 | 0.194 ± 0.26 | 0.365 ± 0.27 | 0.615 ± 0.20 | -0.015 ± 2.48 | 4.002 ± 0.92 | 0.410 ± 0.22 | 0.667 ± 0.15 | 1819.8 ± 560.7 |
| AR | -6.114 ± 1.66 | 0.235 ± 0.23 | 0.483 ± 0.18 | 0.662 ± 0.19 | 0.210 ± 1.76 | 4.688 ± 0.45 | 0.394 ± 0.21 | 0.650 ± 0.13 | 15986.4 ± 9851.0 |
| GraphBP | -6.745 ± 1.82 | 0.378 ± 0.29 | 0.455 ± 0.19 | 0.710 ± 0.18 | 0.457 ± 2.10 | 4.783 ± 0.34 | 0.378 ± 0.26 | 0.659 ± 0.12 | 1162.8 ± 438.5 |
| Pocket2Mol | -6.869 ± 2.19 | 0.413 ± 0.23 | 0.524 ± 0.24 | 0.726 ± 0.21 | 0.830 ± 2.17 | 4.892 ± 0.22 | 0.364 ± 0.19 | 0.695 ± 0.17 | 2827.3 ± 1456.8 |
| FLAG | -6.956 ± 1.92 | 0.445 ± 0.22 | 0.552 ± 0.20 | 0.737 ± 0.19 | 0.745 ± 2.09 | 4.904 ± 0.14 | 0.388 ± 0.18 | 0.704 ± 0.18 | 1289.1 ± 378.0 |
| DrugGPS | -7.276 ± 2.14 | 0.565 ± 0.23 | 0.613 ± 0.22 | 0.743 ± 0.18 | 0.913 ± 2.15 | 4.917 ± 0.12 | 0.360 ± 0.21 | 0.681 ± 0.15 | 1007.8 ± 554.1 |

  • DrugGPS outperforms baselines (LiGAN, AR, GraphBP, Pocket2Mol, FLAG) on generated molecules with higher binding affinity and drug-likeness under out-of-distribution splits.
  • Under pocket-based clustered splits, DrugGPS maintains high affinity and generates diverse, drug-like molecules with realistic substructures.
  • The global subpocket prototype–motif interaction graph and prototype augmentation improve generation by encoding cross-pocket knowledge.
  • Ablations show that hierarchical encoding and the interaction graph are crucial for performance.
  • All generated molecules are valid (100% validity), owing to pruning of chemically invalid attachments and RDKit checks.


This digest was generated by AI and reviewed by human editors.