QUICK REVIEW

[Paper Review] Reinforced Genetic Algorithm for Structure-based Drug Design

Tianfan Fu, Wenhao Gao|arXiv (Cornell University)|Nov 28, 2022

Computational Drug Discovery Methods | Cited by 27

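One-Sentence Summary

RGA combines reinforcement learning with a genetic algorithm guided by 3D target-ligand structures to improve docking optimization in structure-based drug design; pre-training across multiple targets and knowledge transfer improve both performance and robustness.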

ABSTRACT

Structure-based drug design (SBDD) aims to discover drug candidates by finding molecules (ligands) that bind tightly to a disease-related protein (targets), which is the primary approach to computer-aided drug discovery. Recently, applying deep generative models for three-dimensional (3D) molecular design conditioned on protein pockets to solve SBDD has attracted much attention, but their formulation as probabilistic modeling often leads to unsatisfactory optimization performance. On the other hand, traditional combinatorial optimization methods such as genetic algorithms (GA) have demonstrated state-of-the-art performance in various molecular optimization tasks. However, they do not utilize protein target structure to inform design steps but rely on a random-walk-like exploration, which leads to unstable performance and no knowledge transfer between different tasks despite the similar binding physics. To achieve a more stable and efficient SBDD, we propose Reinforced Genetic Algorithm (RGA) that uses neural models to prioritize the profitable design steps and suppress random-walk behavior. The neural models take the 3D structure of the targets and ligands as inputs and are pre-trained using native complex structures to utilize the knowledge of the shared binding physics from different targets and then fine-tuned during optimization. We conduct thorough empirical studies on optimizing binding affinity to various disease targets and show that RGA outperforms the baselines in terms of docking scores and is more robust to random initializations. The ablation study also indicates that the training on different targets helps improve performance by leveraging the shared underlying physics of the binding processes. The code is available at https://github.com/futianfan/reinforced-genetic-algorithm.

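Motivation and Objectives

Address the inefficiency and instability of conventional GAs in structure-based drug design by incorporating protein structure information.
Recast the evolutionary process as an evolutionary Markov decision process (EMDP) so that reinforcement learning can be applied.
Develop target-ligand equivariant neural networks that use 3D structural data to guide crossover and mutation.
Pre-train on native protein-ligand complexes and transfer knowledge across targets to capture shared binding physics.
Demonstrate improved docking scores and robustness on multiple disease targets, including the SARS-CoV-2 main protease.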

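Proposed Method

Model the GA as an evolutionary Markov decision process (EMDP) with a population-level state and a docking-score reward.
Guide crossover with two policy networks (two-step parent selection) and mutation with two policy networks (parent selection and reaction selection).
Process target-ligand complexes with E(3)-equivariant neural networks (ENNs) that output action probabilities.
Pre-train the ENNs on a 3D binding-affinity task using CrossDocked2020 data to capture shared binding physics, then fine-tune them during optimization.
Optimize the policies with policy gradient (REINFORCE) to maximize the expected docking-score improvement.
Use AutoDock Vina as the docking oracle, and design mutations as chemically meaningful unimolecular and bimolecular reactions to ensure synthesizability.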
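The loop described above (population state, policy-guided parent selection, reward from score improvement, REINFORCE update) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not taken from the paper's code: molecules are bit strings, `toy_oracle` stands in for AutoDock Vina, the policy is a softmax over one scalar feature with a single parameter `theta` instead of an E(3)-equivariant network, the two parents are sampled independently, and mutation is a random bit flip in place of a learned reaction choice.

```python
import math
import random


def toy_oracle(mol):
    """Hypothetical stand-in for a docking oracle: lower is better."""
    return -float(sum(mol))


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def select(theta, feats, rng):
    """Sample an index from a softmax policy with logits theta * feature.

    Also returns d/d(theta) of log pi(index), which REINFORCE needs.
    """
    probs = softmax([theta * f for f in feats])
    idx = rng.choices(range(len(feats)), weights=probs, k=1)[0]
    grad = feats[idx] - sum(p * f for p, f in zip(probs, feats))
    return idx, grad


def run(generations=30, pop_size=8, n_bits=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_bits))
                  for _ in range(pop_size)]
    theta = 0.0  # single policy parameter (assumption: the paper uses neural nets)
    best_history = []
    for _ in range(generations):
        scores = [toy_oracle(m) for m in population]
        feats = [-s / n_bits for s in scores]  # in [0, 1], higher = better parent
        children = []
        for _ in range(pop_size):
            # Crossover: two policy-guided parent choices (sampled independently here).
            i, g1 = select(theta, feats, rng)
            j, g2 = select(theta, feats, rng)
            cut = rng.randrange(1, n_bits)
            child = list(population[i][:cut] + population[j][cut:])
            # Mutation: a random bit flip (not policy-guided in this sketch).
            k = rng.randrange(n_bits)
            child[k] = 1 - child[k]
            child = tuple(child)
            # Reward: how much the child improves on the better parent.
            reward = min(scores[i], scores[j]) - toy_oracle(child)
            # REINFORCE step without a baseline: reward * grad log pi(actions).
            theta += lr * reward * (g1 + g2)
            children.append(child)
        # Elitist survival: keep the pop_size best of parents and children.
        population = sorted(population + children, key=toy_oracle)[:pop_size]
        best_history.append(toy_oracle(population[0]))
    return theta, best_history


if __name__ == "__main__":
    theta, history = run()
    print("learned theta:", theta)
    print("best score per generation:", history)
```

Because survival is elitist, the best score can never get worse from one generation to the next; whether `theta` ends up favoring strong parents depends on the toy reward and is not a claim about RGA itself.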

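Experimental Results

Research Questions

RQ1: Can an RL-guided GA outperform baseline structure-based design methods at docking-score optimization?
RQ2: Does using target structure information reduce randomness and improve robustness across repeated trials?
RQ3: Do pre-training on native complexes and knowledge transfer across targets improve SBDD performance?
RQ4: Compared with RL methods that make only local mutations, how does adding long-range crossover affect optimization?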

Key Findings

Method	TOP-100	TOP-10	TOP-1	Nov	Div	QED	SA
Screening	-9.351 ± 0.643	-10.433 ± 0.563	-11.400 ± 0.630	0.0 ± 0.0%	0.858 ± 0.005	0.678 ± 0.022	2.689 ± 0.077
MARS	-7.758 ± 0.612	-8.875 ± 0.711	-9.257 ± 0.791	100.0 ± 0.0%	0.877 ± 0.001	0.709 ± 0.008	2.450 ± 0.034
MolDQN	-6.287 ± 0.396	-7.043 ± 0.487	-7.501 ± 0.402	100.0 ± 0.0%	0.877 ± 0.009	0.170 ± 0.024	5.833 ± 0.182
GEGL	-9.064 ± 0.920	-9.91 ± 0.990	-10.45 ± 1.040	100.0 ± 0.0%	0.853 ± 0.003	0.643 ± 0.014	2.99 ± 0.054
REINVENT	-10.181 ± 0.441	-11.234 ± 0.632	-12.010 ± 0.833	100.0 ± 0.0%	0.857 ± 0.011	0.445 ± 0.058	2.596 ± 0.116
RationaleRL	-9.233 ± 0.920	-10.834 ± 0.856	-11.642 ± 1.102	100.0 ± 0.0%	0.717 ± 0.025	0.315 ± 0.023	2.919 ± 0.126
JTVAE	-9.291 ± 0.702	-10.242 ± 0.839	-10.963 ± 1.133	98.0 ± 0.027%	0.867 ± 0.001	0.593 ± 0.035	3.222 ± 0.136
Gen3D	-8.686 ± 0.450	-9.285 ± 0.584	-9.832 ± 0.324	100.0 ± 0.0%	0.870 ± 0.006	0.701 ± 0.016	3.450 ± 0.120
GA+D	-7.487 ± 0.757	-8.305 ± 0.803	-8.760 ± 0.796	99.2 ± 0.011%	0.834 ± 0.035	0.405 ± 0.024	5.024 ± 0.164
Graph-GA	-10.848 ± 0.860	-11.702 ± 0.930	-12.302 ± 1.010	100.0 ± 0.0%	0.811 ± 0.037	0.456 ± 0.067	3.503 ± 0.367
Autogrow 4.0	-11.371 ± 0.398	-12.213 ± 0.623	-12.474 ± 0.839	100.0 ± 0.0%	0.852 ± 0.011	0.748 ± 0.022	2.497 ± 0.049
RGA (ours)	-11.867 ± 0.170	-12.564 ± 0.287	-12.869 ± 0.473	100.0 ± 0.0%	0.857 ± 0.020	0.742 ± 0.036	2.473 ± 0.048
RGA - pretrain	-11.443 ± 0.219	-12.424 ± 0.386	-12.435 ± 0.654	100.0 ± 0.0%	0.854 ± 0.035	0.750 ± 0.034	2.494 ± 0.043
RGA - KT	-11.434 ± 0.169	-12.437 ± 0.354	-12.502 ± 0.603	100.0 ± 0.0%	0.853 ± 0.028	0.738 ± 0.034	2.501 ± 0.050

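RGA achieves the best TOP-100, TOP-10, and TOP-1 docking scores on the evaluated targets.
RGA shows lower variance than the baselines across five independent runs, indicating that random-walk behavior is suppressed.
Pre-training and knowledge transfer across diverse targets further improve top-k docking scores.
Compared with Autogrow 4.0, RGA delivers better docking performance thanks to learned action guidance and longer-range navigation.
Long-range crossover outperforms RL methods that focus only on local modifications, showing the benefit of structure-informed search.
The method maintains competitive QED and SA scores, indicating reasonable drug-likeness and synthesizability.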
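To make the comparison with the strongest baseline concrete, the short sketch below recomputes two quantities from the table: how much lower (better) RGA's mean docking score is than Autogrow 4.0's, and the ratio of their standard deviations. The numbers are copied from the table; the `compare` helper and its field names are hypothetical and exist only for this illustration.

```python
# (mean, std) of docking scores copied from the results table; lower is better.
results = {
    "Autogrow 4.0": {
        "TOP-100": (-11.371, 0.398),
        "TOP-10": (-12.213, 0.623),
        "TOP-1": (-12.474, 0.839),
    },
    "RGA": {
        "TOP-100": (-11.867, 0.170),
        "TOP-10": (-12.564, 0.287),
        "TOP-1": (-12.869, 0.473),
    },
}


def compare(baseline, method, table=results):
    """Return per-metric mean gain and std ratio of `method` versus `baseline`."""
    out = {}
    for metric, (b_mean, b_std) in table[baseline].items():
        m_mean, m_std = table[method][metric]
        out[metric] = {
            # Positive gain means the method's mean score is lower (better).
            "mean_gain": round(b_mean - m_mean, 3),
            # A ratio below 1 means the method varies less across runs.
            "std_ratio": round(m_std / b_std, 3),
        }
    return out


if __name__ == "__main__":
    for metric, stats in compare("Autogrow 4.0", "RGA").items():
        print(metric, stats)
```

On these table values, RGA's mean is lower than Autogrow 4.0's by roughly 0.35 to 0.50 on all three metrics, and its standard deviation is between about 43% and 56% of the baseline's.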

This review was generated by AI and reviewed by human editors.