QUICK REVIEW

[Paper Review] Reinforced Genetic Algorithm for Structure-based Drug Design

Tianfan Fu, Wenhao Gao|arXiv (Cornell University)|Nov 28, 2022

Computational Drug Discovery Methods | Cited by 27

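One-line summary

RGA (Reinforced Genetic Algorithm) uses reinforcement learning to guide a genetic algorithm with 3D target-ligand structures, improving docking-based optimization in structure-based drug design; pre-training and knowledge transfer across targets improve performance and robustness.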

ABSTRACT

Structure-based drug design (SBDD) aims to discover drug candidates by finding molecules (ligands) that bind tightly to a disease-related protein (targets), which is the primary approach to computer-aided drug discovery. Recently, applying deep generative models for three-dimensional (3D) molecular design conditioned on protein pockets to solve SBDD has attracted much attention, but their formulation as probabilistic modeling often leads to unsatisfactory optimization performance. On the other hand, traditional combinatorial optimization methods such as genetic algorithms (GA) have demonstrated state-of-the-art performance in various molecular optimization tasks. However, they do not utilize protein target structure to inform design steps but rely on a random-walk-like exploration, which leads to unstable performance and no knowledge transfer between different tasks despite the similar binding physics. To achieve a more stable and efficient SBDD, we propose Reinforced Genetic Algorithm (RGA) that uses neural models to prioritize the profitable design steps and suppress random-walk behavior. The neural models take the 3D structure of the targets and ligands as inputs and are pre-trained using native complex structures to utilize the knowledge of the shared binding physics from different targets and then fine-tuned during optimization. We conduct thorough empirical studies on optimizing binding affinity to various disease targets and show that RGA outperforms the baselines in terms of docking scores and is more robust to random initializations. The ablation study also indicates that the training on different targets helps improve performance by leveraging the shared underlying physics of the binding processes. The code is available at https://github.com/futianfan/reinforced-genetic-algorithm.

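Motivation and objectives

Address the inefficiency and instability of conventional GAs in structure-based drug design by incorporating protein structure information.
Reformulate the evolutionary process as an evolutionary Markov decision process (EMDP) so that reinforcement learning can be applied.
Develop target-ligand equivariant neural networks that use 3D structural data to guide crossover and mutation.
Pre-train the models on native protein-ligand complexes so that knowledge transfers across targets and the shared binding physics is captured.
Demonstrate improved docking scores and robustness on multiple disease targets, including the SARS-CoV-2 main protease.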

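Proposed method

Model the GA as an EMDP with population-level states and docking-score-based rewards.
Use two policy networks to guide crossover (two-step parent selection) and two policy networks to guide mutation (parent selection and reaction selection).
Process the target-ligand complex with E(3)-equivariant neural networks (ENNs) that output action probabilities.
Pre-train the ENNs on a 3D binding affinity task using CrossDocked2020 data to capture the shared binding physics, then fine-tune them during optimization.
Optimize the policies with policy gradient (REINFORCE) to maximize the expected improvement in docking score.
Use AutoDock Vina as the docking oracle, and design mutations as chemically meaningful unimolecular and bimolecular reactions to help ensure synthesizability.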
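
To make the interplay between the evolutionary loop and the policy-gradient update more concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: candidates are plain floats rather than molecules, `toy_oracle` stands in for AutoDock Vina (lower is better), a linear-softmax policy over hand-made `features` replaces the equivariant networks, Gaussian noise replaces reaction-based mutation, and crossover is omitted. It only shows how a parent-selection policy could be updated with REINFORCE using the docking-score improvement as reward.

```python
import math
import random


def toy_oracle(x: float) -> float:
    """Hypothetical stand-in for a docking oracle; lower is better."""
    return (x - 3.0) ** 2 - 10.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def features(x: float):
    """Hypothetical hand-made features (RGA uses a 3D equivariant network)."""
    return [x, x * x]


def select_parent(population, theta, rng):
    """Sample a parent from a softmax policy; also return grad of log-prob."""
    feats = [features(x) for x in population]
    logits = [sum(t * f for t, f in zip(theta, fx)) for fx in feats]
    probs = softmax(logits)
    idx = rng.choices(range(len(population)), weights=probs, k=1)[0]
    # For a linear-softmax policy: grad log pi(idx) = f(idx) - E_pi[f]
    expected = [sum(p * fx[j] for p, fx in zip(probs, feats))
                for j in range(len(theta))]
    grad = [feats[idx][j] - expected[j] for j in range(len(theta))]
    return idx, grad


def run_rga_sketch(generations=30, pop_size=20, offspring=20,
                   lr=0.001, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-5.0, 5.0) for _ in range(pop_size)]
    theta = [0.0, 0.0]          # policy parameters
    best_history = []           # best oracle score after each generation
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            idx, grad = select_parent(population, theta, rng)
            parent = population[idx]
            child = parent + rng.gauss(0.0, 0.5)   # mutation stand-in
            # Reward: score improvement (positive when the child scores lower).
            reward = toy_oracle(parent) - toy_oracle(child)
            # REINFORCE update: theta += lr * reward * grad log pi(action)
            theta = [t + lr * reward * g for t, g in zip(theta, grad)]
            children.append(child)
        # Elitist selection: keep the pop_size best-scoring candidates.
        population = sorted(population + children, key=toy_oracle)[:pop_size]
        best_history.append(toy_oracle(population[0]))
    return population, theta, best_history
```

Calling `run_rga_sketch()` returns the final population, the learned policy parameters, and the best score per generation. Because selection here is elitist, the best score can never get worse between generations; that property belongs to this sketch and is not a claim about RGA itself.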

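Experimental results

Research questions

RQ1: Does the RL-guided GA outperform baseline structure-based design methods at docking score optimization?
RQ2: Does exploiting target structure information reduce randomness and improve robustness across multiple runs?
RQ3: Do pre-training on native complexes and knowledge transfer across targets improve SBDD performance?
RQ4: How does incorporating long-range crossover affect optimization compared with RL methods that rely only on local mutations?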

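Key findings

RGA achieves the best TOP-100, TOP-10, and TOP-1 docking scores across the evaluated targets.
RGA shows reduced variance over 5 independent runs, indicating that random-walk behavior is suppressed.
Knowledge transfer and pre-training on diverse targets further improve top-k docking scores.
Compared with Autogrow 4.0, RGA delivers better docking performance thanks to learned action guidance and long-range navigation.
Long-range crossover navigation outperforms RL methods that focus on local modifications, showing the benefit of structure-informed search.
The approach maintains competitive QED and SA scores, suggesting acceptable drug-likeness and synthesizability.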
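
As a rough illustration of how top-k docking metrics and run-to-run robustness could be summarized, here is a small sketch. It assumes that TOP-k means the average of the k best docking scores in a run and that lower (more negative) scores are better, as with Vina-style scores; the function names `top_k_mean` and `summarize_runs` are hypothetical and do not come from the paper's code.

```python
from statistics import mean, stdev


def top_k_mean(scores, k):
    """Average of the k best docking scores, assuming lower is better."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    return mean(sorted(scores)[:k])


def summarize_runs(runs, k):
    """Mean and sample standard deviation of TOP-k across independent runs."""
    per_run = [top_k_mean(run, k) for run in runs]
    spread = stdev(per_run) if len(per_run) > 1 else 0.0
    return mean(per_run), spread
```

Passing one list of docking scores per independent run to `summarize_runs` gives a mean and a spread; a smaller spread across runs is the kind of evidence that would support the robustness finding above.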

This review was generated by AI and checked by a human editor.