QUICK REVIEW

[Paper Review] Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Jiaxuan You, Bowen Liu|arXiv (Cornell University)|Jun 7, 2018

Machine Learning in Materials Science|References 43|Cited by 246

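One-Sentence Summary

GCPN combines graph convolution, reinforcement learning, and adversarial training to generate molecular graphs that optimize target properties while obeying chemical rules, reporting a 61% improvement over state-of-the-art baselines on property optimization and a 184% improvement on constrained property optimization.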

ABSTRACT

Generating novel graph structures that optimize given objectives while obeying some given underlying rules is fundamental for chemistry, biology and social science research. This is especially important in the task of molecular graph generation, whose goal is to discover novel molecules with desired properties such as drug-likeness and synthetic accessibility, while obeying physical laws such as chemical valency. However, designing models to find molecules that optimize desired properties while incorporating highly complex and non-differentiable rules remains to be a challenging task. Here we propose Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules. Experimental results show that GCPN can achieve 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and achieve 184% improvement on the constrained property optimization task.

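Motivation and Objectives

Represent molecules as graphs to enable chemistry-aware generation and validity checking.
Develop an RL-based policy network that builds molecular graphs step by step.
Enforce hard chemical constraints such as valency through the environment, and bring in prior knowledge from known molecules through adversarial training.
Optimize domain-specific molecular properties through guiding rewards.
Provide an end-to-end framework that is trainable with policy gradient.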

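Proposed Method

Represent each molecule directly as a molecular graph with adjacency matrix A, node feature matrix F, and edge types E.
Formulate graph generation as a Markov decision process in which states are intermediate graphs and actions add a bond or a substructure.
Use a graph convolutional network to compute node embeddings on the extended graph G_t ∪ C, where C is the set of candidate substructures (scaffolds) that can be added, for action prediction.
Predict each action with a four-component policy: first node, second node, edge type, and termination decision.
Train with proximal policy optimization (PPO) on a combined reward made up of domain-specific property rewards and an adversarial reward from a discriminator D_φ.
Include expert pretraining to imitate known molecules, and adversarial training to make generated graphs resemble the real molecules in the dataset.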
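The four-component action and the PPO objective listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example probabilities, the plain-sum form of the combined reward, and the use of the raw reward as the advantage are all assumptions made for illustration. The four heads are also treated as independent categorical distributions, which is a simplification.

```python
import math
import random


def sample_action(p_first, p_second, p_edge, p_stop, rng):
    """Sample one index per policy head and return (action, log_prob).

    Each argument is a list of probabilities that sums to 1. Treating the
    heads as independent is a simplification made for this sketch.
    """
    action = []
    log_prob = 0.0
    for probs in (p_first, p_second, p_edge, p_stop):
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        action.append(idx)
        log_prob += math.log(probs[idx])
    return tuple(action), log_prob


def combined_reward(property_reward, adversarial_reward):
    """Assumed form: a plain sum of the property and adversarial terms."""
    return property_reward + adversarial_reward


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for a single sampled action."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


rng = random.Random(0)
action, old_lp = sample_action(
    [0.5, 0.3, 0.2],   # first node (hypothetical probabilities)
    [0.6, 0.4],        # second node
    [0.7, 0.2, 0.1],   # edge type: single, double, triple
    [0.9, 0.1],        # termination: continue, stop
    rng,
)
reward = combined_reward(property_reward=1.5, adversarial_reward=-0.2)
# Pretend the updated policy assigns a slightly higher log-probability.
# Using the raw reward as the advantage is a simplification (no baseline).
objective = ppo_clipped_objective(old_lp + 0.1, old_lp, advantage=reward)
print(action, round(objective, 4))
```

The clipping keeps the probability ratio within [1 - clip_eps, 1 + clip_eps], so a single update cannot move the policy far from the one that generated the sampled action.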

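Experimental Results

Research Questions

RQ1: Can GCPN generate molecules that optimize target properties while obeying valency and other chemical rules?
RQ2: How does graph-based generation combined with RL and adversarial training compare with text-based or non-graph baselines on property optimization and property targeting?
RQ3: Does adversarial training improve the realism and diversity of generated molecules without sacrificing optimization performance?
RQ4: Does the policy learned by GCPN generalize to constrained property optimization consistently across diverse starting molecules?
RQ5: What is the trade-off between exploration (RL) and realism (adversarial/expert training) in molecular graph generation?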

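Key Findings

GCPN improves chemical property optimization by 61% over the best baseline (JT-VAE).
GCPN improves on the baseline by 184% on the constrained property optimization task.
GCPN reaches 100% validity through graph-based valency checks, and adversarial training keeps generated molecules close to realistic ones.
GCPN outperforms ORGAN and JT-VAE on property optimization and property targeting.
GCPN shows a high success rate and diverse generated molecules on property targeting.
Compared with latent-space methods, GCPN extrapolates beyond the training dataset while maintaining performance.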
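The validity result follows from checking valency on the partial graph at every step. A minimal Python sketch of that idea is given here, assuming a hypothetical `MolGraph` class, a small illustrative valence table, and the convention that a rejected action leaves the state unchanged. Formal charges, aromaticity, and explicit hydrogens are ignored.

```python
# Assumed maximum valences for a few common atoms (illustrative subset).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


class MolGraph:
    """Hypothetical partial molecule: atom labels plus bonds with orders."""

    def __init__(self, atoms):
        self.atoms = list(atoms)   # node labels, e.g. ["C", "O"]
        self.bonds = {}            # {(i, j): bond_order} with i < j

    def used_valence(self, i):
        """Sum of bond orders currently attached to atom i."""
        return sum(order for pair, order in self.bonds.items() if i in pair)

    def can_add_bond(self, i, j, order):
        """True if a bond of this order keeps both atoms within valence."""
        n = len(self.atoms)
        if not (0 <= i < n and 0 <= j < n) or i == j:
            return False
        if (min(i, j), max(i, j)) in self.bonds:
            return False
        return all(
            self.used_valence(k) + order <= MAX_VALENCE[self.atoms[k]]
            for k in (i, j)
        )

    def step(self, i, j, order):
        """Apply a bond-addition action; reject it if valency is violated."""
        if not self.can_add_bond(i, j, order):
            return False           # state stays unchanged
        self.bonds[(min(i, j), max(i, j))] = order
        return True


mol = MolGraph(["C", "O", "O", "F"])
print(mol.step(0, 1, 2))   # C=O double bond
print(mol.step(0, 2, 2))   # second C=O double bond, carbon is now saturated
print(mol.step(0, 3, 1))   # C-F would exceed carbon's valence of 4
```

Because only actions that pass the check change the state, every intermediate graph in this sketch satisfies the valence table, which is why validity can be guaranteed by construction rather than learned.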

This review was generated by AI and reviewed by human editors.