QUICK REVIEW

[Paper Review] Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search

Binghong Chen, Chengtao Li | arXiv (Cornell University) | Jun 29, 2020

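Asymmetric Hydrogenation and Catalysis | Cited by 42

One-Sentence Summary

Retro* is a neural-guided, A*-like retrosynthetic planning algorithm that uses an AND-OR tree and value estimates learned offline to find high-quality synthetic routes efficiently. It outperforms existing methods on USPTO-based benchmarks.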

ABSTRACT

Retrosynthetic planning is a critical task in organic chemistry which identifies a series of reactions that can lead to the synthesis of a target product. The vast number of possible chemical transformations makes the size of the search space very big, and retrosynthetic planning is challenging even for experienced chemists. However, existing methods either require expensive return estimation by rollout with high variance, or optimize for search speed rather than the quality. In this paper, we propose Retro*, a neural-based A*-like algorithm that finds high-quality synthetic routes efficiently. It maintains the search as an AND-OR tree, and learns a neural search bias with off-policy data. Then guided by this neural network, it performs best-first search efficiently during new planning episodes. Experiments on benchmark USPTO datasets show that, our proposed method outperforms existing state-of-the-art with respect to both the success rate and solution quality, while being more efficient at the same time.

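Motivation and Objectives

Enable efficient, high-quality multi-step retrosynthetic planning in a vast search space.
Develop a neural-guided single-player search over an AND-OR tree that biases expansion toward promising routes.
Learn the value V_m offline from planning data to guide online search and improve both efficiency and solution quality.
Provide a benchmark dataset and metrics for evaluating multi-step retrosynthesis methods without expert judgment.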

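Proposed Method

Represent retrosynthesis as an AND-OR search tree with molecule nodes (OR) and reaction nodes (AND).
Use an A*-like search in which node selection relies on a learned value function V_t(m|T), decomposed into components g_t and h_t.
Define the reaction number rn(·|T) and derive V_t(m|T) from the tree structure to steer expansion toward low-cost routes.
Train V_m from offline planning data using Morgan fingerprints, with a regression objective plus a consistency term, so that it matches route costs.
Expand frontier molecules with a one-step retrosynthesis model B, and update V_t across the whole tree in a bottom-up, cached manner.
Provide a data collection pipeline that generates synthetic routes from USPTO data for training and benchmarking.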
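
The selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class and function names (`Mol`, `Rxn`, `expand`, `rn`, `v_t`, `select_next`) are hypothetical, the reaction costs and V_m values are made-up numbers, and hand-written candidates stand in for the one-step model B. The recursion for rn and the way V_t(m|T) adds ancestor reaction costs to the rn of sibling molecules follow one reading of the paper's definitions, which this summary does not spell out, so treat them as assumptions.

```python
import math


class Mol:
    """OR node: a molecule, solved if any one of its reactions is solved."""

    def __init__(self, name, v_m=0.0, available=False):
        self.name = name
        self.v_m = v_m              # stand-in for the learned estimate V_m
        self.available = available  # purchasable building block (cost 0)
        self.parent = None          # reaction node that produced this molecule
        self.reactions = []         # candidate reactions, filled on expansion
        self.expanded = False


class Rxn:
    """AND node: a reaction, solved only if all of its reactants are solved."""

    def __init__(self, parent, cost, reactants):
        self.parent, self.cost, self.reactants = parent, cost, reactants


def expand(mol, candidates):
    """Attach (cost, reactants) pairs, as a one-step model might propose."""
    mol.expanded = True
    for cost, reactants in candidates:
        rxn = Rxn(mol, cost, reactants)
        for reactant in reactants:
            reactant.parent = rxn
        mol.reactions.append(rxn)


def rn(mol):
    """Assumed recursion: V_m on the frontier, else the cheapest child reaction."""
    if mol.available:
        return 0.0
    if not mol.expanded:
        return mol.v_m
    return min(
        (r.cost + sum(rn(c) for c in r.reactants) for r in mol.reactions),
        default=math.inf,  # expanded with no candidates: dead end
    )


def v_t(mol):
    """Estimated cost of the best route to the target that passes through mol."""
    total, node = rn(mol), mol
    while node.parent is not None:
        rxn = node.parent
        # Add the ancestor reaction cost plus the estimates of sibling molecules.
        total += rxn.cost + sum(rn(s) for s in rxn.reactants if s is not node)
        node = rxn.parent
    return total


def frontier(root):
    """Collect unexpanded, non-purchasable molecules in the tree."""
    stack, found = [root], []
    while stack:
        mol = stack.pop()
        if mol.available:
            continue
        if not mol.expanded:
            found.append(mol)
        for rxn in mol.reactions:
            stack.extend(rxn.reactants)
    return found


def select_next(root):
    """Best-first choice: the frontier molecule with the lowest v_t."""
    return min(frontier(root), key=v_t)


# Toy tree: target T can be made from A + B (cost 1.0) or from C + D (cost 0.5).
target = Mol("T")
a, b = Mol("A", v_m=2.0), Mol("B", available=True)
c, d = Mol("C", v_m=1.0), Mol("D", v_m=3.0)
expand(target, [(1.0, [a, b]), (0.5, [c, d])])
best = select_next(target)
print(best.name, v_t(best))
```

In this toy tree the second reaction is cheaper on its own, but both of its reactants still need work, so the search prefers to expand A. A real planner would cache rn values and update them bottom-up after each expansion rather than recomputing them on every call, as this sketch does.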

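Experimental Results

Research Questions

RQ1: Can a neural-guided single-player AND-OR search efficiently find high-quality retrosynthetic routes?
RQ2: Does learning the molecule cost V_m offline improve search efficiency and solution quality relative to non-learned baselines?
RQ3: On USPTO-based benchmarks, how does Retro* compare with DFPN-E and MCTS-based methods in success rate, route length, and total cost?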

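Key Findings

Retro* achieves an 86.84% success rate on the test set, outperforming competitors including DFPN-E+ and MCTS variants.
Under the same time budget, Retro* solves 31% more test molecules than the second-best method (DFPN-E).
Among the solutions found by Retro*, 50 routes are shorter than the expert routes and 112 have a lower total cost.
The Retro*-0 ablation (V_m = 0) lowers the success rate by about 6 percentage points, showing the benefit of learning.
Combining the learned V_m with MCTS+ and DFPN-E+ improves their performance, indicating that the value function is useful across planners.
Compared with the baselines, the success rate of Retro* rises faster as the time budget (the number of one-step model calls) grows.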

This review was generated by AI and checked by human editors.
