QUICK REVIEW

[Paper Review] Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation

Jeff Guo, Philippe Schwaller|arXiv (Cornell University)|May 27, 2024

Advanced biosensing and bioanalysis techniques | Cited by 6

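One-sentence summary

Saturn combines the Mamba state space model (SSM) with Augmented Memory and SMILES augmentation to achieve state-of-the-art sample efficiency for goal-directed molecular design under a fixed oracle budget, outperforming numerous baselines on MPO docking tasks.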

ABSTRACT

Generative molecular design for drug discovery has very recently achieved a wave of experimental validation, with language-based backbones being the most common architectures employed. The most important factor for downstream success is whether an in silico oracle is well correlated with the desired end-point. To this end, current methods use cheaper proxy oracles with higher throughput before evaluating the most promising subset with high-fidelity oracles. The ability to directly optimize high-fidelity oracles would greatly enhance generative design and be expected to improve hit rates. However, current models are not efficient enough to consider such a prospect, exemplifying the sample efficiency problem. In this work, we introduce Saturn, which leverages the Augmented Memory algorithm and demonstrates the first application of the Mamba architecture for generative molecular design. We elucidate how experience replay with data augmentation improves sample efficiency and how Mamba synergistically exploits this mechanism. Saturn outperforms 22 models on multi-parameter optimization tasks relevant to drug discovery and may possess sufficient sample efficiency to consider the prospect of directly optimizing high-fidelity oracles.

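Motivation and objectives

Motivate the need to directly optimize high-fidelity oracles in generative molecular design under a strict sample budget.
Investigate how Augmented Memory with data augmentation improves sample efficiency across architectures.
Explore the advantages of different backbones (RNN, decoder Transformer, Mamba SSM) on MPO tasks.
Demonstrate Saturn's performance gains on docking-based MPO tasks and its transferability to physics-based oracles.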

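Proposed method

Represent molecules as SMILES and frame generation as reinforcement learning on top of a language-model backbone.
Apply Augmented Memory with SMILES augmentation to steer the agent toward high-reward sequences.
Maintain a replay buffer of the top-scoring SMILES, augment and reuse them, and update the agent with a squared-error loss between the augmented likelihood and the agent likelihood (Eq. 4).
Optionally apply a genetic algorithm, with the replay buffer as the parent population, to refresh the buffer and mitigate mode collapse.
Cache oracle evaluations to avoid repeating expensive calls, and apply a diversity filter to prevent over-representation of scaffolds.
Evaluate backbones from RNN to decoder Transformer to Mamba SSM under fixed MPO objectives and oracle budgets.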
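The bookkeeping described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not Saturn's implementation: the class names, the `sigma` scaling factor, and the toy canonicalizer used in testing are all hypothetical, and the loss assumes the REINVENT-style definition of the augmented likelihood (prior log-likelihood plus a scaled reward), which should be checked against Eq. 4 in the paper.

```python
class OracleCache:
    """Stores rewards keyed by canonical SMILES so each molecule is scored once."""

    def __init__(self, oracle, canonicalize):
        self.oracle = oracle              # expensive scoring function (assumed)
        self.canonicalize = canonicalize  # e.g. a cheminformatics canonicalizer (assumed)
        self.rewards = {}
        self.calls = 0                    # number of real oracle calls spent

    def score(self, smiles):
        key = self.canonicalize(smiles)
        if key not in self.rewards:
            self.rewards[key] = self.oracle(key)
            self.calls += 1
        return self.rewards[key]


class ReplayBuffer:
    """Keeps the `capacity` highest-reward unique SMILES seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}

    def add(self, smiles, reward):
        self.items[smiles] = reward
        if len(self.items) > self.capacity:
            # Evict the lowest-reward entry to stay within capacity.
            worst = min(self.items, key=self.items.get)
            del self.items[worst]

    def top(self):
        # Highest reward first.
        return sorted(self.items.items(), key=lambda kv: kv[1], reverse=True)


def squared_error_loss(prior_logp, agent_logp, reward, sigma):
    """Squared gap between the augmented and agent log-likelihoods.

    Assumption: augmented log-likelihood = prior_logp + sigma * reward.
    """
    augmented_logp = prior_logp + sigma * reward
    return (augmented_logp - agent_logp) ** 2
```

In an Augmented Memory round, each buffered SMILES would be rewritten into an alternative (randomized) SMILES form, re-scored for free through the cache, and passed through the loss again, so one oracle call supports several agent updates.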

Figure 1: Saturn generative workflow. All generated SMILES and their rewards are stored in the Oracle Cache after canonicalization. A genetic algorithm can be optionally applied using the replay buffer as the parent population. Augmented Memory is used to update the agent numerous times.

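Experimental results

Research questions

RQ1: Can memory-based augmentation and experience replay enable direct optimization of high-fidelity oracles within a fixed budget?
RQ2: How does the backbone architecture (RNN, decoder Transformer, Mamba) affect sample efficiency in goal-directed molecular design?
RQ3: Does the combination of Augmented Memory, data augmentation, and Mamba outperform baselines on docking and physics-based tasks and achieve better MPO performance?
RQ4: How does Saturn's sample efficiency transfer to different biological targets and docking-based objectives?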

Key findings

Model	Augmentation rounds	Yield (↑)	IntDiv1 (↑)	Scaffolds (↑)	OB 1 (↓)	OB 10 (↓)	OB 100 (↓)	Repeated SMILES
RNN	5	107±58	0.814±0.036	101±54	480±118 (10)	721±109 (10)	916±53 (4)	7±7
RNN	6	121±80	0.791±0.040	107±68	493±214 (10)	713±156 (10)	895±107 (5)	12±11
RNN	7	144±107	0.776±0.026	117±86	467±186 (10)	684±136 (10)	871±116 (6)	38±82
RNN	8	120±95	0.734±0.128	104±85	481±288 (10)	653±145 (8)	854±54 (5)	18±28
RNN	9	141±104	0.783±0.048	112±72	453±211 (10)	654±154 (9)	871±104 (6)	59±95
RNN	10	106±76	0.76±0.056	84±63	510±201 (10)	733±122 (9)	913±64 (5)	43±47
Decoder	5	154±93	0.748±0.052	122±70	439±151 (10)	679±128 (10)	907±92 (8)	90±90
Decoder	6	116±94	0.748±0.039	86±64	517±165 (10)	728±158 (10)	904±126 (5)	73±42
Decoder	7	108±85	0.747±0.051	71±50	510±222 (10)	740±127 (9)	868±48 (4)	126±63
Decoder	8	108±94	0.708±0.109	72±57	538±164 (10)	742±116 (9)	887±87 (4)	150±72
Decoder	9	78±83	0.687±0.116	51±55	614±244 (10)	790±150 (8)	890±62 (3)	242±139
Decoder	10	120±128	0.691±0.042	74±73	663±170 (9)	768±169 (8)	805±65 (4)	344±218
Mamba	5	69±38	0.764±0.052	54±28	542±93 (10)	807±76 (10)	988±17 (3)	178±90
Mamba	6	138±46	0.759±0.039	110±42	456±89 (10)	693±75 (10)	919±36 (7)	286±137
Mamba	7	174±95	0.737±0.059	127±83	427±177 (10)	643±102 (10)	858±77 (7)	395±147
Mamba	8	209±95	0.751±0.030	137±60	461±151 (10)	617±135 (10)	817±71 (8)	482±214
Mamba	9	202±98	0.735±0.032	137±80	389±112 (10)	631±102 (10)	841±92 (8)	518±237
Mamba	10	306±57	0.714±0.035	206±34	387±148 (10)	555±66 (10)	761±58 (10)	1110±636
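The metric columns are not defined in this summary. The sketch below assumes that Yield counts unique molecules whose reward exceeds a threshold, and that OB N (Oracle Burden) is the number of oracle calls spent when the N-th such molecule is found. The function name, the `threshold` parameter, and the input layout are illustrative, not taken from the paper.

```python
def yield_and_oracle_burden(history, threshold, n_values=(1, 10, 100)):
    """Compute Yield and Oracle Burden from a scoring history.

    history: list of (canonical_smiles, reward) in the order they were scored.
    Returns (yield_count, {n: call index at which the n-th unique hit appeared,
    or None if fewer than n hits were found}).
    """
    hits = set()
    burden = {n: None for n in n_values}
    for call_index, (smiles, reward) in enumerate(history, start=1):
        # Only unique molecules above the threshold count as hits.
        if reward > threshold and smiles not in hits:
            hits.add(smiles)
            for n in n_values:
                if burden[n] is None and len(hits) == n:
                    burden[n] = call_index
    return len(hits), burden
```

Under these definitions a higher Yield and a lower Oracle Burden both indicate better sample efficiency, which matches the arrows in the table header.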

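Saturn combines Mamba, Augmented Memory, and SMILES augmentation to achieve strong sample efficiency under a fixed oracle budget, and outperforms 22 models on MPO docking tasks.
Augmented Memory raises the likelihood of augmented forms of high-reward SMILES, with larger updates applied to less likely sequences, which makes learning efficient.
Mamba exhibits a "hop-and-locally-explore" behavior: it moves through chemical space directionally and generates locally similar molecules, which improves efficiency.
Among the architectures, Mamba outperforms the RNN and decoder Transformer on Yield and Oracle Burden under a 1,000-call oracle budget, most clearly at higher augmentation rounds (see the table above, where Mamba with 10 rounds gives the best Yield and OB values).
Saturn's sample efficiency transfers to physics-based docking MPO tasks against the DRD2, AChE, and MK2 targets, where it often outperforms the original Augmented Memory; adding the GA can recover diversity.
In the comparison with GEAM on the Hit/Novel Hit benchmark, Saturn (with Saturn-GA) achieves comparable or, in some cases, better results with lower variance, and finds hits that pass strict filters with fewer oracle calls.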
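One way to read the claim that less likely sequences receive larger updates is through the gradient of a squared-error loss. The notation below is an assumption made for illustration: \pi_\theta is the agent, x is an augmented SMILES, and \log \pi_{\text{aug}}(x) is the augmented log-likelihood target, treated as constant with respect to \theta.

```latex
\begin{align}
L(\theta) &= \bigl(\log \pi_{\text{aug}}(x) - \log \pi_\theta(x)\bigr)^2 \\
\nabla_\theta L(\theta) &= -2\,\bigl(\log \pi_{\text{aug}}(x) - \log \pi_\theta(x)\bigr)\,\nabla_\theta \log \pi_\theta(x)
\end{align}
```

The gradient is scaled by the gap between the target and the agent's current log-likelihood. An augmented SMILES form that the agent currently finds unlikely has a larger gap than the form the agent originally sampled, so it contributes a proportionally larger update.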

Figure 2: a. Average maximum token probability across agent states. Augmentation pushes the agent action distribution towards a delta distribution. b. Augmented Memory (10 augmentation rounds) makes the likelihood of generating SMILES in the buffer more likely. c. Top: On average, augmented forms of

This review was generated by AI and reviewed by a human editor.