QUICK REVIEW

[Paper Review] Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation

Jeff Guo, Philippe Schwaller|arXiv (Cornell University)|2024. 05. 27.

Advanced biosensing and bioanalysis techniques | Citations: 6

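One-line summary

Saturn combines the Mamba SSM with Augmented Memory and SMILES augmentation to achieve state-of-the-art sample efficiency for goal-directed molecular design under a fixed oracle budget, outperforming numerous baselines on MPO docking tasks.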

ABSTRACT

Generative molecular design for drug discovery has very recently achieved a wave of experimental validation, with language-based backbones being the most common architectures employed. The most important factor for downstream success is whether an in silico oracle is well correlated with the desired end-point. To this end, current methods use cheaper proxy oracles with higher throughput before evaluating the most promising subset with high-fidelity oracles. The ability to directly optimize high-fidelity oracles would greatly enhance generative design and be expected to improve hit rates. However, current models are not efficient enough to consider such a prospect, exemplifying the sample efficiency problem. In this work, we introduce Saturn, which leverages the Augmented Memory algorithm and demonstrates the first application of the Mamba architecture for generative molecular design. We elucidate how experience replay with data augmentation improves sample efficiency and how Mamba synergistically exploits this mechanism. Saturn outperforms 22 models on multi-parameter optimization tasks relevant to drug discovery and may possess sufficient sample efficiency to consider the prospect of directly optimizing high-fidelity oracles.

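Motivation and objectives

Motivates the need to directly optimize high-fidelity oracles in generative molecular design under a strict sample budget.
Investigates whether Augmented Memory and data augmentation improve sample efficiency across architectures.
Explores the benefits of advanced backbones (RNN, decoder Transformer, Mamba SSM) for MPO tasks.
Demonstrates Saturn's performance gains on docking-based MPO tasks and its transferability to physics-based oracles.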

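Proposed method

Models molecules as SMILES and frames generation as reinforcement learning with a language-model backbone.
Applies Augmented Memory together with SMILES augmentation to steer the agent toward high-reward sequences.
Augments and reuses a replay buffer of top-scoring SMILES, updating the agent with a squared-error loss between the augmented likelihood and the agent likelihood (Eq. 4).
Optionally applies a genetic algorithm that uses the replay buffer as its parent population, refreshing the buffer to mitigate mode collapse.
Caches oracle evaluations to avoid repeating expensive calls, and applies a diversity filter to prevent over-representation of scaffolds.
Evaluates RNN, decoder Transformer, and Mamba backbones under fixed MPO objectives and oracle budgets.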
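
The replay-buffer update can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the REINVENT-style form of the augmented likelihood (prior log-likelihood plus a scaled reward), and the names `ReplayBuffer`, `augmented_memory_loss`, and the default `sigma` are hypothetical. The exact notation of Eq. 4 in the paper may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ReplayBuffer:
    """Keeps the `capacity` highest-reward unique SMILES seen so far (illustrative)."""
    capacity: int = 100
    _items: dict = field(default_factory=dict)  # SMILES -> best reward seen

    def add(self, smiles: str, reward: float) -> None:
        previous = self._items.get(smiles, float("-inf"))
        self._items[smiles] = max(reward, previous)
        if len(self._items) > self.capacity:
            # Evict the lowest-reward entry to stay within capacity.
            worst = min(self._items, key=self._items.get)
            del self._items[worst]

    def top(self):
        """Return (SMILES, reward) pairs sorted from highest to lowest reward."""
        return sorted(self._items.items(), key=lambda kv: kv[1], reverse=True)


def augmented_memory_loss(prior_logp, agent_logp, reward, sigma=128.0):
    """Squared error between the augmented and agent log-likelihoods.

    Assumption: augmented log-likelihood = prior_logp + sigma * reward.
    """
    augmented_logp = prior_logp + sigma * reward
    return (augmented_logp - agent_logp) ** 2
```

In a full training loop, each buffered SMILES would be rewritten into alternative (augmented) SMILES strings for the same molecule, and this loss would be evaluated on the agent's log-likelihood of each augmented form over several augmentation rounds.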

Figure 1: Saturn generative workflow. All generated SMILES and their rewards are stored in the Oracle Cache after canonicalization. A genetic algorithm can be optionally applied using the replay buffer as the parent population. Augmented Memory is used to update the agent numerous times.
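
The Oracle Cache step in the caption can be illustrated as follows. This is a sketch under stated assumptions: `OracleCache` and its parameters are hypothetical names, and `TOY_CANON` is a hand-written stand-in for a real canonicalizer (in practice a cheminformatics toolkit would supply the canonical SMILES). Only cache misses consume oracle budget.

```python
# Toy lookup standing in for a real SMILES canonicalizer (assumption for illustration).
# "OCC" and "CCO" are two SMILES strings for the same molecule, ethanol.
TOY_CANON = {"OCC": "CCO", "CCO": "CCO"}


class OracleCache:
    """Caches oracle rewards keyed by canonical SMILES and tracks budget use."""

    def __init__(self, oracle, canonicalize=lambda s: s, budget=1000):
        self.oracle = oracle              # expensive scoring function: SMILES -> reward
        self.canonicalize = canonicalize  # maps any SMILES to one canonical string
        self.budget = budget              # maximum number of real oracle calls
        self.calls = 0
        self.cache = {}

    def __call__(self, smiles):
        key = self.canonicalize(smiles)
        if key in self.cache:
            return self.cache[key]        # repeat molecule: no budget consumed
        if self.calls >= self.budget:
            raise RuntimeError("oracle budget exhausted")
        self.calls += 1
        reward = self.oracle(key)
        self.cache[key] = reward
        return reward
```

Keying on the canonical form means that different SMILES strings for the same molecule share one cached reward, which matters when augmentation deliberately produces many string forms per molecule.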

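Experimental results

Research questions

RQ1 Do memory-based augmentation and experience replay enable direct optimization of high-fidelity oracles within a fixed budget?
RQ2 How does the backbone architecture (RNN, decoder Transformer, Mamba) affect sample efficiency in goal-directed molecular design?
RQ3 Does the combination of Augmented Memory, data augmentation, and Mamba yield better MPO performance than baselines on docking and physics-based tasks?
RQ4 How well does Saturn's sample efficiency transfer across different biological targets and docking-based objectives?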

Key results

Model	Aug. Rounds	Yield (↑)	IntDiv1 (↑)	Scaffolds (↑)	OB 1 (↓)	OB 10 (↓)	OB 100 (↓)	Repeats
RNN	5	107±58	0.814±0.036	101±54	480±118 (10)	721±109 (10)	916±53 (4)	7±7
RNN	6	121±80	0.791±0.040	107±68	493±214 (10)	713±156 (10)	895±107 (5)	12±11
RNN	7	144±107	0.776±0.026	117±86	467±186 (10)	684±136 (10)	871±116 (6)	38±82
RNN	8	120±95	0.734±0.128	104±85	481±288 (10)	653±145 (8)	854±54 (5)	18±28
RNN	9	141±104	0.783±0.048	112±72	453±211 (10)	654±154 (9)	871±104 (6)	59±95
RNN	10	106±76	0.760±0.056	84±63	510±201 (10)	733±122 (9)	913±64 (5)	43±47
Decoder	5	154±93	0.748±0.052	122±70	439±151 (10)	679±128 (10)	907±92 (8)	90±90
Decoder	6	116±94	0.748±0.039	86±64	517±165 (10)	728±158 (10)	904±126 (5)	73±42
Decoder	7	108±85	0.747±0.051	71±50	510±222 (10)	740±127 (9)	868±48 (4)	126±63
Decoder	8	108±94	0.708±0.109	72±57	538±164 (10)	742±116 (9)	887±87 (4)	150±72
Decoder	9	78±83	0.687±0.116	51±55	614±244 (10)	790±150 (8)	890±62 (3)	242±139
Decoder	10	120±128	0.691±0.042	74±73	663±170 (9)	768±169 (8)	805±65 (4)	344±218
Mamba	5	69±38	0.764±0.052	54±28	542±93 (10)	807±76 (10)	988±17 (3)	178±90
Mamba	6	138±46	0.759±0.039	110±42	456±89 (10)	693±75 (10)	919±36 (7)	286±137
Mamba	7	174±95	0.737±0.059	127±83	427±177 (10)	643±102 (10)	858±77 (7)	395±147
Mamba	8	209±95	0.751±0.030	137±60	461±151 (10)	617±135 (10)	817±71 (8)	482±214
Mamba	9	202±98	0.735±0.032	137±80	389±112 (10)	631±102 (10)	841±92 (8)	518±237
Mamba	10	306±57	0.714±0.035	206±34	387±148 (10)	555±66 (10)	761±58 (10)	1110±636
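
To make the Yield and OB columns concrete, the sketch below computes both from a run's oracle-call history. It rests on assumptions not stated in this review: Yield is taken to be the number of unique molecules scoring above a reward threshold, and Oracle Burden (OB k) the number of oracle calls needed to reach k such molecules. The function name, the strict `>` comparison, and the toy data are all hypothetical. The parenthesized counts in the table appear to be the number of replicate runs that reached k.

```python
def yield_and_oracle_burden(history, threshold, ks=(1, 10, 100)):
    """Compute Yield and Oracle Burden from an ordered oracle-call history.

    history: list of (smiles, reward) tuples, one per oracle call, in call order.
    Returns (yield_count, {k: calls needed to reach k hits, or None if never reached}).
    """
    hits = set()
    burden = {k: None for k in ks}
    for n_calls, (smiles, reward) in enumerate(history, start=1):
        if reward > threshold and smiles not in hits:
            hits.add(smiles)
            for k in ks:
                if burden[k] is None and len(hits) == k:
                    burden[k] = n_calls
    return len(hits), burden


# Toy run: five oracle calls, one of which repeats an earlier molecule.
toy_history = [("A", 0.9), ("B", 0.5), ("C", 0.85), ("A", 0.9), ("D", 0.95)]
toy_yield, toy_burden = yield_and_oracle_burden(toy_history, threshold=0.8, ks=(1, 2, 3))
```

A run that never reaches k hits within the budget leaves `burden[k]` as `None`, which is why a table of this kind needs to report how many replicates succeeded alongside the mean.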

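Saturn, using Mamba with Augmented Memory and SMILES augmentation, achieves strong sample efficiency under a fixed oracle budget, outperforming 22 models on MPO docking tasks.
Augmented Memory concentrates augmented SMILES in high-reward regions and performs larger updates for less likely sequences, which makes learning efficient.
Mamba exhibits "hop-and-locally-explore" behavior: it moves through chemical space in a directed manner and generates locally similar molecules, which improves efficiency.
With 6 or more augmentation rounds, Mamba attains the highest Yield among the three backbones in the table above under a 1,000-oracle-call budget, and at 10 rounds it is also best on every Oracle Burden metric (Yield 306±57 vs. 106±76 for the RNN and 120±128 for the decoder Transformer).
Saturn's sample efficiency transfers to physics-based docking MPO on the DRD2, AChE, and MK2 targets, often outperforming vanilla Augmented Memory, and the GA is shown to recover diversity.
On the Hit/Novel Hit benchmarks against GEAM, Saturn (Saturn-GA) achieves competitive or superior results, in some cases with lower variance, and can find strict-filter hits with fewer oracle calls.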
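
The claim that less likely sequences receive larger updates follows from the gradient of a squared-error loss. The derivation below is a sketch that assumes the REINVENT-style augmented likelihood, with $\sigma$ a reward-scaling constant, $R(x)$ the reward, and $\pi_\theta$ the agent; the symbols are illustrative and may not match the paper's notation.

```latex
\begin{align}
\mathcal{L}(\theta) &= \big(\log \pi_{\text{prior}}(x) + \sigma R(x) - \log \pi_{\theta}(x)\big)^{2} \\
\nabla_{\theta}\mathcal{L}(\theta) &= -2\,\big(\log \pi_{\text{prior}}(x) + \sigma R(x) - \log \pi_{\theta}(x)\big)\,\nabla_{\theta}\log \pi_{\theta}(x)
\end{align}
```

For a fixed prior likelihood and reward, a lower agent log-likelihood $\log \pi_{\theta}(x)$ makes the residual in parentheses larger, so the gradient is scaled up. Augmented SMILES forms of a buffered molecule are typically less likely under the agent than the form it originally sampled, which is consistent with their receiving larger updates.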

Figure 2: a. Average maximum token probability across agent states. Augmentation pushes the agent action distribution towards a delta distribution. b. Augmented Memory (10 augmentation rounds) makes the likelihood of generating SMILES in the buffer more likely. c. Top: On average, augmented forms of

This review was generated by AI and reviewed by a human editor.