QUICK REVIEW

[Paper Review] Guiding Generative Protein Language Models with Reinforcement Learning

Filippo Stocco, Maria Artigues-Lleixa|arXiv (Cornell University)|Dec 17, 2024

Topic Modeling|Cited by 6

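One-Sentence Summary

The paper proposes a reinforcement learning framework that iteratively guides generative protein language models toward user-defined objectives, enabling the design of proteins with desired properties such as binding affinity. Applied to EGFR binders, it achieves a 26-fold increase in binding affinity in two iterations.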

ABSTRACT

Protein language models (pLMs) have demonstrated success at generating functional proteins across vast sequence spaces but lack the ability to design high-fitness variants on demand. Here, we iteratively guide pLMs toward user-defined objectives by applying reinforcement learning (RL). We demonstrate that RL can steer pLMs toward various protein properties, such as topologies or binding affinities, in a few iterations through long evolutionary trajectories. We apply our framework to the design of epidermal growth factor receptor (EGFR) binders, achieving a 26-fold increase in binding affinity in two iterations.

Motivation and Objectives

Motivate the need for controllable generation of functional proteins from large sequence spaces.
Introduce an RL-based framework to guide protein language models toward specific design objectives.
Demonstrate the ability to steer generated sequences toward properties such as topology and binding affinity.
Showcase practical design outcomes on epidermal growth factor receptor (EGFR) binders.

Proposed Method

Apply reinforcement learning to guide generative protein language models toward user-defined objectives.
Leverage RL to optimize for protein properties across long evolutionary trajectories in a few iterations.
Demonstrate steering of pLMs toward target attributes such as binding affinity and topology.
Evaluate iterative design progress through generated protein sequences and properties.
Present empirical results on EGFR binder design showing large affinity gains.
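
The generate, score, update loop outlined above can be sketched in a few lines. This summary does not say which RL algorithm the authors use, so the example below is only a toy illustration under stated assumptions: a per-position categorical policy stands in for the pLM, plain REINFORCE with a normalized mean baseline stands in for the update rule, and `toy_reward` (fraction of hydrophobic residues) stands in for a user-defined objective such as predicted binding affinity. All function names and hyperparameters are hypothetical.

```python
import math
import random
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng):
    """Sample one sequence from the toy policy; returns residue indices."""
    n = len(AMINO_ACIDS)
    return [rng.choices(range(n), weights=softmax(pos))[0] for pos in logits]


def decode(indices):
    """Convert residue indices to a one-letter amino acid string."""
    return "".join(AMINO_ACIDS[i] for i in indices)


def toy_reward(indices):
    """Hypothetical objective: fraction of hydrophobic residues."""
    hydrophobic = set("AILMFVW")
    return sum(AMINO_ACIDS[i] in hydrophobic for i in indices) / len(indices)


def rl_guide(length=12, iterations=30, batch_size=64, lr=10.0, seed=0):
    """Toy REINFORCE loop (an assumption, not the paper's algorithm)."""
    rng = random.Random(seed)
    # Uniform starting policy; a real setup would start from a pretrained pLM.
    logits = [[0.0] * len(AMINO_ACIDS) for _ in range(length)]
    history = []
    for _ in range(iterations):
        # 1. Generate a batch of candidate sequences from the current policy.
        batch = [sample_sequence(logits, rng) for _ in range(batch_size)]
        # 2. Score each candidate with the user-defined objective.
        rewards = [toy_reward(seq) for seq in batch]
        baseline = sum(rewards) / batch_size
        std = max(statistics.pstdev(rewards), 1e-8)
        history.append(baseline)
        # 3. Update the policy: raise the log-probability of above-average
        #    sequences and lower it for below-average ones.
        probs = [softmax(pos) for pos in logits]  # pre-update policy
        for seq, r in zip(batch, rewards):
            advantage = (r - baseline) / std / batch_size
            for pos, chosen in enumerate(seq):
                for a in range(len(AMINO_ACIDS)):
                    grad_log_prob = (1.0 if a == chosen else 0.0) - probs[pos][a]
                    logits[pos][a] += lr * advantage * grad_log_prob
    return logits, history
```

Calling `rl_guide()` returns the trained logits and the mean batch reward per iteration; the mean should rise as the policy concentrates on residues the reward favors. In a realistic setting the reward would come from an oracle such as a structure or affinity predictor, and the policy would be a fine-tuned language model rather than independent per-position distributions.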

Experimental Results

Research Questions

RQ1: Can reinforcement learning effectively steer generative protein language models toward predefined design objectives?
RQ2: How many iterations and what trajectory are required to achieve high-fitness protein variants?
RQ3: What property targets (e.g., binding affinity, topology) are amenable to RL-guided design?
RQ4: How does RL-guided design perform on a concrete target such as EGFR binders?

Key Findings

RL can direct generative protein language models toward chosen properties within a few iterations.
The framework moves sequences along long evolutionary trajectories toward the design objective.
Applied to EGFR binders, the method achieves a 26-fold increase in binding affinity over two iterations.
The approach demonstrates adaptability to multiple protein properties beyond affinity.

This review was generated by AI and reviewed by human editors.