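[Paper Digest] Guiding Generative Protein Language Models with Reinforcement Learning
This paper proposes a reinforcement learning framework that iteratively guides generative protein language models toward user-defined objectives, enabling the design of proteins with desired properties such as binding affinity. Applied to EGFR binders, it achieves a 26-fold increase in affinity in two iterations.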
Protein language models (pLMs) have demonstrated success at generating functional proteins across vast sequence spaces but lack the ability to design high-fitness variants on demand. Here, we iteratively guide pLMs toward user-defined objectives by applying reinforcement learning (RL). We demonstrate that RL can steer pLMs toward various protein properties, such as topologies or binding affinities, in a few iterations through long evolutionary trajectories. We apply our framework to the design of epidermal growth factor receptor (EGFR) binders, achieving a 26-fold increase in binding affinity in two iterations.
Research Motivation and Goals
- Motivate the need for controllable generation of functional proteins from large sequence spaces.
- Introduce an RL-based framework to guide protein language models toward specific design objectives.
- Demonstrate the ability to steer generated sequences toward properties such as topology and binding affinity.
- Showcase practical design outcomes on epidermal growth factor receptor (EGFR) binders.
Proposed Method
- Apply reinforcement learning to guide generative protein language models toward user-defined objectives.
- Leverage RL to optimize for protein properties across long evolutionary trajectories in a few iterations.
- Demonstrate steering of pLMs toward target attributes such as binding affinity and topology.
- Evaluate iterative design progress through generated protein sequences and properties.
- Present empirical results on EGFR binder design showing large affinity gains.
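The iterative loop described above (sample sequences, score them against an objective, update the generator, repeat) can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the summary does not say which RL algorithm the paper uses, so plain REINFORCE with a mean baseline is swapped in; the "model" is a per-position categorical policy rather than a real pLM; and `toy_reward`, the target string `MKTAYIAK`, and all hyperparameters are hypothetical stand-ins for a user-defined objective such as predicted binding affinity.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng):
    """Sample residue indices from a per-position categorical policy (toy stand-in for a pLM)."""
    indices = range(len(AMINO_ACIDS))
    return [rng.choices(indices, weights=softmax(pos))[0] for pos in logits]


def toy_reward(seq, target):
    """Hypothetical objective: fraction of positions that match a target string."""
    return sum(AMINO_ACIDS[a] == t for a, t in zip(seq, target)) / len(target)


def reinforce_round(logits, target, rng, batch_size=64, lr=10.0):
    """One iteration: sample a batch, score it, apply a REINFORCE update in place.

    Returns the mean reward of the batch sampled before the update.
    """
    batch = [sample_sequence(logits, rng) for _ in range(batch_size)]
    rewards = [toy_reward(s, target) for s in batch]
    baseline = sum(rewards) / batch_size      # mean baseline reduces variance
    probs = [softmax(pos) for pos in logits]  # policy probabilities before the update
    for seq, reward in zip(batch, rewards):
        advantage = reward - baseline
        for pos, chosen in enumerate(seq):
            for k in range(len(AMINO_ACIDS)):
                # d log pi(chosen) / d logit_k = 1[k == chosen] - p_k
                grad = (1.0 if k == chosen else 0.0) - probs[pos][k]
                logits[pos][k] += lr * advantage * grad / batch_size
    return baseline


def decode(logits):
    """Return the most probable residue at each position."""
    return "".join(AMINO_ACIDS[max(range(len(pos)), key=pos.__getitem__)] for pos in logits)


def run(target="MKTAYIAK", rounds=150, seed=0):
    """Start from a uniform policy and run several sample-score-update iterations."""
    rng = random.Random(seed)
    logits = [[0.0] * len(AMINO_ACIDS) for _ in target]
    history = [reinforce_round(logits, target, rng) for _ in range(rounds)]
    return logits, history


if __name__ == "__main__":
    final_logits, rewards_per_round = run()
    print("mean reward, first round:", round(rewards_per_round[0], 3))
    print("mean reward, last round:", round(rewards_per_round[-1], 3))
    print("most probable sequence:", decode(final_logits))
```

In a realistic setting the policy would be a pretrained pLM and the reward would come from a structure predictor, an affinity model, or wet-lab measurements, so each round would be far more expensive than in this toy. The sketch is only meant to show the shape of the feedback loop, not the paper's implementation.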
Experimental Results
Research Questions
- RQ1: Can reinforcement learning effectively steer generative protein language models toward predefined design objectives?
- RQ2: How many iterations and what trajectory are required to achieve high-fitness protein variants?
- RQ3: What property targets (e.g., binding affinity, topology) are amenable to RL-guided design?
- RQ4: How does RL-guided design perform on a concrete target such as EGFR binders?
Key Findings
- RL can direct generative protein language models toward chosen properties within a few iterations.
- The framework enables long evolutionary trajectories to improve protein design objectives.
- Applied to EGFR binders, the method achieves a 26-fold increase in binding affinity over two iterations.
- The approach demonstrates adaptability to multiple protein properties beyond affinity.
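To put the 26-fold figure on an energy scale, a rough conversion can help. This is a sketch under assumptions not stated in the summary: that the improvement corresponds to a 26-fold lower dissociation constant $K_D$, and that the temperature is $T = 298\ \mathrm{K}$, with $R = 1.987 \times 10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}$.

$$
\Delta\Delta G = RT \ln\frac{K_D^{\text{new}}}{K_D^{\text{old}}} = -RT \ln 26 \approx -(0.592\ \mathrm{kcal/mol}) \times 3.26 \approx -1.9\ \mathrm{kcal/mol}
$$

Under those assumptions, the reported gain would correspond to roughly 1.9 kcal/mol of additional binding free energy.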
This digest was generated by AI and reviewed by human editors.