QUICK REVIEW

[Paper Review] Guiding Generative Protein Language Models with Reinforcement Learning

Filippo Stocco, Maria Artigues-Lleixa | arXiv (Cornell University) | Dec 17, 2024
Topic Modeling | Cited by 6
One-Sentence Summary

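This paper proposes a reinforcement learning framework that iteratively guides generative protein language models toward user-defined objectives, enabling the design of proteins with desired properties such as binding affinity. Applied to EGFR binders, it achieves a 26-fold increase in binding affinity in two iterations.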

ABSTRACT

Protein language models (pLMs) have demonstrated success at generating functional proteins across vast sequence spaces but lack the ability to design high-fitness variants on demand. Here, we iteratively guide pLMs toward user-defined objectives by applying reinforcement learning (RL). We demonstrate that RL can steer pLMs toward various protein properties, such as topologies or binding affinities, in a few iterations through long evolutionary trajectories. We apply our framework to the design of epidermal growth factor receptor (EGFR) binders, achieving a 26-fold increase in binding affinity in two iterations.

Motivation and Objectives

  • Motivate the need for controllable generation of functional proteins from large sequence spaces.
  • Introduce an RL-based framework to guide protein language models toward specific design objectives.
  • Demonstrate the ability to steer generated sequences toward properties such as topology and binding affinity.
  • Showcase practical design outcomes on epidermal growth factor receptor (EGFR) binders.

Proposed Method

  • Apply reinforcement learning to guide generative protein language models toward user-defined objectives.
  • Leverage RL to optimize for protein properties across long evolutionary trajectories in a few iterations.
  • Demonstrate steering of pLMs toward target attributes such as binding affinity and topology.
  • Evaluate iterative design progress through generated protein sequences and properties.
  • Present empirical results on EGFR binder design showing large affinity gains.
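The iterative loop described above (sample sequences, score them against an objective, update the generator, repeat) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the summary does not say which RL algorithm the authors use, so plain REINFORCE with a mean-reward baseline is swapped in; the "policy" is an independent per-position table of logits rather than a real protein language model; and `toy_reward`, `rl_iteration`, `run`, and the target string are hypothetical stand-ins for a real scoring oracle such as a predicted binding affinity.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(seq, target):
    """Hypothetical oracle: fraction of positions that match a target string."""
    return sum(AMINO_ACIDS[i] == t for i, t in zip(seq, target)) / len(target)


def rl_iteration(logits, target, rng, batch_size=64, lr=20.0):
    """One REINFORCE update on the toy policy; returns the batch mean reward."""
    probs = [softmax(row) for row in logits]
    indices = range(len(AMINO_ACIDS))
    # Sample a batch of sequences, one residue index per position.
    batch = [
        [rng.choices(indices, weights=p)[0] for p in probs]
        for _ in range(batch_size)
    ]
    rewards = [toy_reward(seq, target) for seq in batch]
    baseline = sum(rewards) / batch_size
    for seq, reward in zip(batch, rewards):
        advantage = (reward - baseline) / batch_size
        for pos, residue in enumerate(seq):
            for k in indices:
                # Gradient of log softmax w.r.t. logit k at this position.
                grad = (1.0 if k == residue else 0.0) - probs[pos][k]
                logits[pos][k] += lr * advantage * grad
    return baseline


def run(target="MKTAYIAK", iterations=50, seed=0):
    """Run the loop from a uniform policy; returns (reward history, logits)."""
    rng = random.Random(seed)
    logits = [[0.0] * len(AMINO_ACIDS) for _ in target]
    history = [rl_iteration(logits, target, rng) for _ in range(iterations)]
    return history, logits


if __name__ == "__main__":
    history, _ = run()
    print(f"mean reward, first iteration: {history[0]:.3f}")
    print(f"mean reward, last iteration:  {history[-1]:.3f}")
```

In a realistic setting the logits table would be replaced by the parameters of a pretrained pLM and the reward by an experimental or computational measurement; the structure of the loop is the only part this sketch is meant to convey.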

Experimental Results

Research Questions

  • RQ1: Can reinforcement learning effectively steer generative protein language models toward predefined design objectives?
  • RQ2: How many iterations and what trajectory are required to achieve high-fitness protein variants?
  • RQ3: What property targets (e.g., binding affinity, topology) are amenable to RL-guided design?
  • RQ4: How does RL-guided design perform on a concrete target such as EGFR binders?

Key Findings

  • RL can direct generative protein language models toward chosen properties within a few iterations.
  • The framework carries designs through long evolutionary trajectories while improving the chosen objective.
  • Applied to EGFR binders, the method achieves a 26-fold increase in binding affinity over two iterations.
  • The approach demonstrates adaptability to multiple protein properties beyond affinity.
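To make the "26-fold" figure concrete, the short sketch below shows one common way a fold improvement in binding affinity is computed, as a ratio of dissociation constants. This is an assumption for illustration only: the summary does not state which quantity the authors measured, the K_D values are hypothetical, and the even per-iteration split is a back-of-the-envelope calculation, not a reported result.

```python
def fold_improvement(kd_before_nm, kd_after_nm):
    """Fold improvement in affinity as a K_D ratio (lower K_D = tighter binding)."""
    if kd_before_nm <= 0 or kd_after_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_before_nm / kd_after_nm


def even_per_iteration_fold(total_fold, iterations):
    """Per-iteration fold gain IF the total gain were split evenly (an assumption)."""
    if total_fold <= 0 or iterations < 1:
        raise ValueError("total_fold must be positive and iterations >= 1")
    return total_fold ** (1.0 / iterations)


if __name__ == "__main__":
    # Hypothetical K_D values in nM, chosen only so that the ratio equals 26.
    print(fold_improvement(260.0, 10.0))
    # Geometric-mean gain per iteration for a 26-fold total over two iterations.
    print(even_per_iteration_fold(26.0, 2))
```

Under the even-split assumption, a 26-fold total over two iterations corresponds to roughly a 5-fold gain per iteration, since the square root of 26 is about 5.1.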

This review was generated by AI and reviewed by human editors.