QUICK REVIEW

[Paper Review] A Multi-task Large Reasoning Model for Molecular Science

Pengfei Liu, Shuang Ge | arXiv (Cornell University) | Mar 13, 2026

Machine Learning in Materials Science | Cited by 0

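One-Sentence Summary

The paper proposes a multi-task large reasoning model with a multi-specialist architecture and chain-of-thought reasoning. It uses reinforcement learning to learn data-efficiently and shows strong multi-task performance on molecular tasks.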

ABSTRACT

Advancements in artificial intelligence for molecular science are necessitating a paradigm shift from purely data-driven predictions to knowledge-guided computational reasoning. Existing molecular models are predominantly proprietary, lacking general molecular intelligence and generalizability. This underscores the necessity for computational methods that can effectively integrate scientific logic with deep learning architectures. Here we introduce a multi-task large reasoning model designed to emulate the cognitive processes of molecular scientists through structured reasoning and reflection. Our approach incorporates multi-specialist modules to provide versatile molecular expertise and a chain-of-thought (CoT) framework enhanced by reinforcement learning infused with molecular knowledge, enabling structured and reflective reasoning. Systematic evaluations across 10 molecular tasks and 47 metrics demonstrate that our model achieves an average 50.3% improvement over the base architecture, outperforming over 20 state-of-the-art baselines, including ultra-large-parameter foundation models, despite using significantly fewer training data and computational resources. This validates that embedding explicit reasoning mechanisms enables high-efficiency learning, allowing smaller-scale models to surpass massive counterparts in both efficacy and interpretability. The practical utility of this computational framework was validated through a case study on the design of central nervous system (CNS) drug candidates, illustrating its capacity to bridge data-driven and knowledge-integrated approaches for intelligent molecular design.

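Motivation and Objectives

Integrate chemical knowledge into deep learning so that models can handle molecular tasks beyond pure prediction.
Develop a multi-specialist, task-adaptive framework that embeds chemical logic into chain-of-thought (CoT) reasoning.
Achieve data-efficient learning by combining data synergy and expert synergy with reinforcement learning.
Deliver strong multi-task performance on ten molecular tasks with limited training data and resources.
Demonstrate practical utility through a CNS drug design case study that connects generation, prediction, and synthesis.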

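Proposed Method

Build a multi-expert layer inside a pretrained LLM (DeepSeek-7B base), with a router that coordinates eight expert groups by task type.
Train the prediction experts on a 93K instruction dataset and the reasoning (CoT) experts on a 3.5K high-quality CoT dataset.
Use low-rank adaptation (LoRA) for parameter-efficient updates.
Apply reinforcement learning with task-specific molecular-science rewards to align reasoning with chemical validity.
Train in three steps: representation learning via instruction fine-tuning on 74.5K examples, CoT fine-tuning on 3.6K examples, and knowledge-aligned RL.
Exploit data synergy (joint training on related tasks) and expert synergy (prediction and reasoning experts working together) to improve reasoning.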
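
The router-plus-LoRA idea above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: hard routing by an explicit task label, the class names `LoRAExpert` and `TaskRoutedLayer`, the two task names, and all dimensions are assumptions made for illustration. The only part taken from the summary is the general shape: a frozen base weight, low-rank experts, and a router that picks an expert by task type.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRAExpert:
    """Low-rank update (alpha / rank) * B @ A added on top of a frozen weight."""

    def __init__(self, d_in, d_out, rank, alpha, rng):
        self.scale = alpha / rank
        # Usual LoRA initialisation: A small and random, B zero,
        # so a fresh expert leaves the base output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        """Return this expert's additive contribution for input x."""
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]

    def num_params(self):
        """Count trainable parameters in A and B."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


class TaskRoutedLayer:
    """Frozen base projection plus one LoRA expert chosen by task label."""

    def __init__(self, base_weight, experts_by_task):
        self.W = base_weight            # frozen; never updated in this sketch
        self.experts = experts_by_task  # hypothetical task name -> LoRAExpert

    def forward(self, x, task):
        base = matvec(self.W, x)
        delta = self.experts[task].delta(x)
        return [b + d for b, d in zip(base, delta)]


rng = random.Random(0)
d_in, d_out, rank = 8, 8, 2
W = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
# Task names are placeholders, not the paper's actual expert groups.
experts = {
    "property_prediction": LoRAExpert(d_in, d_out, rank, alpha=4.0, rng=rng),
    "cot_reasoning": LoRAExpert(d_in, d_out, rank, alpha=4.0, rng=rng),
}
layer = TaskRoutedLayer(W, experts)
x = [1.0] * d_in
print(layer.forward(x, "property_prediction") == matvec(W, x))
print(experts["cot_reasoning"].num_params(), "trainable vs", d_in * d_out, "frozen")
```

Each expert trains only `rank * (d_in + d_out)` parameters, which is where the parameter efficiency comes from. A learned, soft router is equally plausible; the summary does not say which form the paper uses.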

Figure 1: Overview of the reasoning framework. (a) Current LLMs in multi-task molecular science tasks, showcasing their core capabilities and inherent challenges. (b) Molecular multi-task reasoning framework, detailing the process from user query to inference, featuring tokenization and embedding, s

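Experimental Results

Research Questions

RQ1: Can a multi-task molecular reasoning model outperform existing state-of-the-art baselines across diverse tasks by embedding chemical knowledge into chain-of-thought reasoning?
RQ2: How do data synergy and expert synergy affect multi-task molecular performance and reasoning alignment?
RQ3: How does reinforcement learning with knowledge-guided rewards affect the consistency between prediction experts and reasoning experts?
RQ4: In a CNS drug design scenario, can a smaller, knowledge-infused model achieve both high accuracy and interpretable reasoning?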
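
To make the knowledge-guided rewards in RQ3 concrete, here is a minimal sketch of what a task-specific reward could look like. Everything in it is an assumption: the `<think>`/`<answer>` tag format, the 0.2/0.8 weights, the two task names, and `toy_is_valid_smiles`, which is only a crude stand-in for a real chemical validity check done with a cheminformatics toolkit. The summary does not describe the paper's actual reward design.

```python
import re


def toy_is_valid_smiles(smiles):
    """Toy stand-in for a real validity check: non-empty, allowed characters
    only, and balanced parentheses and square brackets."""
    if not smiles or not re.fullmatch(r"[A-Za-z0-9@+\-\[\]\(\)=#$/\\%.:]+", smiles):
        return False
    depth_round = depth_square = 0
    for ch in smiles:
        if ch == "(":
            depth_round += 1
        elif ch == ")":
            depth_round -= 1
        elif ch == "[":
            depth_square += 1
        elif ch == "]":
            depth_square -= 1
        if depth_round < 0 or depth_square < 0:
            return False
    return depth_round == 0 and depth_square == 0


def parse_response(text):
    """Split a response into (reasoning, answer) using hypothetical tags."""
    pattern = r"\s*<think>(.*?)</think>\s*<answer>(.*?)</answer>\s*"
    match = re.fullmatch(pattern, text, re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def reward(text, task, target=None, is_valid=toy_is_valid_smiles):
    """Score one model response: a format term plus a task-specific term."""
    parsed = parse_response(text)
    if parsed is None:
        return 0.0  # no reward without the expected reasoning/answer structure
    _reasoning, answer = parsed
    score = 0.2  # hypothetical weight for following the format
    if task == "generation":
        # Reward chemically well-formed output (toy check by default).
        if is_valid(answer):
            score += 0.8
    elif task == "regression":
        # Reward decays smoothly with absolute error against the target.
        try:
            error = abs(float(answer) - target)
        except ValueError:
            return score
        score += 0.8 / (1.0 + error)
    return score


good = "<think>Ethanol is a small polar molecule.</think><answer>CCO</answer>"
broken = "<think>Open a branch.</think><answer>C(C</answer>"
print(reward(good, "generation"), reward(broken, "generation"))
```

Passing `is_valid` as a parameter keeps the reward logic separate from the chemistry check, so the toy function can be swapped for a proper parser without touching the scoring code.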

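Key Findings

Average improvement of 50.3% over the base architecture across 10 molecular tasks.
Outperforms more than 20 state-of-the-art baselines, including ultra-large-parameter models, while using less training data and fewer resources.
Improves on the strong multi-task model LLaSMol by nearly 6% on task metrics.
Demonstrates robust, interpretable reasoning through chain-of-thought outputs and the CNS drug design case study.
Combining data and expert synergy with CoT RL significantly improves performance over instruction-only and CoT-only variants.
On the lipophilicity task, the model trails the baselines slightly, which suggests some limits to specialization.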
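
An averaged figure such as the 50.3% improvement above depends on how per-metric gains are aggregated. The sketch below shows one plausible way, a mean of signed relative improvements that flips the sign for lower-is-better metrics. This aggregation rule is an assumption, since the summary does not state the paper's formula, and the numbers are made up for illustration.

```python
def relative_improvement(base, ours, higher_is_better=True):
    """Signed relative change versus the base model, as a percentage."""
    if base == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    change = (ours - base) / abs(base)
    return 100.0 * (change if higher_is_better else -change)


def mean_relative_improvement(rows):
    """Average the per-metric gains; rows are (base, ours, higher_is_better)."""
    gains = [relative_improvement(b, o, hib) for b, o, hib in rows]
    return sum(gains) / len(gains)


# Made-up numbers for illustration only; not results from the paper.
rows = [
    (0.50, 0.75, True),   # accuracy-like metric that rises by half
    (2.00, 1.00, False),  # error-like metric (lower is better) that halves
    (0.80, 0.88, True),   # accuracy-like metric with a small gain
]
print(round(mean_relative_improvement(rows), 1))
```

A mean of relative gains is sensitive to metrics with small baselines, so a single metric that starts near zero can dominate the average.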

Figure 2: Comprehensive evaluation of model performance and synergy. (a) Overall performance against baselines, comparing our model with over ten representative LLMs across core metrics as detailed in (b), with all radar plot metrics normalized. (b) Detailed metric values across the ten tasks. (c) A


This review was generated by AI and reviewed by human editors.