QUICK REVIEW

[Paper Review] MatLLMSearch: Crystal Structure Discovery with Evolution-Guided Large Language Models

Jingru Gan, Peichen Zhong|ArXiv.org|Feb 28, 2025

Machine Learning in Materials Science | Cited by 4

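ONE-SENTENCE SUMMARY

MatLLMSearch uses a pre-trained LLM inside an evolutionary search to generate thermodynamically stable crystal structures without fine-tuning, achieving higher stability and metastability rates than the baselines while reducing training overhead.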

ABSTRACT

Crystal structure generation is fundamental to materials science, enabling the discovery of novel materials with desired properties. While existing approaches leverage Large Language Models (LLMs) through extensive fine-tuning on materials databases, we show that pre-trained LLMs can inherently generate novel and stable crystal structures without additional fine-tuning. Our framework employs LLMs as intelligent proposal agents within an evolutionary pipeline that guides them to perform implicit crossover and mutation operations while maintaining chemical validity. We demonstrate that MatLLMSearch achieves a 78.38% metastable rate validated by machine learning interatomic potentials and 31.7% DFT-verified stability, outperforming specialized models such as CrystalTextLLM. Beyond crystal structure generation, we further demonstrate that our framework adapts to diverse materials design tasks, including crystal structure prediction and multi-objective optimization of properties such as deformation energy and bulk modulus, all without fine-tuning. These results establish our framework as a versatile and effective framework for consistent high-quality materials discovery, offering training-free generation of novel stable structures with reduced overhead and broader accessibility.

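RESEARCH MOTIVATION AND GOALS

Show that pre-trained LLMs can generate thermodynamically stable crystal structures without fine-tuning.
Combine LLM-based reproduction with evolutionary selection to explore the space of crystal structures.
Validate stability predictions with MLIPs and DFT, and compare against state-of-the-art baselines.
Demonstrate adaptability to crystal structure prediction and multi-objective materials design.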

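PROPOSED METHOD

Form an initial population from known stable structures.
Prompt the LLM to perform implicit crossover and mutation to produce offspring.
Relax and evaluate the offspring with CHGNet, computing the decomposition energy E_d and the target properties.
Select the best candidates by objective score from the parents, the offspring, and an optional extra pool.
Run a final DFT (VASP) validation on the most promising structures.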
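
The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: a "structure" is reduced to a single float, `toy_propose` is a hypothetical stand-in for the LLM prompt (it blends two parents and adds noise), and `toy_evaluate` is a hypothetical stand-in for CHGNet relaxation and E_d scoring (lower is better). The function names, population sizes, and iteration count are all invented for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Scored = Tuple[float, float]  # (score, structure stand-in); lower score is better


def toy_evaluate(x: float) -> float:
    """Hypothetical stand-in for relaxation plus E_d evaluation."""
    return (x - 1.0) ** 2


def toy_propose(parents: Sequence[float], n_children: int,
                rng: random.Random) -> List[float]:
    """Hypothetical stand-in for the LLM call: crossover, then mutation."""
    children = []
    for _ in range(n_children):
        a, b = rng.sample(list(parents), 2)  # pick two parents
        child = 0.5 * (a + b)                # "crossover": blend the parents
        child += rng.gauss(0.0, 0.1)         # "mutation": small perturbation
        children.append(child)
    return children


def evolve(initial: Sequence[float],
           evaluate: Callable[[float], float],
           propose: Callable[[Sequence[float], int, random.Random], List[float]],
           n_iterations: int = 10,
           population_size: int = 4,
           n_children: int = 8,
           extra_pool: Sequence[float] = (),
           seed: int = 0) -> List[Scored]:
    """Propose offspring, score them, keep the best of parents + offspring + pool."""
    rng = random.Random(seed)
    population = sorted((evaluate(x), x) for x in initial)[:population_size]
    extra = [(evaluate(x), x) for x in extra_pool]
    for _ in range(n_iterations):
        parents = [x for _, x in population]
        offspring = [(evaluate(x), x) for x in propose(parents, n_children, rng)]
        # Selection over parents, offspring, and the optional extra pool.
        population = sorted(population + offspring + extra)[:population_size]
    return population


final = evolve([3.0, -2.0, 5.0, 0.0], toy_evaluate, toy_propose)
```

Because the parents stay in the selection pool, the best score can never get worse from one iteration to the next. In the pipeline described above, the survivors of such a loop would then go on to DFT validation, which this sketch leaves out.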

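EXPERIMENTAL RESULTS

Research questions

RQ1: Can pre-trained LLMs generate thermodynamically stable crystal structures without fine-tuning?
RQ2: How effective is the LLM-based evolutionary loop at discovering stable and diverse crystal structures?
RQ3: How does an extra reference pool affect metastability and DFT-verified stability?
RQ4: Can the framework be extended to crystal structure prediction and multi-objective optimization of properties?
RQ5: How does MatLLMSearch compare with fine-tuned baselines such as CrystalTextLLM on stability metrics?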

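Key findings

MatLLMSearch achieves a 78.38% metastable rate as assessed with CHGNet and 31.7% DFT-verified stability, outperforming CrystalTextLLM at a comparable model size.
Using a few thousand reference structures, the method surpasses the baselines while avoiding large-scale fine-tuning.
When structures containing f-electrons are excluded from the parents, the metastable rate reaches 78.4% and DFT-verified stability is 27.0%; the share of stable structures without f-electrons rises to 24.6%.
The method supports crystal structure prediction and multi-objective optimization (for example, balancing stability against bulk modulus).
Prompt-guided implicit crossover and mutation let the LLM produce diverse structural motifs across different crystal systems, with almost no extra computational cost beyond stability evaluation.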
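
To make the stability versus bulk modulus trade-off concrete, here is a minimal sketch of one common way to compare candidates on two objectives: keeping only the non-dominated (Pareto-optimal) ones. This is an assumption made for illustration; the summary does not say how the paper combines its objectives, and the candidate names and numbers below are made up.

```python
from typing import List, Tuple

# (name, decomposition energy E_d in eV/atom, bulk modulus in GPa)
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one.

    Lower E_d is better (more stable); higher bulk modulus is better.
    """
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    better = a[1] < b[1] or a[2] > b[2]
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values, for illustration only.
pool = [
    ("A", 0.00, 100.0),
    ("B", 0.05, 180.0),
    ("C", 0.08, 150.0),  # dominated by B: less stable and softer
    ("D", 0.02, 90.0),   # dominated by A: less stable and softer
]
front = pareto_front(pool)
```

Here A and B both survive: A is more stable, B is stiffer, and neither beats the other on both counts. A weighted sum of the two objectives would be a simpler alternative when a single ranking is needed.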

This review was generated by AI and checked by a human editor.