QUICK REVIEW

[Paper Review] MACE-OFF: Transferable Short Range Machine Learning Force Fields for Organic Molecules

Dávid Péter Kovács, J. Harry Moore|arXiv (Cornell University)|Dec 23, 2023
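Protein Structure and Dynamics|Cited by 65
One-Sentence Summary

MACE-OFF23 is a transferable, local machine learning force field for organic molecules, trained on high-level quantum mechanical data and validated for accuracy and efficiency on gases, liquids, crystals, and biomolecules.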

ABSTRACT

Classical empirical force fields have dominated biomolecular simulation for over 50 years. Although widely used in drug discovery, crystal structure prediction, and biomolecular dynamics, they generally lack the accuracy and transferability required for first-principles predictive modeling. In this paper, we introduce MACE-OFF, a series of short range transferable force fields for organic molecules created using state-of-the-art machine learning technology and first-principles reference data computed with a high level of quantum mechanical theory. MACE-OFF demonstrates the remarkable capabilities of short range models by accurately predicting a wide variety of gas and condensed phase properties of molecular systems. It produces accurate, easy-to-converge dihedral torsion scans of unseen molecules, as well as reliable descriptions of molecular crystals and liquids, including quantum nuclear effects. We further demonstrate the capabilities of MACE-OFF by determining free energy surfaces in explicit solvent, as well as the folding dynamics of peptides. Finally, we simulate a fully solvated small protein, observing accurate secondary structure and vibrational spectrum. These developments enable first-principles simulations of molecular systems for the broader chemistry community at high accuracy and relatively low computational cost.

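Motivation and Objectives

  • Develop a transferable, purely local ML force field for organic molecules covering H, C, N, O, F, P, S, Cl, Br, and I.
  • Train on high-level quantum mechanical data (ωB97M-D3(BJ)/def2-TZVPPD) from the SPICE dataset, augmented with larger fragments and water clusters.
  • Demonstrate accurate prediction of intramolecular and intermolecular interactions in gases, liquids, crystals, and biopolymers.
  • Show that the models reproduce dihedral scans, lattice parameters, sublimation enthalpies, water structure, and peptide/protein properties.
  • Evaluate computational performance and scalability in common molecular dynamics engines (LAMMPS, OpenMM).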

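Proposed Method

  • Use the MACE architecture with two message-passing layers and equivariant features.
  • Represent each atomic environment within a local cutoff radius (4.5–5.0 Å) and build an equivariant product basis up to fourth order.
  • Train three model sizes (S, M, L) with 96/128/192 chemical channels and an equivariant message order of 0/1/2, respectively.
  • Start from the SPICE data (10 elements, neutral species), add larger fragments and water clusters, and remove outliers with force errors > 2 eV/Å.
  • Predict the total energy as a sum of readout functions (first layer: invariant; second layer: MLP); forces are obtained analytically as the negative gradient of the energy with respect to atomic positions.
  • Evaluate on torsion scans (TorsionNet-500 and the biaryl benchmark), molecular crystals (vibrational spectra, sublimation enthalpies), water structure and dynamics (radial distribution functions, RDFs; vibrational spectra including nuclear quantum effects), and condensed-phase liquids (densities, heats of vaporization).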
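The three ideas in the list above (a local cutoff, a total energy built as a sum of per-atom terms, and forces taken as the analytic derivative of that energy) can be illustrated with a small toy model. The sketch below is not MACE: `pair_term` is a hypothetical stand-in for a learned readout, and the function names, the cosine switching function, and the finite-difference check are all assumptions made for illustration.

```python
import math

CUTOFF = 5.0  # Å; upper end of the 4.5-5.0 Å range quoted in the summary


def smooth_cutoff(r, rc=CUTOFF):
    """Cosine switching function: 1 at r = 0, 0 for r >= rc."""
    if r >= rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)


def pair_term(r):
    """Hypothetical toy pair energy (eV); stands in for a learned readout."""
    return math.exp(-r) * smooth_cutoff(r)


def site_energies(positions):
    """Per-atom energies that depend only on neighbours inside the cutoff."""
    energies = []
    for i, ri in enumerate(positions):
        e_i = 0.0
        for j, rj in enumerate(positions):
            if i != j:
                r = math.dist(ri, rj)
                if r < CUTOFF:
                    e_i += 0.5 * pair_term(r)  # half of each pair per atom
        energies.append(e_i)
    return energies


def total_energy(positions):
    """Total energy as a plain sum of local site energies."""
    return sum(site_energies(positions))


def analytic_forces(positions):
    """F_i = -dE/dr_i, using the closed-form derivative of pair_term."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, ri in enumerate(positions):
        for j, rj in enumerate(positions):
            if i == j:
                continue
            r = math.dist(ri, rj)
            if r >= CUTOFF:
                continue
            fc = smooth_cutoff(r)
            dfc = -0.5 * math.pi / CUTOFF * math.sin(math.pi * r / CUTOFF)
            dphi = math.exp(-r) * (dfc - fc)  # d/dr [exp(-r) * fc(r)]
            for k in range(3):
                forces[i][k] -= dphi * (ri[k] - rj[k]) / r
    return forces


def numerical_forces(positions, h=1e-5):
    """Central finite differences of total_energy, used only as a check."""
    forces = []
    for i in range(len(positions)):
        f_i = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            f_i.append(-(total_energy(plus) - total_energy(minus)) / (2 * h))
        forces.append(f_i)
    return forces


# Hypothetical geometry (Å); the last atom lies beyond the cutoff of all others.
atoms = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.0, 1.5, 0.3), (6.5, 0.0, 0.0)]
f_analytic = analytic_forces(atoms)
f_numeric = numerical_forces(atoms)
```

Because the toy energy is strictly local, the isolated fourth atom contributes zero energy and feels zero force, which is the same locality property that lets a short-range model scale to large systems. Comparing `f_analytic` with `f_numeric` is a common sanity check that an energy model and its derivative are consistent.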
Figure 1: Test set root mean square errors (RMSE). Errors in the MACE-OFF23 models compared to the underlying DFT reference data, highlighting the relative accuracy of the three models. Bottom panels show specifically inter-molecular force errors compared to overall DFT inter-molecular force magnitu

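Experimental Results

Research Questions

  • RQ1: Can MACE-OFF23 reach chemical accuracy for energies and forces across a broad range of organic systems?
  • RQ2: How well do the local MACE-OFF23 models generalize to larger fragments and explicit-solvent environments beyond the training data?
  • RQ3: Do the models accurately reproduce dihedral barriers and conformations relative to DFT and higher-level quantum references?
  • RQ4: Can the models describe crystal and liquid-phase properties, including vibrational spectra and sublimation enthalpies, and handle nuclear quantum effects in water?
  • RQ5: What is the computational performance (speed and scalability) in molecular dynamics simulations?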

Key Findings

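  • The large MACE-OFF23 model reaches an energy RMSE of ~0.5–1.0 meV/atom and a force RMSE of ~15–20 meV/Å, well within chemical accuracy for the organic systems tested.
  • Intermolecular force errors are ~5–15 meV/Å, about 1.5–3 times smaller than the total force errors; relative intramolecular errors are about 1–2%.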
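  • Dihedral barrier height errors are about 0.3–0.5 kcal/mol on the biaryl benchmark and about 0.25 kcal/mol on TorsionNet-500 when compared against the SPICE level of DFT, close to the accuracy of the DFT reference itself.
  • MACE-OFF23 (S/M/L) reproduces the RDF of water at a level comparable to TIP3P/MB-pol and, when nuclear quantum effects are included (PIGS), agrees with the experimental Raman/IR features across frequency ranges.
  • The large MACE-OFF23 model predicts sublimation enthalpies with a mean error of about 1.7 kcal/mol over 23 crystals, comparable to dispersion-corrected functionals.
  • For liquids, the density MAE is ~0.09 g/cm³ (M model) and heats of vaporization are predicted reasonably well; cases such as water, ethers, and dibromo compounds are discussed, showing systematic trends and possible error cancellation.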
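The findings above mix meV and kcal/mol, so a small unit-conversion helper makes them easier to compare. Chemical accuracy is conventionally taken as 1 kcal/mol, which is about 43.4 meV. The sketch below assumes the standard conversion 1 eV ≈ 23.0605 kcal/mol; the function names are hypothetical.

```python
KCAL_PER_MOL_PER_EV = 23.0605  # 1 eV per particle is about 23.0605 kcal/mol


def kcal_mol_to_mev(value_kcal_mol):
    """Convert an energy from kcal/mol to meV."""
    return value_kcal_mol / KCAL_PER_MOL_PER_EV * 1000.0


def mev_to_kcal_mol(value_mev):
    """Convert an energy from meV to kcal/mol."""
    return value_mev / 1000.0 * KCAL_PER_MOL_PER_EV


# Conventional chemical-accuracy threshold (1 kcal/mol) expressed in meV.
chemical_accuracy_mev = kcal_mol_to_mev(1.0)

# Errors quoted in the summary, converted to meV for comparison.
quoted_errors_kcal_mol = {
    "TorsionNet-500 barrier height": 0.25,
    "sublimation enthalpy (23 crystals)": 1.7,
}
quoted_errors_mev = {
    name: kcal_mol_to_mev(value)
    for name, value in quoted_errors_kcal_mol.items()
}
```

Note that the energy RMSE in the summary is quoted per atom, while the torsion and sublimation errors are per molecule, so the two kinds of numbers are not directly comparable without knowing the system size.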
Figure 2: Dihedral benchmark scans. The top panel shows torsion drive data for the TorsionNet-500 dataset, which has a wide chemical diversity (five example molecules are shown). The bottom panel focuses on the torsion angle between two aromatic rings in the biaryl torsion benchmark [ 66 ] which con
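To make the barrier height metric used for these torsion scans concrete, here is a minimal sketch of how such an error could be computed from two energy profiles. It assumes the barrier height is the difference between the highest and lowest energy along the scan; the paper's exact definition may differ, and the profile values below are invented for illustration only.

```python
import math


def barrier_height(energies):
    """Barrier height taken as max minus min along the scan (an assumption)."""
    return max(energies) - min(energies)


def barrier_height_error(predicted, reference):
    """Absolute difference between predicted and reference barrier heights."""
    return abs(barrier_height(predicted) - barrier_height(reference))


def profile_rmse(predicted, reference):
    """RMSE between profiles after shifting each to its own minimum."""
    p0, r0 = min(predicted), min(reference)
    squared = [((p - p0) - (r - r0)) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical relative energies (kcal/mol) at five dihedral angles.
reference_scan = [0.0, 1.2, 3.0, 1.1, 0.2]
predicted_scan = [0.1, 1.3, 2.8, 1.0, 0.1]

error = barrier_height_error(predicted_scan, reference_scan)
rmse = profile_rmse(predicted_scan, reference_scan)
```

Shifting each profile to its own minimum before computing the RMSE removes any constant energy offset between the model and the reference, which carries no physical meaning for a torsion scan.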


This review was generated by AI and reviewed by a human editor.