QUICK REVIEW

[Paper Review] CatFlow: Co-generation of Slab-Adsorbate Systems via Flow Matching

Minkyu Kim, Nayoung Kim|arXiv (Cornell University)|Feb 5, 2026

Machine Learning in Materials Science|Cited by 0

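One-Sentence Summary

CatFlow jointly generates slab structures and adsorbate coordinates using flow matching and a factorized slab-adsorbate representation; on the OC20 de novo generation and structure prediction tasks, it improves structural fidelity and adsorption-energy alignment over the baselines.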

ABSTRACT

Discovering heterogeneous catalysts tailored for specific reaction intermediates remains a fundamental bottleneck in materials science. While traditional trial-and-error methods and recent generative models have shown promise, they struggle to capture the intrinsic coupling between surface geometry and adsorbate interactions. To address this limitation, we propose CatFlow, a flow matching-based framework for de novo design and structure prediction of heterogeneous catalysts. Our model operates on a primitive cell-based factorized representation of the slab-adsorbate complex, reducing the number of learnable variables by an average of 9.2x while explicitly encoding the surface orientation of the slab-adsorbate interface. Experiments on the Open Catalyst 2020 dataset demonstrate that CatFlow significantly improves the structural fidelity of generated catalysts compared to autoregressive and sequential baselines. Further experiments show that the generated structures accurately capture the adsorption energy distributions of physically plausible interfaces and lie closer to thermodynamic local minima.

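Motivation and Objectives

Address the bottleneck in discovering heterogeneous catalysts by coupling surface geometry with adsorbate interactions.
Propose a unified framework that jointly generates slab structures and adsorbate coordinates.
Introduce a factorized representation that reduces model dimensionality while preserving surface orientation information.
Demonstrate end-to-end generation and structure prediction on the OC20 benchmark, with higher fidelity and better energetics.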

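Proposed Method

Define the conditional joint distribution p(S_prim, M, k_vac, x_ads | a_ads) and train a single model that combines continuous and discrete flow matching to co-generate the slab-adsorbate system.
Factorize the slab-adsorbate system into a primitive cell, a transformation matrix, a vacuum scaling factor, and adsorbate components, reducing the number of learnable variables while preserving surface orientation information.
Use masked discrete flow matching for atomic species and continuous flow matching for geometric variables (with reparameterization and a relaxed M during training).
Use a Transformer-based network (a DiT-inspired encoder/decoder) that operates on a joint atom-level representation and predicts continuous coordinates and discrete composition simultaneously.
At inference, integrate an ordinary differential equation (ODE) for the continuous variables and iteratively unmask the discrete tokens for de novo generation, or fix the composition for structure prediction.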
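The continuous half of this recipe can be illustrated with a minimal sketch. It assumes the common straight-line (linear interpolation) probability path, which the summary does not confirm CatFlow uses, and it replaces the Transformer with a hypothetical `oracle` velocity field so the example runs on its own. The names `interpolate`, `target_velocity`, `fm_loss`, and `euler_sample` are illustrative, not taken from the paper.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Regression target for the velocity field on the straight path."""
    return x1 - x0

def fm_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    err = predict_velocity(x_t, t) - target_velocity(x0, x1)
    return float(np.mean(err ** 2))

def euler_sample(predict_velocity, x0, n_steps=50):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * predict_velocity(x, t)
    return x

# Toy data: 4 atoms with 3 coordinates each (stand-in for x_ads).
rng = np.random.default_rng(0)
x1 = rng.normal(size=(4, 3))  # "data" sample
x0 = rng.normal(size=(4, 3))  # noise sample

# ASSUMPTION: an oracle field that points straight at x1 replaces the
# trained network; a real model would only approximate this.
oracle = lambda x, t: (x1 - x) / (1.0 - t)

loss = fm_loss(oracle, x0, x1, t=0.3)
sampled = euler_sample(oracle, x0)
```

With the oracle field the loss is zero up to rounding and the Euler integration lands on `x1`; with a learned network both hold only approximately, and the number of integration steps becomes a quality/cost trade-off.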

Figure 1 : Visualization of the co-generation trajectory conditioned on the adsorbate. We illustrate the synchronized evolution of the slab-adsorbate system from the initial noise distribution ( $t=0$ ) to the final structure ( $t=1$ ) for de novo generation (top) and structure prediction (bottom).
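The discrete side of the trajectory, in which atomic species start fully masked at t=0 and are revealed by t=1, can be sketched as follows. This is a generic masked-unmasking sampler under a linear schedule, not necessarily the schedule used in the paper; `MASK`, `unmask_sample`, and the uniform `toy_probs` predictor are hypothetical stand-ins for the trained model.

```python
import numpy as np

MASK = -1  # hypothetical sentinel for a not-yet-revealed species token

def unmask_sample(predict_probs, n_atoms, n_steps=20, seed=0):
    """Iteratively reveal masked tokens over n_steps.

    predict_probs(tokens, t) must return an (n_atoms, n_species) array of
    per-atom categorical probabilities.
    """
    rng = np.random.default_rng(seed)
    tokens = np.full(n_atoms, MASK)
    for i in range(n_steps):
        t = i / n_steps
        # Linear schedule: dt / (1 - t) simplifies to 1 / (n_steps - i),
        # which equals 1.0 on the last step, so no token stays masked.
        p_unmask = 1.0 / (n_steps - i)
        reveal = (tokens == MASK) & (rng.random(n_atoms) < p_unmask)
        probs = predict_probs(tokens, t)
        for j in np.flatnonzero(reveal):
            tokens[j] = rng.choice(probs.shape[1], p=probs[j])
    return tokens

# ASSUMPTION: a uniform predictor over 3 species replaces the network.
n_species = 3
toy_probs = lambda tokens, t: np.full((len(tokens), n_species), 1.0 / n_species)

species = unmask_sample(toy_probs, n_atoms=8)
```

For structure prediction the composition is fixed, so under this sketch the unmasking loop would simply be skipped and only the continuous variables integrated.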

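Experimental Results

Research Questions

RQ1: Can CatFlow jointly generate slab structures and adsorbate positions while capturing surface-adsorbate interactions?
RQ2: Does the factorized representation reduce dimensionality without sacrificing surface orientation or physical validity?
RQ3: How does end-to-end co-generation compare with a modular pipeline (e.g., DiffCSP + AdsorbDiff) on de novo generation and structure prediction?
RQ4: Across multiple adsorbates, how well does the distribution of generated adsorption energies match the reference minima?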

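Key Findings

For de novo generation, CatFlow outperforms CatGPT in validity, uniqueness, and convergence efficiency: 97.33% validity (vs. 92.67%) and 94.69% uniqueness (vs. 79.91%), with a lower system energy change and faster convergence.
For structure prediction, CatFlow substantially outperforms the two-stage baseline DC+AD (DiffCSP + AdsorbDiff) in validity (98.16% vs. 64.95%), match rate (11.09% vs. 0.01%), RMSD (0.0973 vs. 0.1833), and adsorption-energy success rate (9.72% vs. 1.85%).
Across multiple adsorbates, the adsorption energy distributions of CatFlow's generated structures lie closer to the reference energies of physically plausible local minima, indicating that the generated configurations are close to physically reasonable ones.
The factorized representation reduces dimensionality (about 9.2x on average, up to 96x) while preserving surface orientation information, enabling efficient co-generation of slab-adsorbate systems.
The end-to-end framework explicitly models surface-adsorbate interactions and conditions inference on adsorbate identity, giving it the ability to generalize beyond predefined adsorbate tokens.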
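To see why a primitive cell plus a transformation matrix is so much more compact than an explicit slab, note that an integer matrix M applied to the primitive lattice vectors yields a supercell containing |det M| copies of the primitive cell. The sketch below shows this with toy numbers. The function `build_slab_cell` is hypothetical, and the convention that k_vac stretches the third lattice vector is an assumption made for illustration; the summary does not specify how the paper defines it.

```python
import numpy as np

def build_slab_cell(prim_lattice, M, k_vac=1.0):
    """Return (slab_lattice, replication_factor).

    prim_lattice: (3, 3) array whose rows are primitive lattice vectors.
    M: (3, 3) integer matrix; supercell vectors are the rows of M @ prim_lattice.
    k_vac: ASSUMED convention, scales the third vector to add vacuum.
    """
    M = np.asarray(M, dtype=int)
    lattice = M @ np.asarray(prim_lattice, dtype=float)
    # Number of primitive cells inside the supercell.
    replication = int(round(abs(np.linalg.det(M))))
    lattice[2] = k_vac * lattice[2]
    return lattice, replication

# Toy example: cubic primitive cell (a = 4.0) with a single atom.
prim = 4.0 * np.eye(3)
M = np.diag([2, 2, 3])
slab_lattice, n_copies = build_slab_cell(prim, M, k_vac=2.0)

n_prim_atoms = 1
n_slab_atoms = n_copies * n_prim_atoms  # vacuum adds volume, not atoms
```

In this toy case one stored atom plus M and k_vac stand in for 12 slab atoms. The 9.2x and 96x figures above come from the paper's own accounting on OC20 and are not reproduced by this sketch.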

Figure 2 : Histogram of atom counts in catalyst structures. We compare the histograms of atom counts for slab structures (blue) and their corresponding primitive cells (green). The primitive cells require fewer atoms than the slab structures, reducing the number of learnable variables for the genera


This review was generated by AI and reviewed by human editors.