[Paper Explained] Graph Diffusion Transformers for Multi-Conditional Molecular Generation
Proposes Multi-Conditional Diffusion (MCD) to guide graph diffusion models with multiple property constraints, enabling simultaneous control of numerical and categorical molecular properties in both polymers and small molecules.
Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery. Despite success in unconditional molecular generation, integrating multiple properties such as synthetic score and gas permeability as condition constraints into diffusion models remains unexplored. We present the Graph Diffusion Transformer (Graph DiT) for multi-conditional molecular generation. Graph DiT integrates an encoder to learn numerical and categorical property representations with the Transformer-based denoiser. Unlike previous graph diffusion models that add noise separately on the atoms and bonds in the forward diffusion process, Graph DiT is trained with a novel graph-dependent noise model for accurate estimation of graph-related noise in molecules. We extensively validate Graph DiT for multi-conditional polymer and small molecule generation. Results demonstrate the superiority of Graph DiT across nine metrics from distribution learning to condition control for molecular properties. A polymer inverse design task for gas separation with feedback from domain experts further demonstrates its practical utility.
Motivation and Objectives
- Motivate inverse molecular design with diffusion models when multiple properties must be satisfied.
- Develop a multi-conditional guidance mechanism to represent and integrate diverse numerical and categorical constraints.
- Design a graph-dependent diffusion process and a Transformer-based denoising model that uses condition representations to guide generation.
- Demonstrate multi-conditional generation on polymer and small-molecule datasets and assess practical utility via an inverse polymer design task for gas separation.
Proposed Method
- Introduce Multi-Conditional Diffusion (MCD) with a condition encoder that learns representations for multiple numerical and categorical conditions.
- Use adaptive layer normalization to replace molecule statistics with condition statistics in the Transformer-based structure encoder.
- Propose a graph-dependent noise model with a joint node-edge diffusion matrix Q_G to better align noise with molecular graph structure.
- Employ a predictor-free guidance strategy that combines unconditional and conditional denoising probabilities with a controllable scale parameter s.
- Train a single Graph Transformer to perform both unconditional and conditional denoising, using a dropping embedding to handle missing conditions.
- Present a three-part architecture (condition encoder, structure encoder, structure decoder) and a practical conversion step for generating molecules from graphs.
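The guidance step in the list above can be illustrated with a small sketch. It assumes the common log-space form of classifier-free-style guidance, log p_guided = log p_uncond + s * (log p_cond - log p_uncond), followed by renormalization; the summary does not spell out the exact formula, so this form, the function name `guided_distribution`, and the toy probabilities are all illustrative assumptions rather than the authors' implementation.

```python
import math

def guided_distribution(p_uncond, p_cond, s, eps=1e-12):
    """Blend unconditional and conditional categorical distributions.

    Assumed form (not confirmed by the summary):
        log p_guided = log p_uncond + s * (log p_cond - log p_uncond)
    s = 0 recovers the unconditional distribution, s = 1 the conditional one,
    and s > 1 pushes further toward the condition.
    """
    logits = []
    for pu, pc in zip(p_uncond, p_cond):
        log_u = math.log(pu + eps)  # eps guards against log(0)
        log_c = math.log(pc + eps)
        logits.append(log_u + s * (log_c - log_u))
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Toy distributions over three hypothetical atom types for one node.
p_uncond = [0.5, 0.3, 0.2]
p_cond = [0.2, 0.3, 0.5]
for s in (0.0, 1.0, 2.0):
    print(s, [round(p, 3) for p in guided_distribution(p_uncond, p_cond, s)])
```

In a full sampler, the same blend would be applied to every node and edge distribution at each reverse step, with both inputs produced by the single denoiser (once with the conditions, once with them dropped).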

Experimental Results
Research Questions
- RQ1: How can multiple numerical and categorical properties be integrated into diffusion-based molecular generation without conflating scales or types?
- RQ2: Can a graph-aware noise model improve the realism and validity of generated molecular graphs under multi-conditional guidance?
- RQ3: Does multi-conditional diffusion guidance enable generation that better satisfies multiple property constraints compared to single-condition baselines?
- RQ4: Is predictor-free guidance effective for controlling multi-property outcomes in diffusion-based molecular design?
- RQ5: Can the approach scale to both polymers and small molecules and deliver practical utility in inverse design tasks?
Key Findings
- MCD generated polymers with higher alignment to multi-property constraints than single-condition baselines, with median ranks significantly exceeding 30 in tested scenarios.
- The model achieves lower average MAE on a polymer dataset for multiple numerical properties, reducing error by 17.86% compared to the best baseline.
- For small molecules, MCD attains over 0.9 accuracy on task-related categorical conditions, surpassing baseline accuracy (less than 0.6).
- Inverse polymer design for O2/N2 gas separation shows practical utility with domain expert feedback supporting multi-conditional design benefits.
- The approach demonstrates strong performance across distribution learning and condition control metrics on both polymer and drug-design datasets.
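To make the error-reduction figure in the findings concrete, here is a minimal sketch of how a mean absolute error (MAE) on a numerical condition and a relative reduction against a baseline are typically computed. The property values below are made up for illustration and are not taken from the paper; the function names are hypothetical as well.

```python
def mean_absolute_error(targets, predictions):
    """Average absolute gap between requested and achieved property values."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equal in length")
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def relative_reduction(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error

# Hypothetical target property values and the values measured on generated molecules.
targets = [1.0, 2.0, 3.0, 4.0]
baseline_outputs = [1.5, 2.5, 2.0, 5.0]
model_outputs = [1.2, 2.4, 2.6, 4.4]

baseline_mae = mean_absolute_error(targets, baseline_outputs)
model_mae = mean_absolute_error(targets, model_outputs)
print(f"baseline MAE: {baseline_mae:.3f}")
print(f"model MAE: {model_mae:.3f}")
print(f"relative reduction: {relative_reduction(baseline_mae, model_mae):.2%}")
```

A reported reduction such as 17.86% would correspond to `relative_reduction` returning roughly 0.1786 when averaged over the numerical properties being controlled.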

This explainer was generated by AI and reviewed by a human editor.