QUICK REVIEW

[Paper Review] DiGress: Discrete Denoising Diffusion for Graph Generation

Clément Vignac, Igor Krawczuk|arXiv (Cornell University)|Sep 29, 2022

Advanced Graph Neural Networks | Cited by 70

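One-Sentence Summary

DiGress introduces a discrete denoising diffusion model for graphs with categorical node and edge attributes, using a graph Transformer to reverse a Markovian discrete diffusion process while preserving sparsity. It achieves state-of-the-art results on molecular and non-molecular graphs and scales to large datasets.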

ABSTRACT

This work introduces DiGress, a discrete denoising diffusion model for generating graphs with categorical node and edge attributes. Our model utilizes a discrete diffusion process that progressively edits graphs with noise, through the process of adding or removing edges and changing the categories. A graph transformer network is trained to revert this process, simplifying the problem of distribution learning over graphs into a sequence of node and edge classification tasks. We further improve sample quality by introducing a Markovian noise model that preserves the marginal distribution of node and edge types during diffusion, and by incorporating auxiliary graph-theoretic features. A procedure for conditioning the generation on graph-level features is also proposed. DiGress achieves state-of-the-art performance on molecular and non-molecular datasets, with up to 3x validity improvement on a planar graph dataset. It is also the first model to scale to the large GuacaMol dataset containing 1.3M drug-like molecules without the use of molecule-specific representations.

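Motivation and Objectives

Motivate graph generation through discrete diffusion, which preserves the sparsity and structure of graphs.
Develop a diffusion process that acts directly on discrete node and edge categories.
Train a graph Transformer to denoise noisy graphs and reconstruct the clean graph.
Improve performance with a marginal-preserving noise model and auxiliary features.
Enable conditional graph generation through discrete guidance and auxiliary features.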

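Proposed Method

Define the discrete diffusion with Markov transition matrices Q^t_X and Q^t_E over node and edge categories.
Obtain the noisy graph G^t by sampling from q(G^t | G^{t-1}) = (X^{t-1} Q^t_X, E^{t-1} Q^t_E) and symmetrizing the edges for undirected graphs.
Train a permutation-equivariant graph Transformer phi_theta to predict the clean node and edge distributions by minimizing the cross-entropy loss l = sum_i CE(x_i, p_i^X) + lambda sum_{i,j} CE(e_{ij}, p_{ij}^E).
Express the reverse diffusion step p_theta(G^{t-1} | G^t) as a product over nodes and edges, where p_theta(x_i^{t-1} | G^t) and p_theta(e_{ij}^{t-1} | G^t) are obtained by marginalizing over the network's discrete predictions.
Improve training with a marginal noise prior q_X, q_E that matches the data marginals, and augment the input with structural and spectral features.
Introduce discrete guidance, which uses a property regressor g_eta to steer sampling toward graphs with target properties.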
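
The noising step and training loss listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes the marginal transition matrix has the form Q = alpha * I + (1 - alpha) * 1 m^T, where m is the vector of class marginals, and it assumes the edge term sums over ordered pairs with i != j. The function names (`marginal_transition`, `noise_graph`, `training_loss`) and the integer-label graph encoding are hypothetical choices made for this example.

```python
import math
import random


def marginal_transition(marginal, alpha):
    """Build Q = alpha * I + (1 - alpha) * 1 m^T (assumed form; each row sums to 1)."""
    k = len(marginal)
    return [
        [alpha * (i == j) + (1.0 - alpha) * marginal[j] for j in range(k)]
        for i in range(k)
    ]


def sample_category(probs, rng):
    """Draw one class index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def noise_graph(nodes, edges, q_x, q_e, rng):
    """Apply one noising step to a graph with integer class labels.

    nodes: list of node classes; edges: symmetric n x n matrix of edge
    classes, where class 0 is taken to mean "no edge" (an assumption).
    For a one-hot row x, the product x Q is simply row Q[class].
    """
    n = len(nodes)
    noisy_nodes = [sample_category(q_x[c], rng) for c in nodes]
    noisy_edges = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Sample the upper triangle only, then mirror it so the
            # noisy graph stays undirected.
            e = sample_category(q_e[edges[i][j]], rng)
            noisy_edges[i][j] = e
            noisy_edges[j][i] = e
    return noisy_nodes, noisy_edges


def cross_entropy(target, probs, eps=1e-12):
    """Negative log-probability of the true class."""
    return -math.log(max(probs[target], eps))


def training_loss(nodes, edges, p_x, p_e, lam):
    """l = sum_i CE(x_i, p_i^X) + lambda * sum_{i != j} CE(e_ij, p_ij^E)."""
    n = len(nodes)
    node_term = sum(cross_entropy(nodes[i], p_x[i]) for i in range(n))
    edge_term = sum(
        cross_entropy(edges[i][j], p_e[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return node_term + lam * edge_term
```

Under the assumed form of Q, the marginal vector is a fixed point of the transition: m Q = alpha * m + (1 - alpha) * m = m. That is the sense in which this noise model preserves the node and edge type marginals. In a real training loop, `p_x` and `p_e` would come from the denoising network applied to the noisy graph; here they are plain nested lists of probabilities.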

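Experimental Results

Research Questions

RQ1: Can discrete diffusion over graph attributes effectively model complex graph distributions while preserving sparsity?
RQ2: Does a marginal-preserving noise model improve diffusion training and the quality of sampled graphs?
RQ3: Which architectural choices and feature augmentations (such as structural and spectral features) improve graph denoising performance?
RQ4: Does DiGress support conditional graph generation through discrete guidance and graph-level properties?
RQ5: How well does DiGress scale to large molecular datasets compared with autoregressive and other one-shot models?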

Key Findings

Deg	Clus	Orb	V.U.N.
6.9	1.7	3.1	5%
1.4	1.2	1.7	75%

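DiGress achieves state-of-the-art performance on molecular and non-molecular graph generation benchmarks.
On planar graphs, DiGress improves validity by up to 3x over the baselines.
DiGress is the first one-shot graph model to scale to GuacaMol (1.3M molecules) without molecule-specific representations.
Marginal transition noise outperforms uniform noise in both training and sample quality.
In QM9 conditioning experiments, discrete guidance reduces the mean absolute error on the target properties.
DiGress is on par with autoregressive models on the large-scale MOSES dataset and achieves competitive metrics on GuacaMol, demonstrating its scalability.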

This review was generated by AI and reviewed by a human editor.