QUICK REVIEW

[Paper Review] Flow Matching for Generative Modeling

Yaron Lipman, Ricky T. Q. Chen|arXiv (Cornell University)|Oct 6, 2022

Generative Adversarial Networks and Image Synthesis | Cited by 86

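One-Sentence Summary

Flow Matching (FM) is a simulation-free framework for training Continuous Normalizing Flows (CNFs). It uses per-sample conditional probability paths, including an Optimal Transport (OT) path, to achieve scalable and efficient generation, and it improves on diffusion-based methods in both likelihood and sample quality.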

ABSTRACT

We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples -- which subsumes existing diffusion paths as specific instances. Interestingly, we find that employing FM with diffusion paths results in a more robust and stable alternative for training diffusion models. Furthermore, Flow Matching opens the door to training CNFs with other, non-diffusion probability paths. An instance of particular interest is using Optimal Transport (OT) displacement interpolation to define the conditional probability paths. These paths are more efficient than diffusion paths, provide faster training and sampling, and result in better generalization. Training CNFs using Flow Matching on ImageNet leads to consistently better performance than alternative diffusion-based methods in terms of both likelihood and sample quality, and allows fast and reliable sample generation using off-the-shelf numerical ODE solvers.

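Motivation and Objectives

Develop a scalable, simulation-free training objective for CNFs.
Use per-sample conditional probability paths to construct tractable regression targets for CNF training.
Explore, within Flow Matching, a general family of probability paths that includes both diffusion and OT paths.
Show that Flow Matching outperforms diffusion-based methods on image datasets in both likelihood and sample quality.
Show that OT-based paths give faster training and sampling and better generalization.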

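Proposed Method

Define the Flow Matching (FM) objective as regressing a neural vector field v_t onto the target vector field u_t that generates the desired probability path p_t.
Build p_t and u_t from conditional probability paths p_t(x|x1) and conditional vector fields u_t(x|x1), aggregating over data samples x1 to obtain the marginal p_t and u_t.
Use Conditional Flow Matching (CFM), whose gradients with respect to the model parameters are identical to those of FM, so training can proceed per sample without an explicit marginal target.
Adopt a general family of Gaussian conditional paths p_t(x|x1) with mean mu_t(x1) and standard deviation sigma_t(x1), and derive the conditional vector field u_t(x|x1) from the flow map psi_t.
Specialize this family to diffusion paths (VE and VP) and to Optimal Transport (OT) displacement interpolation, highlighting the straight-line trajectories and simpler regression target of the OT path.
Train CNFs on ImageNet using Flow Matching with OT paths, and compare against diffusion baselines on likelihood (NLL in BPD), FID, and sampling efficiency.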
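The method steps above can be made concrete with a small NumPy sketch of the CFM objective for the OT conditional path. It assumes the OT choice mu_t(x1) = t * x1 and sigma_t(x1) = 1 - (1 - sigma_min) * t, for which the flow map is psi_t(x0) = (1 - (1 - sigma_min) * t) * x0 + t * x1 and the regression target reduces to x1 - (1 - sigma_min) * x0. The function names, the value of `SIGMA_MIN`, and the `v_theta` callable are hypothetical placeholders for illustration, not the authors' code.

```python
import numpy as np

SIGMA_MIN = 1e-4  # assumed small terminal standard deviation (illustrative value)


def ot_path_sample(x0, x1, t, sigma_min=SIGMA_MIN):
    """Flow map psi_t(x0) = (1 - (1 - sigma_min) t) x0 + t x1, applied row-wise."""
    t = t[:, None]  # broadcast one time value per sample over the feature axis
    return (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1


def ot_conditional_field(x, x1, t, sigma_min=SIGMA_MIN):
    """Conditional vector field u_t(x | x1) for the OT path."""
    t = t[:, None]
    return (x1 - (1.0 - sigma_min) * x) / (1.0 - (1.0 - sigma_min) * t)


def ot_target(x0, x1, sigma_min=SIGMA_MIN):
    """u_t evaluated at psi_t(x0); it does not depend on t."""
    return x1 - (1.0 - sigma_min) * x0


def cfm_loss(v_theta, x1, rng, sigma_min=SIGMA_MIN):
    """Monte Carlo estimate of the CFM loss for one batch of data samples x1.

    v_theta is any callable (x, t) -> array with the same shape as x; in
    practice it would be a neural network (hypothetical here).
    """
    batch, dim = x1.shape
    t = rng.uniform(0.0, 1.0, size=batch)      # t ~ U[0, 1]
    x0 = rng.standard_normal((batch, dim))     # x0 ~ N(0, I), the noise sample
    xt = ot_path_sample(x0, x1, t, sigma_min)  # point on the conditional path
    target = ot_target(x0, x1, sigma_min)      # regression target
    pred = v_theta(xt, t)
    return np.mean(np.sum((pred - target) ** 2, axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((16, 2))  # stand-in for a batch of data samples
    print(cfm_loss(lambda x, t: np.zeros_like(x), data, rng))
```

No ODE is solved anywhere in the loss: each term needs only one noise sample, one time value, and one evaluation of the model, which is what makes the objective simulation-free.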

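Experimental Results

Research Questions

RQ1: Can a simulation-free Flow Matching objective train CNFs at scale without solving an ODE at every training step?
RQ2: How do conditional probability paths (diffusion vs. OT) differ in training stability, sampling efficiency, and model quality?
RQ3: Do OT-based conditional paths in Flow Matching give faster training and better generalization than diffusion paths?
RQ4: How does Flow Matching compare with diffusion methods on large-scale datasets (ImageNet) in likelihood and sample quality?
RQ5: Can Flow Matching deliver reliable conditional generation and fast sampling with off-the-shelf ODE solvers?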

Key Findings

Model	CIFAR-10 NLL (BPD)	CIFAR-10 FID	CIFAR-10 NFE	ImageNet 32x32 NLL (BPD)	ImageNet 32x32 FID	ImageNet 32x32 NFE	ImageNet 64x64 NLL (BPD)	ImageNet 64x64 FID	ImageNet 64x64 NFE
DDPM	3.12	7.48	274	3.54	6.99	262	3.32	17.36	264
Score Matching	3.16	19.94	242	3.56	5.68	178	3.40	19.74	441
ScoreFlow	3.09	20.78	428	3.55	14.14	195	3.36	24.95	601
FM w/ Diffusion	3.10	8.06	183	3.54	6.37	193	3.33	16.88	187
FM w/ OT	2.99	6.35	142	3.53	5.02	122	3.31	14.45	138

Model	ImageNet 128x128 NLL (BPD)	ImageNet 128x128 FID
FM w/ OT	2.90	20.9

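Flow Matching with OT paths achieves better NLL (BPD) and FID on CIFAR-10 and the ImageNet variants, and generally needs a lower NFE than the diffusion baselines.
In Table 1, FM w/ OT consistently obtains the best results among all compared methods (NLL, FID, NFE) on CIFAR-10 and ImageNet 32x32/64x64.
On ImageNet 128x128, FM w/ OT reaches a competitive NLL (2.90) and FID (20.9); relative to the GAN-based methods the paper lists, Flow Matching offers strong likelihood and sample quality.
Flow Matching with OT makes sampling faster: at the same numerical accuracy, OT paths need fewer function evaluations (NFE) than diffusion paths and give a better cost-quality trade-off.
CFM yields the same gradients as FM, which makes per-sample training feasible without an explicit marginal vector field.
OT conditional paths produce straight-line trajectories, which gives a simpler regression target and more efficient training and sampling than diffusion paths.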
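The straight-line property behind the sampling results can be checked numerically. The sketch below uses a fixed-step Euler integrator as a stand-in for an off-the-shelf ODE solver and applies it to the OT conditional vector field u_t(x|x1) = (x1 - (1 - sigma_min) * x) / (1 - (1 - sigma_min) * t) for a single fixed x1. The names `euler_sample` and `make_ot_conditional_field` are hypothetical, and the setup is an illustration under those assumptions rather than the paper's sampling procedure.

```python
import numpy as np


def euler_sample(v, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler.

    Returns the final state and the number of function evaluations (NFE),
    which for Euler equals the number of steps.
    """
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    nfe = 0
    for k in range(num_steps):
        t = k * dt
        x = x + dt * v(x, t)
        nfe += 1
    return x, nfe


def make_ot_conditional_field(x1, sigma_min=1e-4):
    """Build u_t(x | x1) for the OT conditional path with a fixed data point x1."""
    def u(x, t):
        return (x1 - (1.0 - sigma_min) * x) / (1.0 - (1.0 - sigma_min) * t)
    return u


if __name__ == "__main__":
    x0 = np.array([0.5, -1.0])  # noise sample
    x1 = np.array([2.0, 3.0])   # data sample the path is conditioned on
    u = make_ot_conditional_field(x1)
    coarse, nfe_coarse = euler_sample(u, x0, 1)
    fine, nfe_fine = euler_sample(u, x0, 10)
    print(nfe_coarse, nfe_fine, np.max(np.abs(coarse - fine)))
```

Along a trajectory that starts at x0, this conditional field is constant in time, so one Euler step already lands on the endpoint sigma_min * x0 + x1 and additional steps change nothing beyond rounding. This holds only for the conditional field of a single x1. A trained model approximates the marginal field, whose trajectories need not be perfectly straight, which is consistent with the table reporting NFE values well above 1 for FM w/ OT.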


This review was generated by AI and reviewed by a human editor.