QUICK REVIEW

[Paper Review] Building Normalizing Flows with Stochastic Interpolants

Michael S. Albergo, Eric Vanden‐Eijnden|arXiv (Cornell University)|Sep 30, 2022

Generative Adversarial Networks and Image Synthesis|Cited by 21

One-Line Summary

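Introduces InterFlow, a continuous-time normalizing flow whose velocity field is learned from the probability current of a density interpolating between the base and the target. It provides finite-time transport, likelihood evaluation along the path, and efficient training without backpropagating through an ODE solver.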

ABSTRACT

A generative model based on a continuous-time normalizing flow between any pair of base and target probability densities is proposed. The velocity field of this flow is inferred from the probability current of a time-dependent density that interpolates between the base and the target in finite time. Unlike conventional normalizing flow inference methods based on the maximum likelihood principle, which require costly backpropagation through ODE solvers, our interpolant approach leads to a simple quadratic loss for the velocity itself which is expressed in terms of expectations that are readily amenable to empirical estimation. The flow can be used to generate samples from either the base or target, and to estimate the likelihood at any time along the interpolant. In addition, the flow can be optimized to minimize the path length of the interpolant density, thereby paving the way for building optimal transport maps. In situations where the base is a Gaussian density, we also show that the velocity of our normalizing flow can be used to construct a diffusion model to sample the target as well as estimate its score. However, our approach shows that we can bypass this diffusion completely and work at the level of the probability flow with greater simplicity, opening an avenue for methods based solely on ordinary differential equations as an alternative to those based on stochastic differential equations. Benchmarking on density estimation tasks illustrates that the learned flow can match and surpass conventional continuous flows at a fraction of the cost, and compares well with diffusions on image generation on CIFAR-10 and ImageNet $32\times32$. The method scales ab-initio ODE flows to previously unreachable image resolutions, demonstrated up to $128\times128$.

Motivation and Objectives

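Motivate efficient transport between a base density and a target density within a continuous-time framework.
Estimate the velocity field v_t by minimizing a simple quadratic objective that enforces the continuity equation.
Enable sampling and likelihood estimation along the interpolation path, and connect the construction to optimal transport.
Demonstrate scalability and competitive performance on density estimation and image generation tasks while avoiding backpropagation through differential equation solvers.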

Proposed Method

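Define a stochastic interpolant x_t = I_t(x_0, x_1), where x_0 ~ ρ_0 and x_1 ~ ρ_1.
Show that the density ρ_t(x) of x_t satisfies the continuity equation with a velocity field v_t(x) that minimizes the quadratic objective G(v).
Express G(v) and its minimizer as expectations over samples of x_0 ~ ρ_0, x_1 ~ ρ_1, and t, which makes empirical estimation possible.
Show that the path length of the interpolant density can be optimized by tuning the interpolant I_t and/or the base ρ_0, bringing the flow closer to optimal transport (Benamou–Brenier).
When ρ_0 is Gaussian, relate the velocity to the score, which connects the method to score-based diffusion and leads to an SDE interpretation of sampling.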
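
The quadratic objective described above can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation. It assumes the objective takes the form G(v) = E[ |v_t(x_t)|^2 - 2 ∂_t I_t(x_0, x_1) · v_t(x_t) ] with t uniform on [0, 1], and it uses a linear and a trigonometric interpolant as examples. The function names, the 1D Gaussian toy problem, and the target mean `M` are assumptions made for this sketch. For this toy problem the minimizer is the conditional expectation E[∂_t I_t | x_t = x], which is available in closed form, so the estimate can be compared against simpler candidate velocities.

```python
import numpy as np

def linear_interpolant(x0, x1, t):
    """Return x_t = (1 - t) x0 + t x1 and its time derivative x1 - x0."""
    return (1.0 - t) * x0 + t * x1, x1 - x0

def trig_interpolant(x0, x1, t):
    """Return x_t = cos(pi t / 2) x0 + sin(pi t / 2) x1 and its time derivative."""
    a = 0.5 * np.pi * t
    xt = np.cos(a) * x0 + np.sin(a) * x1
    dxt = 0.5 * np.pi * (np.cos(a) * x1 - np.sin(a) * x0)
    return xt, dxt

def estimate_G(v, x0, x1, t, interpolant=linear_interpolant):
    """Monte Carlo estimate of E[|v_t(x_t)|^2 - 2 dI_t/dt . v_t(x_t)].

    x0, x1 have shape (n, d); t has shape (n, 1); v maps (x, t) -> (n, d).
    """
    xt, dxt = interpolant(x0, x1, t)
    vt = v(xt, t)
    return float(np.mean(np.sum(vt ** 2 - 2.0 * dxt * vt, axis=-1)))

# Toy problem (an assumption of this sketch): rho_0 = N(0, 1), rho_1 = N(M, 1).
rng = np.random.default_rng(0)
n, M = 200_000, 2.0
x0 = rng.standard_normal((n, 1))
x1 = M + rng.standard_normal((n, 1))
t = rng.uniform(size=(n, 1))

def v_exact(x, t):
    """Closed-form E[x1 - x0 | x_t = x] for the linear interpolant on this toy problem."""
    s2 = (1.0 - t) ** 2 + t ** 2          # variance of x_t
    return M + (2.0 * t - 1.0) / s2 * (x - t * M)

G_exact = estimate_G(v_exact, x0, x1, t)
G_const = estimate_G(lambda x, t: np.full_like(x, M), x0, x1, t)
G_zero = estimate_G(lambda x, t: np.zeros_like(x), x0, x1, t)
print(G_exact, G_const, G_zero)  # the exact velocity should give the lowest value
```

In practice `v` would be a neural network and the same sample average would serve as the training loss. No ODE is integrated at any point, which is the simulation-free property the review refers to.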

Experimental Results

Research Questions

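RQ1: Can a velocity field be learned directly from the stochastic interpolant so that the resulting ρ_t satisfies the continuity equation between ρ_0 and ρ_1?
RQ2: How does the training objective relate to the Wasserstein-2 distance between the true target and the transported base density?
RQ3: Can the interpolant I_t (and possibly the base) be optimized to shorten the transport path and obtain an optimal transport map?
RQ4: How does this velocity-based, interpolant-driven approach compare with MLE-based continuous flows and diffusion models on density estimation and image generation?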

Key Findings

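The interpolant-induced density ρ_t satisfies the continuity equation with a velocity field v_t that is the unique minimizer of G(v).
G(v) can be estimated empirically from samples, so the velocity field can be trained without simulating the flow.
The framework can generate samples from either the base or the target and can compute likelihoods along the interpolation path.
Maximizing the minimum of G(v) over interpolants yields, under suitable conditions, a path corresponding to the Benamou–Brenier optimal transport solution.
With a Gaussian base ρ_0 and a trigonometric (sinusoidal) interpolant, the interpolant velocity is related to the score of the density, which links the method to score-based models and supports an SDE-based interpretation of sampling.
Empirical results show competitive density estimation on tabular data, image generation that scales up to 128×128, and competitive NLL/FID on CIFAR-10 and ImageNet 32×32, comparing well with prior continuous-flow and diffusion approaches.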
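
The sampling and likelihood findings above can be illustrated with a small sketch. Once a velocity field is available, samples are transported by integrating dx/dt = v_t(x), and the log-density along a trajectory follows from the continuity equation as d/dt log ρ_t(x_t) = -∇·v_t(x_t). The code below is a hedged toy example rather than the paper's setup: it assumes a 1D problem with ρ_0 = N(0, 1), ρ_1 = N(M, 1), and a linear interpolant, for which the velocity and its divergence are known in closed form. The names `velocity`, `divergence`, `push_forward`, and the value of `M` are hypothetical, and a hand-written RK4 integrator stands in for whatever ODE solver one would use in practice.

```python
import numpy as np

M = 2.0  # hypothetical target mean for this toy problem

def velocity(x, t):
    """Closed-form v_t(x) for rho_0 = N(0, 1), rho_1 = N(M, 1), linear interpolant."""
    s2 = (1.0 - t) ** 2 + t ** 2
    return M + (2.0 * t - 1.0) / s2 * (x - t * M)

def divergence(x, t):
    """d v_t / dx for the velocity above (independent of x in this toy case)."""
    s2 = (1.0 - t) ** 2 + t ** 2
    return np.full_like(x, (2.0 * t - 1.0) / s2)

def rhs(x, t):
    """Right-hand side of the augmented ODE for (x, log rho_t(x))."""
    return velocity(x, t), -divergence(x, t)

def push_forward(x0, n_steps=200):
    """Integrate samples and their log-density from t = 0 to t = 1 with RK4."""
    x = np.asarray(x0, dtype=float).copy()
    logp = -0.5 * x ** 2 - 0.5 * np.log(2.0 * np.pi)  # log rho_0(x0)
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h
        a1, b1 = rhs(x, t)
        a2, b2 = rhs(x + 0.5 * h * a1, t + 0.5 * h)
        a3, b3 = rhs(x + 0.5 * h * a2, t + 0.5 * h)
        a4, b4 = rhs(x + h * a3, t + h)
        x = x + h / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        logp = logp + h / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
    return x, logp

x0 = np.array([-1.0, 0.0, 0.5])
x1, logp1 = push_forward(x0)
print(x1, logp1)
```

In this toy case the flow map reduces to a shift by `M`, so the transported points and their log-densities can be checked against the N(M, 1) density directly. With a learned velocity, the divergence would typically be computed by automatic differentiation or a stochastic trace estimator instead of a closed form.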

This review was generated by AI and checked by a human editor.