QUICK REVIEW

[Paper Review] Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

Danica J. Sutherland, Hsiao-Yu Fish Tung | arXiv (Cornell University) | Nov 14, 2016

Generative Adversarial Networks and Image Synthesis | Cited by 108

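One-Sentence Summary

This paper proposes a method that optimizes the power of the maximum mean discrepancy (MMD) two-sample test by choosing the kernel and feature map that best distinguish the two distributions, and applies it to GAN evaluation, model criticism, and training criteria.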

ABSTRACT

We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples. Second, the MMD can be used to evaluate the performance of a generative model, by testing the model's samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by classifier.

Motivation and Objectives

Motivate and formalize the use of MMD for distribution similarity in high-dimensional or structured data.
Develop a procedure to maximize the power of the MMD test by tuning kernels and feature representations.
Provide efficient, data-dependent thresholds for permutation-based MMD tests.
Demonstrate the approach on GAN evaluation and model criticism, including visualization of distribution differences.
Propose MMD-based training criteria to improve generative models and sample diversity.

Proposed Method

Define and estimate MMD between distributions P and Q using the U-statistic based estimator.
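Derive an asymptotic expression for the test power and show that, for large sample sizes, power is maximized by maximizing the t-statistic-like ratio t_k(P,Q) = MMD_k^2(P,Q) / sqrt(V_m^{(k)}(P,Q)), where V_m^{(k)} is the variance of the MMD estimator under kernel k.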
Introduce a kernel search over compositions k = κ∘z, where z extracts meaningful features of the inputs and κ is a base kernel applied to those features, and optimize over z and κ to maximize the empirical t_k.
Provide an efficient, differentiable estimator V̂_m(X,Y) of the variance V_m(P,Q), so that the kernel and feature map can be optimized by gradient-based methods.
Implement a training/testing data split to preserve test validity while learning the kernel (kernel selection on training data; final test on testing data).
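
The steps above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes equal sample sizes, a Gaussian kernel with a single bandwidth (no learned feature map z), a small candidate grid instead of gradient-based optimization, and a simple plug-in variance proxy in place of the paper's estimator V̂_m. The function names (gaussian_kernel, mmd2_and_ratio, select_bandwidth) are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2_and_ratio(x, y, bandwidth, eps=1e-8):
    """Return (unbiased MMD^2 estimate, MMD^2 / sqrt(variance proxy)).

    Assumes x and y have the same number of rows m.
    """
    m = x.shape[0]
    assert y.shape[0] == m, "this sketch assumes equal sample sizes"
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # U-statistic kernel: h_ij = k(x_i,x_j) + k(y_i,y_j) - k(x_i,y_j) - k(x_j,y_i)
    h = kxx + kyy - kxy - kxy.T
    np.fill_diagonal(h, 0.0)  # the U-statistic sums over i != j only
    mmd2 = h.sum() / (m * (m - 1))
    # Assumed plug-in proxy for the estimator variance: (4 / m) * Var_i(mean_j h_ij).
    row_means = h.sum(axis=1) / (m - 1)
    var_proxy = 4.0 * row_means.var() / m
    return mmd2, mmd2 / np.sqrt(var_proxy + eps)

def select_bandwidth(x_train, y_train, candidates):
    """Pick the candidate bandwidth with the largest ratio on the training split."""
    ratios = [mmd2_and_ratio(x_train, y_train, b)[1] for b in candidates]
    return candidates[int(np.argmax(ratios))]

# Toy data: two 2-D Gaussians whose means differ by 1 in each coordinate.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = rng.normal(size=(200, 2)) + 1.0

# Select the kernel on the first half, evaluate the statistic on the second half.
best = select_bandwidth(x[:100], y[:100], [0.25, 0.5, 1.0, 2.0, 4.0])
mmd2_test, ratio_test = mmd2_and_ratio(x[100:], y[100:], best)
print("selected bandwidth:", best)
print("held-out MMD^2:", mmd2_test, "held-out ratio:", ratio_test)
```

Choosing the bandwidth on one half of the data and computing the statistic on the other half mirrors the training/testing split described above; reusing the same samples for both steps would invalidate the test.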

Experimental Results

Research Questions

RQ1: How can the power of a kernel-based two-sample test be optimized via kernel and feature selection?
RQ2: Does optimizing MMD test power yield more reliable discrimination between model samples and reference data than maximizing MMD alone?
RQ3: Can optimized MMD be used for evaluating and diagnosing generative models (e.g., GANs) and for improving training criteria?
RQ4: How can witness functions from optimized MMD be visualized to diagnose distributional differences?
RQ5: What are efficient implementations for permutation-based MMD thresholds in high dimensions?

Key Findings

Maximizing the MMD test power by optimizing the kernel and feature map yields higher test power than simply maximizing the MMD statistic.
An ARD-based kernel over output dimensions helps identify which coordinates differ meaningfully when the test is optimized.
The witness function from the optimized MMD highlights where the two distributions differ most, enabling interpretable diagnostics.
Optimized MMD criteria can be used as a training objective for GAN variants (e.g., GMMN, t-GMMN, and feature-matching GANs) to improve sample diversity and reduce mode collapse.
A highly optimized implementation of the permutation test for the null distribution markedly accelerates threshold computation: it scales as O(m^2), compared with O(m^3) for spectral methods.
Empirical demonstrations on MNIST show that the optimized MMD discriminator detects subtle differences between model and data distributions, with learned feature weights revealing where the model deviates (e.g., image borders and central vertical line).
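
The permutation threshold and the witness function mentioned in the findings above can be sketched as follows. This is an illustrative NumPy version, not the paper's optimized implementation: it assumes a Gaussian kernel with a fixed bandwidth, and the function names (mmd2_from_gram, permutation_test, witness) are hypothetical. The pooled kernel matrix is computed once, and each permutation only re-indexes it.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2_from_gram(k, idx_x, idx_y):
    """Unbiased MMD^2 from a precomputed pooled Gram matrix and two index sets."""
    m, n = len(idx_x), len(idx_y)
    kxx = k[np.ix_(idx_x, idx_x)]
    kyy = k[np.ix_(idx_y, idx_y)]
    kxy = k[np.ix_(idx_x, idx_y)]
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))  # off-diagonal mean
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))  # off-diagonal mean
    return term_xx + term_yy - 2.0 * kxy.mean()

def permutation_test(x, y, bandwidth, n_perms=500, alpha=0.05, seed=0):
    """Return (observed MMD^2, permutation threshold, permutation p-value)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    k = gaussian_kernel(pooled, pooled, bandwidth)  # computed once, reused below
    m, total = x.shape[0], pooled.shape[0]
    observed = mmd2_from_gram(k, np.arange(m), np.arange(m, total))
    null = np.empty(n_perms)
    for i in range(n_perms):
        perm = rng.permutation(total)  # random reassignment of the two labels
        null[i] = mmd2_from_gram(k, perm[:m], perm[m:])
    threshold = np.quantile(null, 1.0 - alpha)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perms)
    return observed, threshold, p_value

def witness(points, x, y, bandwidth):
    """Empirical witness at each point t: mean_i k(x_i, t) - mean_j k(y_j, t)."""
    return (gaussian_kernel(points, x, bandwidth).mean(axis=1)
            - gaussian_kernel(points, y, bandwidth).mean(axis=1))

# Toy data: two 2-D Gaussians whose means differ by 1 in each coordinate.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
y = rng.normal(size=(100, 2)) + 1.0

observed, threshold, p_value = permutation_test(x, y, bandwidth=1.0)
print("reject null:", observed > threshold, "p-value:", p_value)
# Positive witness values mark regions where x is denser, negative where y is denser.
print("witness at (0,0) and (1,1):", witness(np.array([[0.0, 0.0], [1.0, 1.0]]), x, y, 1.0))
```

Evaluating the witness on a grid of points, or on individual samples, gives the kind of interpretable picture of where the two distributions differ that the findings describe.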


This review was generated by AI and reviewed by human editors.