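[Paper Summary] Tutorial: Deriving the Standard Variational Autoencoder (VAE) Loss Function
This paper gives a step-by-step derivation of the VAE loss. It derives the variational lower bound (ELBO) and its closed-form solution in the Gaussian latent-variable case, and covers the underlying basics of Bayes' theorem and the KL divergence.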
In Bayesian machine learning, the posterior distribution is typically computationally intractable, so variational inference is often required. In this approach, an evidence lower bound on the log-likelihood of the data is maximized during training. Variational Autoencoders (VAEs) are one important example where variational inference is used. In this tutorial, we derive the variational lower bound loss function of the standard variational autoencoder. We do so for a Gaussian latent prior and a Gaussian approximate posterior, under which assumptions the Kullback-Leibler term in the variational lower bound has a closed-form solution. We derive essentially everything we use along the way, from Bayes' theorem to the Kullback-Leibler divergence.
Research Motivation and Objectives
- Explain why variational inference is used in Bayesian learning and how posterior intractability motivates VAEs.
- Derive the variational lower bound (ELBO) for VAEs by applying Bayes’ theorem and KL divergence properties.
- Show that with Gaussian latent prior and posterior the KL term has a closed form, leading to the standard VAE loss.
- Provide the final loss function form and connection to reconstruction and regularization terms.
Proposed Method
- Derive Bayes’ theorem and the ELBO for latent variable models.
- Express the ELBO as a reconstruction term minus a KL divergence term.
- Assume Gaussian forms for p(z) and q_theta(z|x) to obtain a closed-form KL term.
- Compute the closed-form of -D_KL(q_theta(z|x)||p(z)) under Gaussian assumptions and simplify to a tractable expression.
- Formulate the final objective as a maximization of the ELBO, then convert to the loss by negation for training, yielding the standard VAE loss.
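The steps above can be written out as a short worked derivation. This is a sketch under stated assumptions rather than the paper's exact notation: the decoder likelihood is written p_phi(x|z), the prior is taken to be the standard normal N(0, I), the approximate posterior is a diagonal Gaussian with per-dimension mean mu_j and variance sigma_j^2, and J denotes the latent dimension. None of these symbols beyond q_theta(z|x) and p(z) appear in the summary itself.

```latex
% Assumed notation: p_\phi(x|z) decoder, p(z) = N(0, I),
% q_\theta(z|x) = N(\mu, \mathrm{diag}(\sigma^2)), J latent dimensions.
\begin{align}
\log p(x)
  &= \mathcal{L}(\theta,\phi;x)
   + D_{KL}\big(q_\theta(z|x)\,\|\,p(z|x)\big)
   \;\ge\; \mathcal{L}(\theta,\phi;x), \\
\mathcal{L}(\theta,\phi;x)
  &= \mathbb{E}_{q_\theta(z|x)}\big[\log p_\phi(x|z)\big]
   - D_{KL}\big(q_\theta(z|x)\,\|\,p(z)\big), \\
-D_{KL}\big(q_\theta(z|x)\,\|\,p(z)\big)
  &= \frac{1}{2}\sum_{j=1}^{J}
     \big(1 + \log\sigma_j^{2} - \mu_j^{2} - \sigma_j^{2}\big), \\
\text{loss}(\theta,\phi;x)
  &= -\mathcal{L}(\theta,\phi;x).
\end{align}
```

The inequality in the first line holds because a KL divergence is never negative, which is what makes the ELBO a lower bound on the log-likelihood.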
Experimental Results
Research Questions
- RQ1: How can variational inference be applied to train a generative model with intractable posteriors in VAEs?
- RQ2: What is the closed-form expression of the KL term when latent variables are Gaussian, and how does this shape the VAE loss?
- RQ3: How can Bayes’ theorem and KL divergence be used to derive an ELBO suitable for optimization in VAEs?
- RQ4: What is the final loss formulation that practitioners minimize during VAE training?
Key Findings
- The ELBO provides a tractable objective that lower-bounds the log-likelihood and consists of a reconstruction term and a KL regularizer.
- Under Gaussian priors and posteriors, the KL term has a closed-form expression, so the regularization part of the VAE loss can be evaluated analytically.
- The loss combines a KL-based regularization term with a reconstruction loss derived from the decoder likelihood.
- Maximizing the ELBO (or minimizing the negative ELBO) yields training objectives for the encoder and decoder parameters.
- In the final loss, the KL term is a sum over latent dimensions, and the reconstruction term is estimated from latent samples drawn via reparameterization for stochastic optimization.
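As an illustration of the findings above, here is a minimal NumPy sketch of the negative ELBO for a single latent sample. It is not the paper's code. It assumes a standard normal prior N(0, I), a diagonal Gaussian approximate posterior parameterized by `mu` and `log_var`, and a unit-variance Gaussian decoder, so the reconstruction term reduces to a squared error up to an additive constant. The function names `gaussian_kl`, `reparameterize`, and `vae_loss` are hypothetical, and `x_recon` stands in for the output of a decoder network that is not shown.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """Closed-form D_KL(N(mu, diag(exp(log_var))) || N(0, I)).

    Sums over the last axis (the latent dimensions).
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vae_loss(x, x_recon, mu, log_var):
    """Single-sample negative ELBO estimate per example.

    Assumption: unit-variance Gaussian decoder, so -log p(x|z) equals
    0.5 * ||x - x_recon||^2 up to an additive constant that is dropped.
    """
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    recon = 0.5 * np.sum((x - x_recon) ** 2, axis=-1)
    return recon + gaussian_kl(mu, log_var)
```

In a training loop, `mu` and `log_var` would come from the encoder, `reparameterize` would supply the latent sample passed to the decoder, and the mean of `vae_loss` over a mini-batch would be minimized. Parameterizing the posterior by the log-variance rather than the variance is a common design choice that keeps the variance positive without constraints.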
This summary was generated by AI and reviewed by a human editor.