QUICK REVIEW

[Paper Review] On denoising autoencoders trained to minimise binary cross-entropy

Antonia Creswell, Kai Arulkumaran|arXiv (Cornell University)|Aug 28, 2017

Generative Adversarial Networks and Image Synthesis | References: 20 | Cited by: 63

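ONE-SENTENCE SUMMARY

This paper shows that denoising autoencoders (DAEs) trained with binary cross-entropy approximate the gradient of the log data density, so that applying them moves points in data space towards regions of higher likelihood; this makes it possible to synthesise new samples from noise and to improve samples produced by other generative models.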

ABSTRACT

Denoising autoencoders (DAEs) are powerful deep learning models used for feature extraction, data generation and network pre-training. DAEs consist of an encoder and decoder which may be trained simultaneously to minimise a loss (function) between an input and the reconstruction of a corrupted version of the input. There are two common loss functions used for training autoencoders, these include the mean-squared error (MSE) and the binary cross-entropy (BCE). When training autoencoders on image data a natural choice of loss function is BCE, since pixel values may be normalised to take values in [0,1] and the decoder model may be designed to generate samples that take values in (0,1). We show theoretically that DAEs trained to minimise BCE may be used to take gradient steps in the data space towards regions of high probability under the data-generating distribution. Previously this had only been shown for DAEs trained using MSE. As a consequence of the theory, iterative application of a trained DAE moves a data sample from regions of low probability to regions of higher probability under the data-generating distribution. Firstly, we validate the theory by showing that novel data samples, consistent with the training data, may be synthesised when the initial data samples are random noise. Secondly, we motivate the theory by showing that initial data samples synthesised via other methods may be improved via iterative application of a trained DAE to those initial samples.
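
The training objective described in the abstract can be sketched in a few lines of NumPy. This is only an illustration: `toy_reconstruct` is a hypothetical stand-in for an encoder and decoder (a fixed affine map followed by a sigmoid), and the noise level, array shapes and weights are assumptions chosen for the example, not values from the paper. The point it shows is that the loss compares the clean input with the reconstruction of the corrupted input.

```python
import numpy as np

def corrupt(x, sigma, rng):
    """Additive Gaussian corruption: x_tilde = x + N(0, sigma^2)."""
    return x + sigma * rng.standard_normal(x.shape)

def bce(x, r, eps=1e-7):
    """Binary cross-entropy between targets x in [0, 1] and outputs r in (0, 1),
    averaged over all elements. Outputs are clipped to avoid log(0)."""
    r = np.clip(r, eps, 1.0 - eps)
    return float(np.mean(-(x * np.log(r) + (1.0 - x) * np.log(1.0 - r))))

def mse(x, r):
    """Mean-squared error, the other common reconstruction loss."""
    return float(np.mean((x - r) ** 2))

def toy_reconstruct(x_tilde, w, b):
    """Hypothetical stand-in for encoder + decoder: affine map, then sigmoid,
    so every output lies strictly inside (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(x_tilde @ w + b)))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(8, 4))    # fake "images": 8 samples, 4 pixels in [0, 1]
x_tilde = corrupt(x, sigma=0.1, rng=rng)  # corrupted copy fed to the model

w = 4.0 * np.eye(4)                       # assumed, untrained weights
b = -2.0 * np.ones(4)
r = toy_reconstruct(x_tilde, w, b)

# The loss is taken between the CLEAN input x and the reconstruction of x_tilde.
loss_bce = bce(x, r)
loss_mse = mse(x, r)
```

In a real DAE the weights would be trained to reduce `loss_bce` (or `loss_mse`) over many corrupted batches; here they are fixed so that the snippet stays self-contained.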

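MOTIVATION AND OBJECTIVES

Motivate and extend the theory that relates DAEs trained with a reconstruction loss to the data-generating distribution.
Prove that a BCE-trained DAE approximates the gradient of the log-probability density log p(x) and thereby induces gradient ascent in data space.
Demonstrate practical sampling from random noise and the improvement of samples from generative autoencoders.
Apply the theory, via the denoising criterion, to improve samples from variational autoencoders (VAEs) and adversarial autoencoders (AAEs).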

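PROPOSED METHOD

Derive the reconstruction objective for a BCE-trained denoising autoencoder and the optimal reconstruction function under BCE, which yields a gradient in data space.
Prove that, in the limit, a BCE-trained DAE approximates the gradient of the log density, as a DAE trained with mean-squared error (MSE) does.
Train denoising variants of a VAE and an AAE (DVAE and DAAE) on the CelebA dataset, using additive Gaussian corruption.
Demonstrate iterative sampling with the trained reconstruction function, starting from noise and arriving at samples close to the data.
Show that iterative application improves samples generated by the DVAE and DAAE.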
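
The link between the BCE optimum and the log-density gradient can be sketched with a standard argument. The notation below (p, p_sigma, r, sigma) and the assumption of isotropic additive Gaussian corruption are choices made for this sketch; the paper's own derivation may be organised differently.

```latex
% Assumptions for this sketch: clean data x in [0,1]^d with density p(x),
% corrupted input \tilde{x} = x + \epsilon with \epsilon ~ N(0, \sigma^2 I),
% and a reconstruction r(\tilde{x}) taking values in (0,1)^d.
\begin{align*}
\mathcal{L}_{\mathrm{BCE}}(r)
  &= \mathbb{E}_{x,\tilde{x}}\Big[-\sum_{i=1}^{d}
     \big( x_i \log r_i(\tilde{x}) + (1 - x_i)\log(1 - r_i(\tilde{x})) \big)\Big] \\
% Stationarity in r_i(\tilde{x}) for each fixed \tilde{x}:
0 &= -\frac{\mathbb{E}[x_i \mid \tilde{x}]}{r_i(\tilde{x})}
     + \frac{1 - \mathbb{E}[x_i \mid \tilde{x}]}{1 - r_i(\tilde{x})}
  \quad\Longrightarrow\quad
  r_i^{*}(\tilde{x}) = \mathbb{E}[x_i \mid \tilde{x}] \\
% Density of the corrupted data and its gradient:
p_\sigma(\tilde{x}) &= \int p(x)\,\mathcal{N}(\tilde{x};\, x,\, \sigma^2 I)\,dx \\
\nabla_{\tilde{x}}\, p_\sigma(\tilde{x})
  &= \int p(x)\,\mathcal{N}(\tilde{x};\, x,\, \sigma^2 I)\,\frac{x - \tilde{x}}{\sigma^2}\,dx
   = p_\sigma(\tilde{x})\,\frac{\mathbb{E}[x \mid \tilde{x}] - \tilde{x}}{\sigma^2} \\
% Combining the two results:
r^{*}(\tilde{x}) - \tilde{x} &= \sigma^2\, \nabla_{\tilde{x}} \log p_\sigma(\tilde{x})
\end{align*}
```

Under these assumptions the BCE-optimal reconstruction is the conditional mean of the clean data, the same optimum that MSE gives, and its displacement from the input is a step of size sigma^2 along the gradient of the log density of the corrupted data. As sigma shrinks, p_sigma approaches p, which is the sense in which the reconstruction approximates a gradient step on log p(x).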

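EXPERIMENTAL RESULTS

Research questions

RQ1: Can a BCE-trained DAE approximate the gradient of the data-generating distribution with respect to the data?
RQ2: Does iterative application of a BCE-trained denoising autoencoder move samples towards high-probability regions of the data distribution?
RQ3: Can a BCE-trained DAE improve the quality of samples from generative models such as the DVAE and DAAE?
RQ4: Starting from random noise, can BCE-trained DAEs synthesise new samples consistent with the training data?
RQ5: Which practical guidelines or techniques (such as adding noise) help when sampling from such models?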

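Key findings

The reconstruction function obtained under the BCE loss corresponds, in the limit, to a gradient-ascent step towards higher data likelihood.
Iteratively applying a trained BCE-DAE, starting from random noise, produces new samples consistent with the training data.
The same sampling procedure improves initial samples drawn from DVAE and DAAE models.
Adding a small amount of noise between iterations smooths the data space and helps samples move towards higher-density regions.
The approach offers a way to improve samples without retraining, by using a trained BCE-DAE as a generative operator.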
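
The iterative sampling procedure, with a little noise added between applications, can be illustrated on a toy problem. The sketch below does not use a trained network: it assumes a one-dimensional data distribution with two equally likely values (0.2 and 0.8) and replaces the DAE with the analytically optimal denoiser for that distribution under Gaussian corruption. The names `optimal_denoiser` and `iterative_sample`, the noise levels and the step count are all hypothetical choices for the example.

```python
import numpy as np

MODES = np.array([0.2, 0.8])  # toy 1-D "data": two equally likely values in [0, 1]
SIGMA = 0.1                   # corruption noise std assumed at "training" time

def optimal_denoiser(x_tilde):
    """Posterior mean E[x | x_tilde] for the toy data under Gaussian corruption.
    Stands in for a trained DAE; it is the optimum of both BCE and MSE here."""
    # Unnormalised posterior weight of each mode for each corrupted value.
    w = np.exp(-((x_tilde[:, None] - MODES[None, :]) ** 2) / (2.0 * SIGMA ** 2))
    return (w * MODES).sum(axis=1) / w.sum(axis=1)

def iterative_sample(x0, n_steps, step_noise, rng):
    """Repeatedly perturb the current samples slightly, then denoise them."""
    x = x0.copy()
    for _ in range(n_steps):
        x = optimal_denoiser(x + step_noise * rng.standard_normal(x.shape))
    return x

def dist_to_nearest_mode(x):
    """Distance from each sample to the closest data value."""
    return np.abs(x[:, None] - MODES[None, :]).min(axis=1)

rng = np.random.default_rng(0)
x_init = rng.uniform(0.0, 1.0, size=1000)  # start from random noise
x_final = iterative_sample(x_init, n_steps=20, step_noise=0.02, rng=rng)

print("mean distance to nearest mode before:", dist_to_nearest_mode(x_init).mean())
print("mean distance to nearest mode after: ", dist_to_nearest_mode(x_final).mean())
```

In this toy setting the point midway between the two values is an unstable fixed point of the denoiser, so the small perturbation before each step helps samples that start near it drift towards one of the two values. With image data and a trained network, the same loop would call the network's reconstruction function in place of `optimal_denoiser`.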


This review was generated by AI and reviewed by a human editor.