QUICK REVIEW

[Paper Review] Deep Residual Auto-Encoders for Expectation Maximization-based Dictionary Learning.

Bahareh Tolooshams, S. Dey|arXiv (Cornell University)|Apr 18, 2019

Gaussian Processes and Bayesian Inference | Cited by 4

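One-Sentence Summary

This paper proposes the constrained recurrent sparse autoencoder (CRsAE), a deep residual autoencoder that builds the expectation-maximization (EM) principle into dictionary learning so that the dictionary and the ReLU bias (the regularization parameter) are trained together. The method learns Gabor-like filters in an image-denoising task, and in spike detection from neural recordings it identifies spike times 900x faster than algorithms based on convex optimization.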

ABSTRACT

We introduce a neural-network architecture, termed the constrained recurrent sparse autoencoder (CRsAE), that solves convolutional dictionary learning problems, thus establishing a link between dictionary learning and neural networks. Specifically, we leverage the interpretation of the alternating-minimization algorithm for dictionary learning as an approximate Expectation-Maximization algorithm to develop autoencoders that enable the simultaneous training of the dictionary and regularization parameter (ReLU bias). The forward pass of the encoder approximates the sufficient statistics of the E-step as the solution to a sparse coding problem, using an iterative proximal gradient algorithm called FISTA. The encoder can be interpreted either as a recurrent neural network or as a deep residual network, with two-sided ReLU non-linearities in both cases. The M-step is implemented via a two-stage back-propagation. The first stage relies on a linear decoder applied to the encoder and a norm-squared loss. It parallels the dictionary update step in dictionary learning. The second stage updates the regularization parameter by applying a loss function to the encoder that includes a prior on the parameter motivated by Bayesian statistics. We demonstrate in an image-denoising task that CRsAE learns Gabor-like filters, and that the EM-inspired approach for learning biases is superior to the conventional approach. In an application to recordings of electrical activity from the brain, we demonstrate that CRsAE learns realistic spike templates and speeds up the process of identifying spike times by 900x compared to algorithms based on convex optimization.
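
For orientation, the sparse coding problem and the FISTA iteration mentioned in the abstract are commonly written as below. This is the textbook form, not a transcription of the paper's equations: the symbols y (signal), H (dictionary), x (sparse code), λ (regularization parameter), L (an upper bound on the largest eigenvalue of HᵀH), w_t and s_t (FISTA's auxiliary point and momentum scalar, with s_1 = 1) are assumptions of this note, and a dense matrix H stands in for the convolutional dictionary.

```latex
\begin{aligned}
\hat{x} &= \arg\min_{x} \tfrac{1}{2}\lVert y - Hx \rVert_2^2 + \lambda \lVert x \rVert_1, \\
x_{t} &= \mathcal{S}_{\lambda/L}\!\left( w_{t} + \tfrac{1}{L} H^{\top}(y - H w_{t}) \right), \\
s_{t+1} &= \frac{1 + \sqrt{1 + 4 s_t^2}}{2}, \qquad
w_{t+1} = x_{t} + \frac{s_t - 1}{s_{t+1}} \,(x_{t} - x_{t-1}), \\
\mathcal{S}_{b}(z) &= \operatorname{ReLU}(z - b) - \operatorname{ReLU}(-z - b).
\end{aligned}
```

The last line shows why the regularization parameter can appear as a ReLU bias: soft-thresholding with threshold b = λ/L is a two-sided ReLU with bias b. The second line has the form "input plus a correction, then a nonlinearity", which is what allows the residual-network reading, while repeating the same H at every iteration gives the recurrent reading.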

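Research Motivation and Goals

Bridge the gap between dictionary learning and deep neural networks by recasting the alternating-minimization algorithm as an approximate expectation-maximization (EM) procedure.
Train the dictionary and the regularization parameter (ReLU bias) simultaneously within a single neural-network architecture.
Improve performance on sparse-representation tasks such as image denoising and neural signal processing.
Speed up the identification of neuronal spike times, which is typically computationally expensive.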

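Proposed Method

The encoder's forward pass approximates the E-step of EM: it solves a sparse coding problem with FISTA, an iterative proximal gradient algorithm.
The encoder can be read either as a deep residual network or as a recurrent neural network, with two-sided ReLU nonlinearities in both cases.
The M-step is implemented as a two-stage back-propagation. The first stage applies a linear decoder to the encoder output and uses a norm-squared loss to update the dictionary, paralleling the dictionary update step in dictionary learning.
The second stage updates the regularization parameter by applying to the encoder a loss function that includes a Bayesian-motivated prior on that parameter.
Both updates are carried out by back-propagation in the same network, so the dictionary and the bias are trained together rather than through a separate tuning procedure for the bias.
Training therefore relies on two losses: a reconstruction error that drives the dictionary update, and a loss with a prior on the bias parameter that drives the regularization update.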
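
The encoder described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a dense dictionary matrix instead of convolutions, and the names `two_sided_relu`, `fista_encoder`, `lasso_objective`, and `dictionary_step` are hypothetical. In particular, `dictionary_step` is the classical dictionary update with the code held fixed, swapped in for the paper's back-propagation through the autoencoder, and the second-stage bias update is omitted because its loss is not specified in this review.

```python
import numpy as np

def two_sided_relu(z, b):
    """Soft-thresholding written as a difference of two ReLUs with bias b."""
    return np.maximum(z - b, 0.0) - np.maximum(-z - b, 0.0)

def lasso_objective(y, H, x, lam):
    """0.5 * ||y - H x||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - H @ x) ** 2) + lam * np.sum(np.abs(x))

def fista_encoder(y, H, lam, num_iters=200):
    """Unrolled FISTA: each loop iteration plays the role of one encoder layer."""
    L = np.linalg.norm(H, 2) ** 2        # squared spectral norm = Lipschitz constant
    x_prev = np.zeros(H.shape[1])
    w = x_prev.copy()                    # auxiliary (momentum) point
    t = 1.0
    for _ in range(num_iters):
        # Residual-style update: w plus a correction, then the nonlinearity.
        z = w + H.T @ (y - H @ w) / L
        x = two_sided_relu(z, lam / L)   # the bias lam / L is the threshold
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = x + ((t - 1.0) / t_next) * (x - x_prev)
        x_prev, t = x, t_next
    return x_prev

def dictionary_step(y, H, x, lr=0.1):
    """One gradient step on 0.5 * ||y - H x||^2 in H (x fixed), then unit-norm columns."""
    grad = -np.outer(y - H @ x, x)
    H_new = H - lr * grad
    return H_new / np.linalg.norm(H_new, axis=0, keepdims=True)

# Toy usage: a synthetic signal built from three dictionary atoms plus noise.
rng = np.random.default_rng(0)
H = rng.standard_normal((20, 50))
H /= np.linalg.norm(H, axis=0, keepdims=True)
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = H @ x_true + 0.01 * rng.standard_normal(20)

x_hat = fista_encoder(y, H, lam=0.1)     # E-step-like pass: sparse code
H_new = dictionary_step(y, H, x_hat)     # M-step-like pass: dictionary update
```

Fixing `num_iters` is what turns the iterative solver into a network of fixed depth. Sharing `H` and `lam` across iterations corresponds to the recurrent reading, and the `w + correction` form of each iteration corresponds to the residual reading.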

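Experimental Results

Research Questions

RQ1: Can the alternating-minimization algorithm for dictionary learning be reinterpreted as an approximate EM algorithm, so that it can be trained as a neural network?
RQ2: Can a deep residual autoencoder jointly optimize the dictionary and the ReLU bias (the regularization parameter) in a differentiable way?
RQ3: Does the EM-inspired approach to learning the regularization parameter outperform the conventional approach on sparse coding tasks?
RQ4: Compared with methods based on convex optimization, can the proposed method substantially speed up spike detection in electrophysiological recordings?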

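Key Findings

In the image-denoising task, CRsAE learns Gabor-like filters, which suggests it learns useful features.
The EM-inspired approach to learning the regularization parameter (ReLU bias) outperforms the conventional approach in image denoising.
In spike detection, CRsAE identifies spike times 900x faster than algorithms based on convex optimization.
The model learns realistic spike templates from recordings of electrical activity from the brain.
The two-stage back-propagation allows the dictionary and the bias parameter to be optimized together.
The architecture can be read either as a residual network or as a recurrent network, which gives some freedom in how it is implemented.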

This review was generated by AI and checked by a human editor.