QUICK REVIEW

[Paper Review] Deep Residual Auto-Encoders for Expectation Maximization-based Dictionary Learning

Bahareh Tolooshams, S. Dey|arXiv (Cornell University)|Apr 18, 2019

Gaussian Processes and Bayesian Inference|Citations: 4

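One-Line Summary

This paper proposes the constrained recurrent sparse autoencoder (CRsAE), a deep residual autoencoder that brings the principles of expectation maximization (EM) into dictionary learning so that the dictionary and the ReLU bias (the regularization parameter) can be optimized jointly. In image denoising, the EM-inspired way of learning the biases outperforms the conventional approach, and in neural spike detection the method identifies spike times 900x faster than algorithms based on convex optimization.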

ABSTRACT

We introduce a neural-network architecture, termed the constrained recurrent sparse autoencoder (CRsAE), that solves convolutional dictionary learning problems, thus establishing a link between dictionary learning and neural networks. Specifically, we leverage the interpretation of the alternating-minimization algorithm for dictionary learning as an approximate Expectation-Maximization algorithm to develop autoencoders that enable the simultaneous training of the dictionary and regularization parameter (ReLU bias). The forward pass of the encoder approximates the sufficient statistics of the E-step as the solution to a sparse coding problem, using an iterative proximal gradient algorithm called FISTA. The encoder can be interpreted either as a recurrent neural network or as a deep residual network, with two-sided ReLU non-linearities in both cases. The M-step is implemented via a two-stage back-propagation. The first stage relies on a linear decoder applied to the encoder and a norm-squared loss. It parallels the dictionary update step in dictionary learning. The second stage updates the regularization parameter by applying a loss function to the encoder that includes a prior on the parameter motivated by Bayesian statistics. We demonstrate in an image-denoising task that CRsAE learns Gabor-like filters, and that the EM-inspired approach for learning biases is superior to the conventional approach. In an application to recordings of electrical activity from the brain, we demonstrate that CRsAE learns realistic spike templates and speeds up the process of identifying spike times by 900x compared to algorithms based on convex optimization.
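
For reference, the sparse coding problem that the encoder approximately solves can be written in the standard lasso form below, together with one proximal gradient step and the identity that turns soft-thresholding into a two-sided ReLU. The symbols $y$ (signal), $A$ (dictionary), $x$ (sparse code), $w^{(k)}$ (FISTA's extrapolated point), $\lambda$ (regularization parameter), and $L$ (Lipschitz constant used as the step size) are notation assumed here rather than taken from the abstract, and the convolutional structure of the dictionary is left implicit.

```latex
\hat{x} = \arg\min_{x} \; \tfrac{1}{2}\,\lVert y - A x \rVert_2^2 + \lambda \lVert x \rVert_1

x^{(k+1)} = S_{\lambda/L}\!\left( w^{(k)} + \tfrac{1}{L} A^{\top}\bigl(y - A w^{(k)}\bigr) \right)

S_b(z) = \operatorname{ReLU}(z - b) - \operatorname{ReLU}(-z - b)
```

The last line shows why the regularization parameter appears as a ReLU bias: the threshold $b = \lambda / L$ enters each layer only as the offset inside the two ReLUs.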

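Motivation and Objectives

Unify dictionary learning and deep neural networks by reformulating the alternating-minimization algorithm as an approximate expectation-maximization (EM) procedure.
Make it possible to train both the dictionary and the regularization parameter (ReLU bias) end to end within a single neural-network architecture.
Improve performance on sparse representation tasks, including image denoising and neural signal processing.
Speed up the identification of neural spike times, which is usually considered very expensive computationally.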

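Proposed Method

The encoder implements an iterative proximal gradient method based on FISTA to solve a sparse coding problem, which approximates the E-step of EM.
The encoder can be interpreted either as a deep residual network or as a recurrent neural network, with two-sided ReLU non-linearities in both cases.
The M-step is implemented as a two-stage back-propagation: the first stage updates the dictionary using a linear decoder and a norm-squared loss.
The second stage updates the regularization parameter using a loss function, applied to the encoder, that includes a prior on the parameter motivated by Bayesian statistics.
Because both updates are carried out by back-propagation within one network, the dictionary and the bias are trained together rather than in separate optimization procedures.
Training is end to end: the reconstruction error drives the dictionary update, and the loss that includes the prior on the bias parameter drives the regularization update.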
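
The encoder and the first-stage dictionary update described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a dense dictionary matrix `A` instead of convolutions, and the function names (`two_sided_relu`, `fista_encoder`, `dictionary_step`) are hypothetical. In particular, `dictionary_step` holds the codes fixed, as in classical alternating minimization, whereas CRsAE back-propagates through the encoder; the second-stage bias update is omitted because its loss is not specified in this review.

```python
import numpy as np


def two_sided_relu(z, b):
    """Soft-thresholding written as a difference of two ReLUs with bias b."""
    return np.maximum(z - b, 0.0) - np.maximum(-z - b, 0.0)


def fista_encoder(y, A, lam, num_iters=100):
    """Approximate argmin_x 0.5*||y - A x||^2 + lam*||x||_1 with FISTA.

    Each iteration plays the role of one layer of the unrolled encoder.
    """
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x_prev = np.zeros(A.shape[1])
    w = x_prev.copy()              # extrapolated (momentum) point
    t = 1.0
    for _ in range(num_iters):
        # Gradient step on the quadratic term, then the shrinkage step.
        x = two_sided_relu(w + A.T @ (y - A @ w) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = x + ((t - 1.0) / t_next) * (x - x_prev)
        x_prev, t = x, t_next
    return x_prev


def decode(x, A):
    """Linear decoder: reconstruct the signal from the sparse code."""
    return A @ x


def dictionary_step(A, y, x, step):
    """One gradient step on 0.5*||y - A x||^2 with respect to A.

    Assumption: the code x is treated as a constant here; the paper's
    method differentiates through the encoder instead.
    """
    residual = y - decode(x, A)
    return A + step * np.outer(residual, x)
```

With an identity dictionary the encoder reduces to a single soft-thresholding of the input, which makes a convenient sanity check: `fista_encoder(np.array([3.0, -0.5, 1.5]), np.eye(3), 1.0)` gives `[2.0, 0.0, 0.5]`.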

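Experimental Results

Research Questions

RQ1: Can the alternating-minimization algorithm for dictionary learning be reinterpreted as an approximate EM algorithm that enables end-to-end neural-network training?
RQ2: Can a deep residual autoencoder architecture optimize the dictionary and the ReLU bias (regularization parameter) simultaneously in a differentiable manner?
RQ3: Does EM-inspired learning of the bias parameter outperform conventional fixed or manually tuned regularization on sparse coding tasks?
RQ4: Can the proposed method substantially speed up neural spike detection in electrophysiological recordings compared with methods based on convex optimization?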

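Key Findings

In the image-denoising task, CRsAE learns Gabor-like filters, indicating that it extracts meaningful features.
Learning the regularization parameter (ReLU bias) with the EM-inspired approach improved image-denoising performance over the conventional approach.
In neural spike detection, CRsAE identified spike times 900x faster than algorithms based on convex optimization.
From recordings of electrical activity in the brain, CRsAE learned realistic spike templates, which supports its biological plausibility.
The two-stage back-propagation mechanism enabled stable and effective joint optimization of the dictionary and the bias parameter.
The encoder admits a dual interpretation as a residual network or a recurrent network, which gives flexibility in implementation without affecting training performance.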

This review was generated by AI and checked by a human editor.