QUICK REVIEW

[Paper Review] Hopfield Networks is All You Need

Hubert Ramsauer, Bernhard Schäfl|arXiv (Cornell University)|Jul 16, 2020

Cognitive Science and Education Research|References 106|Citations 72

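One-sentence summary

This paper introduces a modern Hopfield network with continuous states that serves as a differentiable memory layer. Its update rule is equivalent to the transformer attention mechanism, it stores exponentially many patterns and retrieves them with one update, and it proves effective on multiple instance learning (MIL), immune repertoire classification, small datasets, and drug design.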

ABSTRACT

We introduce a modern Hopfield network with continuous states and a corresponding update rule. The new Hopfield network can store exponentially (with the dimension of the associative space) many patterns, retrieves the pattern with one update, and has exponentially small retrieval errors. It has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. The new update rule is equivalent to the attention mechanism used in transformers. This equivalence enables a characterization of the heads of transformer models. These heads perform in the first layers preferably global averaging and in higher layers partial averaging via metastable states. The new modern Hopfield network can be integrated into deep learning architectures as layers to allow the storage of and access to raw input data, intermediate results, or learned prototypes. These Hopfield layers enable new ways of deep learning, beyond fully-connected, convolutional, or recurrent networks, and provide pooling, memory, association, and attention mechanisms. We demonstrate the broad applicability of the Hopfield layers across various domains. Hopfield layers improved state-of-the-art on three out of four considered multiple instance learning problems as well as on immune repertoire classification with several hundreds of thousands of instances. On the UCI benchmark collections of small classification tasks, where deep learning methods typically struggle, Hopfield layers yielded a new state-of-the-art when compared to different machine learning methods. Finally, Hopfield layers achieved state-of-the-art on two drug design datasets. The implementation is available at: https://github.com/ml-jku/hopfield-layers

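Motivation and objectives

Motivate memory-augmented architectures as an alternative to RNNs, improving how deep networks store and retrieve information.
Propose a differentiable, continuous-state Hopfield network with a new energy function that retrieves a pattern in one update.
Show that Hopfield layers can be integrated into deep architectures as pooling, memory, or attention mechanisms.
Demonstrate broad applicability across MIL, small classification tasks, immune repertoire classification, and drug design.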

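Proposed method

Define a new energy function E for continuous-state Hopfield networks that combines the negative log-sum-exp (-lse) with a quadratic term that bounds the norm of the state.
Introduce the update rule xi_new = X softmax(beta X^T xi), which converges globally to stationary points of E and typically retrieves a pattern in one update.
Prove convergence and exponentially small retrieval error in terms of pattern separation and network parameters.
Show that the update rule is equivalent to the key-value attention (self-attention) mechanism used in transformers.
Describe three layer types for integration into deep networks: Hopfield, HopfieldPooling, and HopfieldLayer.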
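
The update rule and energy described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the helper names (`softmax`, `dot`, `hopfield_update`, `energy`), the toy patterns, and the choice beta = 8 are assumptions made for the example, and the energy is written only up to additive constants.

```python
import math

def softmax(scores, beta=1.0):
    """Numerically stable softmax of beta * scores."""
    m = max(scores)
    exps = [math.exp(beta * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def hopfield_update(patterns, xi, beta=1.0):
    """One step of xi_new = X softmax(beta X^T xi).

    `patterns` holds the N stored patterns (the columns of X), each of length d.
    """
    weights = softmax([dot(p, xi) for p in patterns], beta)
    return [sum(w * p[k] for w, p in zip(weights, patterns))
            for k in range(len(xi))]

def energy(patterns, xi, beta=1.0):
    """E(xi) = -lse(beta, X^T xi) + 0.5 * xi^T xi, additive constants omitted."""
    scores = [beta * dot(p, xi) for p in patterns]
    m = max(scores)
    lse = (m + math.log(sum(math.exp(s - m) for s in scores))) / beta
    return -lse + 0.5 * dot(xi, xi)

# Toy data (illustrative only): three well-separated patterns in 4 dimensions.
patterns = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
query = [0.9, 0.2, -0.1, 0.1]  # noisy version of the first pattern

retrieved = hopfield_update(patterns, query, beta=8.0)
print("retrieved:", [round(v, 4) for v in retrieved])
print("energy before:", energy(patterns, query, beta=8.0))
print("energy after: ", energy(patterns, retrieved, beta=8.0))
```

With these well-separated toy patterns, a single update moves the noisy query close to the first stored pattern and lowers the energy. A smaller beta would blend the patterns more strongly, which is the averaging behaviour the abstract describes for metastable states.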

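Experimental results

Research questions

RQ1 Can a modern Hopfield network with continuous states store exponentially many patterns in a d-dimensional space and retrieve them accurately with a single update?
RQ2 Can Hopfield networks be integrated into deep architectures as differentiable layers that provide memory, pooling, and attention?
RQ3 Do Hopfield-based layers improve performance on MIL, immune repertoire classification, small UCI tasks, and drug design datasets?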

Key findings

Method	Tiger	Fox	Elephant	UCSB
Hopfield (ours)	91.3±0.5	64.05±0.4	94.9±0.3	89.5±0.8
Path encoding (Küçükaşcı & Baydoğan 2018)	91.0±1.0	71.2±1.4	94.4±0.7	88.0±2.2
MInD (Cheplygina et al., 2016)	85.3±1.1	70.4±1.6	93.6±0.9	83.1±2.7
MILES (Chen et al., 2006)	87.2±1.7	73.8±1.6	92.7±0.7	83.3±2.6
APR (Dietterich et al., 1997)	77.8±0.7	54.1±0.9	55.0±1.0	—
Citation-kNN (Wang, 2000)	85.5±0.9	63.5±1.5	89.6±0.9	70.6±3.2

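Storage capacity is exponential in the dimension d: under certain conditions, a provable lower bound N ≥ sqrt(p) c^{(d-1)/4} holds for the number N of stored patterns.
For well-separated patterns, a single retrieval update typically lands within ε of the fixed point, and the retrieval error is exponentially small in the separation Δ_i.
Hopfield layers achieve state-of-the-art results on several MIL benchmarks, covering immune repertoire classification and image-based MIL datasets.
On the MIL benchmarks in the table above, HopfieldPooling has the best AUC on Tiger, Elephant, and UCSB Breast Cancer, and trails several baselines on Fox.
HopfieldLayer can emulate SVM, k-NN, and LVQ within a single layer, which allows flexible classification.
Transformer attention corresponds to the Hopfield update, linking modern memory networks with self-attention.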
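
The correspondence between attention and the Hopfield update can be checked numerically. The sketch below is a simplified illustration under stated assumptions: the projection matrices of a real attention head are taken to be the identity, the stored patterns serve as both keys and values, and beta is set to 1/sqrt(d_k). The helper `hopfield_pooling` is a hypothetical name for this example and is not the API of the hopfield-layers repository.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def hopfield_update(stored, xi, beta):
    """xi_new = X softmax(beta X^T xi), with stored patterns as rows of `stored`."""
    weights = softmax([beta * sum(a * b for a, b in zip(p, xi)) for p in stored])
    return [sum(w * p[k] for w, p in zip(weights, stored))
            for k in range(len(xi))]

def hopfield_pooling(stored, learned_query, beta):
    """Pooling as retrieval with one fixed query (trainable in a real layer)."""
    return hopfield_update(stored, learned_query, beta)

# Toy data (illustrative only).
stored = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
queries = [
    [0.9, 0.2, -0.1, 0.1],
    [0.0, 0.1, 1.1, -0.2],
]
beta = 1.0 / math.sqrt(len(stored[0]))

attn_out = attention(queries, stored, stored)
hopfield_out = [hopfield_update(stored, q, beta) for q in queries]
max_diff = max(abs(a - h)
               for a_row, h_row in zip(attn_out, hopfield_out)
               for a, h in zip(a_row, h_row))
print("max difference between attention and Hopfield update:", max_diff)

# A zero query gives uniform weights, so pooling reduces to the plain mean.
pooled = hopfield_pooling(stored, [0.0, 0.0, 0.0, 0.0], beta)
print("pooled:", pooled)
```

Each row of the attention output matches one Hopfield update applied to the corresponding query. Pooling is the special case in which the query does not depend on the input, so a set of any size is summarized by a single vector.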


This review was generated by AI and checked by a human editor.