QUICK REVIEW

[Paper Review] Biologically inspired alternatives to backpropagation through time for learning in recurrent neural nets

Guillaume Bellec, Franz Scherr|arXiv (Cornell University)|Jan 25, 2019
Advanced Memory and Neural Computing|References: 52|Citations: 77
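One-Sentence Summary

This paper proposes e-prop, a method that uses eligibility traces and local learning signals as an online, biologically plausible alternative to backpropagation through time for recurrent networks. It presents three variants (e-prop 1–3) and applies them to spiking networks and LSTM networks.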

ABSTRACT

The way how recurrently connected networks of spiking neurons in the brain acquire powerful information processing capabilities through learning has remained a mystery. This lack of understanding is linked to a lack of learning algorithms for recurrent networks of spiking neurons (RSNNs) that are both functionally powerful and can be implemented by known biological mechanisms. Since RSNNs are simultaneously a primary target for implementations of brain-inspired circuits in neuromorphic hardware, this lack of algorithmic insight also hinders technological progress in that area. The gold standard for learning in recurrent neural networks in machine learning is back-propagation through time (BPTT), which implements stochastic gradient descent with regard to a given loss function. But BPTT is unrealistic from a biological perspective, since it requires a transmission of error signals backwards in time and in space, i.e., from post- to presynaptic neurons. We show that an online merging of locally available information during a computation with suitable top-down learning signals in real-time provides highly capable approximations to BPTT. For tasks where information on errors arises only late during a network computation, we enrich locally available information through feedforward eligibility traces of synapses that can easily be computed in an online manner. The resulting new generation of learning algorithms for recurrent neural networks provides a new understanding of network learning in the brain that can be tested experimentally. In addition, these algorithms provide efficient methods for on-chip training of RSNNs in neuromorphic hardware.

Research Motivation and Objectives

  • Motivate the need for learning algorithms for recurrent networks of spiking neurons that are powerful yet biologically plausible.
  • Propose a factorization of the BPTT gradient into eligibility traces and online learning signals (e-prop).
  • Develop and analyze three variants (e-prop 1, 2, 3) to approximate gradient descent without backward-in-time error propagation.
  • Demonstrate online, task-based learning capabilities on RSNNs and compare to BPTT and other learning rules.
  • Discuss implications for neuroscience and neuromorphic hardware implementation of learning in RSNNs.

Proposed Method

  • Derive a factorization of the BPTT gradient: dE/dθ_{ji} = sum_t L_j^t e_{ji}^t (Equation 1).
  • Define eligibility traces e_{ji}^t via a forward-time update (Equations 2 and 3) using the local dynamics D_j^{t-1} and the eligibility vector ε_{ji}^t.
  • Introduce online learning signals L_j^t as approximations to ideal gradients (L_j^t ≈ dE/dz_j^t, with online variants).
  • Develop e-prop 1: learning signals from broadcast alignment using instantaneous output error; provide a three-factor learning rule with local terms (Equation 5).
  • Develop e-prop 2: Learning-to-Learn (L2L) using error modules to generate task-specific learning signals while allowing the RSNN to adapt its weights; outer loop trains error modules.
  • Develop e-prop 3: integrate synthetic gradients with eligibility traces to improve performance beyond some BPTT baselines; show enhancements on recurrent networks.
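
The factorization and the forward-time trace update listed above can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions that are not taken from the paper: leaky units with a tanh output stand in for spiking neurons, so D_j^{t-1} reduces to a scalar decay `alpha` and ∂z/∂s becomes `1 - z**2`; the loss is a squared error on a linear readout; and the function name `eprop1_gradients` and all parameter names are hypothetical. Passing a fixed random matrix `B` gives a broadcast-alignment style learning signal in the spirit of e-prop 1, while passing `W_out.T` gives symmetric feedback.

```python
import numpy as np

def eprop1_gradients(x, y_target, W_in, W_rec, W_out, B, alpha=0.9):
    """Accumulate sum_t L_j^t * e_ji^t online for a toy leaky tanh network.

    x: (T, n_in) inputs, y_target: (T, n_out) targets.
    B: (n_rec, n_out) feedback weights that turn output errors into learning signals.
    Returns (loss, dW_in, dW_rec). All names here are illustrative assumptions.
    """
    n_rec = W_in.shape[0]
    s = np.zeros(n_rec)              # hidden state s_j
    z = np.zeros(n_rec)              # unit output z_j
    eps_in = np.zeros_like(W_in)     # eligibility vectors for input weights
    eps_rec = np.zeros_like(W_rec)   # eligibility vectors for recurrent weights
    dW_in = np.zeros_like(W_in)
    dW_rec = np.zeros_like(W_rec)
    loss = 0.0
    for t in range(x.shape[0]):
        # Forward-time update: eps^t = D^{t-1} * eps^{t-1} + ds^t/dtheta, with D = alpha.
        eps_in = alpha * eps_in + x[t][None, :]
        eps_rec = alpha * eps_rec + z[None, :]   # z still holds z^{t-1} here
        s = alpha * s + W_rec @ z + W_in @ x[t]
        z = np.tanh(s)
        psi = 1.0 - z ** 2                       # dz/ds for tanh
        e_in = psi[:, None] * eps_in             # eligibility traces e_ji^t
        e_rec = psi[:, None] * eps_rec
        err = W_out @ z - y_target[t]            # instantaneous output error
        loss += 0.5 * np.sum(err ** 2)
        L = B @ err                              # online learning signal L_j^t
        dW_in += L[:, None] * e_in
        dW_rec += L[:, None] * e_rec
    return loss, dW_in, dW_rec

# Usage with fixed random feedback weights (made-up sizes and data).
rng = np.random.default_rng(0)
T, n_in, n_rec, n_out = 20, 3, 4, 2
x = rng.normal(size=(T, n_in))
y_target = rng.normal(size=(T, n_out))
W_in = 0.5 * rng.normal(size=(n_rec, n_in))
W_rec = 0.1 * rng.normal(size=(n_rec, n_rec))
W_out = rng.normal(size=(n_out, n_rec))
B = rng.normal(size=(n_rec, n_out))
loss, dW_in, dW_rec = eprop1_gradients(x, y_target, W_in, W_rec, W_out, B)
W_in -= 0.01 * dW_in     # plain gradient-descent style step
W_rec -= 0.01 * dW_rec
```

With `W_rec` set to zero and `B = W_out.T`, the accumulated sums coincide with the exact gradient of this toy loss, which makes a convenient sanity check against finite differences. With nonzero recurrent weights, the sums ignore how a unit's output influences the loss through other units at later times, so they are only an approximation.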

Experimental Results

Research Questions

  • RQ1: Can online, locally computable learning signals combined with eligibility traces approximate the performance of backpropagation through time for RSNNs?
  • RQ2: Do biologically plausible approximations (e-prop variants) enable effective learning on tasks requiring temporal credit assignment (e.g., pattern generation, store-recall, speech tasks) compared to BPTT?
  • RQ3: How do error-modulating mechanisms (broadcast alignment, error modules, synthetic gradients) influence learning power and biological plausibility?
  • RQ4: Can these methods be extended to different network models (LIF, LSNN, LSTM) and still be computed online?
  • RQ5: What implications do e-prop methods have for on-chip training in neuromorphic hardware?

Key Findings

  • e-prop can closely approximate BPTT by online merging of eligibility traces with learning signals, enabling real-time learning without backpropagated errors.
  • e-prop 1, which uses broadcast-alignment-like learning signals, achieves effective credit assignment in RSNNs and LSNNs on pattern generation and store-recall tasks, and also supports speech recognition on the TIMIT dataset.
  • On pattern generation (three-dimensional target, 1 s), e-prop 1 reaches a mean squared error of around 0.01 in a representative run; full BPTT can achieve lower error, but e-prop 1 remains effective.
  • e-prop 1 solves the store-recall task with LSNNs, reaching a misclassification rate below 5% (averaged over 50 iterations); BPTT converges comparably or slightly faster.
  • e-prop 2 via L2L and e-prop 3 with synthetic gradients extend learning power; these approaches can enable one-shot learning and improve learning in RSNNs beyond certain baselines.
  • The framework links to biologically observed learning signals (ERN, dopaminergic modulation, etc.) and yields three-factor plasticity rules consistent with experimental data, while also enabling on-chip learning for neuromorphic hardware.
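
The three-factor form mentioned in the last finding can be illustrated for a single leaky integrate-and-fire neuron updated online, one time step at a time. This is a hedged sketch rather than the paper's Equation 5: the triangular pseudo-derivative, the reset by subtraction, the parameter values, and the names `pseudo_derivative` and `lif_three_factor_step` are all assumptions made for illustration, and the learning signal is simply supplied by the caller.

```python
import numpy as np

def pseudo_derivative(v, v_th=1.0, gamma=0.3):
    """Triangular surrogate for d(spike)/d(voltage); shape and gamma are assumptions."""
    return gamma * np.maximum(0.0, 1.0 - np.abs((v - v_th) / v_th))

def lif_three_factor_step(w, v, trace, pre_spikes, learning_signal,
                          alpha=0.9, v_th=1.0, lr=1e-3):
    """One online step for a single LIF neuron with input weight vector w."""
    trace = alpha * trace + pre_spikes      # factor 1: low-pass filtered presynaptic spikes
    v = alpha * v + w @ pre_spikes          # leaky integration of the input
    psi = pseudo_derivative(v, v_th)        # factor 2: local postsynaptic term
    spike = float(v >= v_th)
    v = v - spike * v_th                    # reset by subtraction after a spike
    # factor 3: the top-down learning signal gates the local product psi * trace
    w = w - lr * learning_signal * psi * trace
    return w, v, trace, spike

# Usage: two input synapses, both presynaptic neurons spike once.
w, v, trace = np.array([0.5, 0.5]), 0.0, np.zeros(2)
w, v, trace, spike = lif_three_factor_step(
    w, v, trace, pre_spikes=np.array([1.0, 1.0]), learning_signal=1.0)
```

Every quantity in the update is available at the synapse at the current time step, apart from the scalar learning signal, which is why no error has to be propagated backwards in time.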


This review was generated by AI and checked by a human editor.