QUICK REVIEW

[Paper Review] Training End-to-End Analog Neural Networks with Equilibrium Propagation

Jack D. Kendall, Ross D. Pantone | arXiv (Cornell University) | Jun 2, 2020

Advanced Memory and Neural Computing | References: 49 | Cited by: 39

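One-Sentence Summary

This paper proposes end-to-end analog neural networks trained with Equilibrium Propagation (EqProp), in which the weights are programmable resistors; it shows that SGD-compatible gradient updates can be computed using only the voltage drops across those resistors, and reports results on MNIST with a hidden layer of 100 neurons.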

ABSTRACT

We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent. In these analog neural networks, the weights to be adjusted are implemented by the conductances of programmable resistive devices such as memristors [Chua, 1971], and the nonlinear transfer functions (or 'activation functions') are implemented by nonlinear components such as diodes. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits. This property enables us to train them using the Equilibrium Propagation framework [Scellier and Bengio, 2017]. Our update rule for each conductance, which is local and relies solely on the voltage drop across the corresponding resistor, is shown to compute the gradient of the loss function. Our numerical simulations, which use the SPICE-based Spectre simulation framework to simulate the dynamics of electrical circuits, demonstrate training on the MNIST classification task, performing comparably or better than equivalent-size software-based neural networks. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.

Research Motivation and Goals

Motivate a non-von Neumann hardware paradigm where learning happens at the synaptic location using analog resistive devices.
Show that nonlinear resistive networks are energy-based models (EBMs) enabling EqProp-based training.
Derive a local conductance update rule that can be computed from the voltage drop across each resistor.
Propose a deep analog network architecture with crossbar synapses and nonlinear neuron elements.
Demonstrate feasibility via SPICE-based MNIST experiments and compare to software EqProp models.

Proposed Method

Model nonlinear resistive networks as EBMs with an energy function arising from Kirchhoff’s laws.
Derive Theorem 1: the gradient of the loss w.r.t. a conductance can be estimated from the difference between the squared voltage drops across that resistor in the nudged and free phases, in the limit β → 0.
Use Equilibrium Propagation (free phase and nudged phase) to compute SGD-compatible conductance updates using only local voltage measurements.
Implement neurons with diodes to provide sigmoidal nonlinear transfer functions.
Design deep analog network architecture with programmable resistors as synapses and bidirectional amplifiers to propagate signals.
Encode loss gradients at output nodes via current sources in the nudged phase.
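The gradient estimate in Theorem 1 can be written out as a short derivation. This is a sketch rather than the paper's exact statement: it assumes linear resistors contribute half their dissipated power to the energy E, that the nonlinear elements add energy terms that do not depend on any conductance, and that the nudged phase minimizes E + βL. The learning rate symbol α is an assumption introduced here, and the paper's normalization may differ by a constant factor.

```latex
% Energy contribution of the linear resistors (assumed normalization):
E(V) = \frac{1}{2} \sum_{(i,j)} g_{ij} \, (V_i - V_j)^2 + (\text{terms independent of } g_{ij})

% Free phase: V^0 minimizes E.  Nudged phase: V^\beta minimizes E + \beta L.
% Equilibrium Propagation gives
\frac{\partial L}{\partial g_{ij}}
  = \lim_{\beta \to 0} \frac{1}{\beta}
    \left[ \frac{\partial E}{\partial g_{ij}}(V^\beta)
         - \frac{\partial E}{\partial g_{ij}}(V^0) \right]
  = \lim_{\beta \to 0} \frac{1}{2\beta}
    \left[ \left(\Delta V_{ij}^{\beta}\right)^2
         - \left(\Delta V_{ij}^{0}\right)^2 \right],
\qquad \Delta V_{ij} = V_i - V_j .

% Resulting local update with learning rate \alpha (assumed symbol):
\Delta g_{ij} = - \frac{\alpha}{2\beta}
    \left[ \left(\Delta V_{ij}^{\beta}\right)^2
         - \left(\Delta V_{ij}^{0}\right)^2 \right]
```

Every quantity on the right-hand side is a voltage drop across the resistor being updated, which is what makes the rule local.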

Experimental Results

Research Questions

RQ1: Can end-to-end analog neural networks be trained with SGD using only local information on conductances?
RQ2: Do nonlinear resistive networks admit an energy-based formulation enabling Equilibrium Propagation?
RQ3: What is the explicit conductance update rule derived from EqProp in this analog setting?
RQ4: How does a deep analog architecture perform on standard tasks compared to software EqProp networks?
RQ5: What are the practical considerations for implementing on hardware (memristors, diodes, amplifiers) for on-chip learning?

Key Findings

Nonlinear resistive networks are energy-based models with an energy function arising from Kirchhoff’s laws.
The gradient of the loss w.r.t. a conductance can be estimated from the difference between the squared voltage drops in the nudged and free phases, in the limit β → 0.
A deep analog network architecture using crossbar resistor arrays and diode-based nonlinearities can be trained end-to-end with EqProp.
SPICE-based simulations on MNIST with 100 hidden neurons achieve 3.43% test error after 10 epochs, outperforming a comparable logistic regression baseline.
Compared to PyTorch EqProp implementations with equivalent hidden sizes, the SPICE-based network shows competitive or better performance under positive-weight constraints.
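The squared-voltage-drop gradient estimate listed above can be checked numerically on a toy circuit. The following is a minimal sketch, not the paper's Spectre setup: it assumes a purely linear resistive network (no diodes or amplifiers), a squared-error loss on a single output node, and an energy equal to half the dissipated power. The topology, the node numbering, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy circuit: nodes 0 and 1 are clamped inputs,
# nodes 2 and 3 are hidden, node 4 is the output.
EDGES = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4), (2, 3)]
N_NODES = 5
CLAMPED = [0, 1]
FREE = [2, 3, 4]
OUT = 4


def solve(g, v_in, beta=0.0, target=0.0):
    """Node voltages minimizing E + beta * L (Kirchhoff's current law).

    beta = 0 is the free phase. For beta > 0 a current
    -beta * (V_out - target) is injected at the output node.
    """
    G = np.zeros((N_NODES, N_NODES))  # conductance (Laplacian) matrix
    for (i, j), gij in zip(EDGES, g):
        G[i, i] += gij
        G[j, j] += gij
        G[i, j] -= gij
        G[j, i] -= gij
    A = G[np.ix_(FREE, FREE)].copy()
    b = -G[np.ix_(FREE, CLAMPED)] @ v_in
    k = FREE.index(OUT)
    A[k, k] += beta            # nudging term from L = 0.5 * (V_out - target)^2
    b[k] += beta * target
    v = np.zeros(N_NODES)
    v[CLAMPED] = v_in
    v[FREE] = np.linalg.solve(A, b)
    return v


def drops(v):
    """Voltage drop across every resistor."""
    return np.array([v[i] - v[j] for i, j in EDGES])


def loss(v, target):
    return 0.5 * (v[OUT] - target) ** 2


def eqprop_grad(g, v_in, target, beta=1e-5):
    """Local estimate: difference of squared drops, nudged minus free."""
    dv_free = drops(solve(g, v_in))
    dv_nudged = drops(solve(g, v_in, beta, target))
    return (dv_nudged ** 2 - dv_free ** 2) / (2.0 * beta)


def finite_diff_grad(g, v_in, target, eps=1e-6):
    """Reference gradient of the free-phase loss by central differences."""
    grad = np.zeros_like(g)
    for k in range(len(g)):
        gp, gm = g.copy(), g.copy()
        gp[k] += eps
        gm[k] -= eps
        grad[k] = (loss(solve(gp, v_in), target)
                   - loss(solve(gm, v_in), target)) / (2 * eps)
    return grad


g = np.array([1.0, 0.5, 0.8, 1.2, 0.7, 0.9, 0.3])  # positive conductances
v_in = np.array([1.0, -0.5])
target = 1.0
print("EqProp estimate  :", eqprop_grad(g, v_in, target))
print("Finite difference:", finite_diff_grad(g, v_in, target))
```

The two printed vectors should agree up to an error that shrinks with β. In a training loop, each conductance would be moved against its estimate and kept positive, since a physical resistor cannot have negative conductance.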

This review was generated by AI and reviewed by a human editor.