QUICK REVIEW

[Paper Review] Learning Delays in Spiking Neural Networks using Dilated Convolutions with Learnable Spacings

Ilyass Hammouamri, Ismail Khalfaoui-Hassani|arXiv (Cornell University)|Jun 30, 2023
Advanced Memory and Neural Computing | References: 52 | Cited by: 18
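One-Sentence Summary

This paper proposes a discrete-time backpropagation method that learns synaptic delays in deep feedforward SNNs by modeling them as 1D temporal convolutions with learnable spacings (DCLS). It achieves state-of-the-art results on the temporal benchmarks SHD, SSC, and GSC-35 with fewer parameters.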

ABSTRACT

Spiking Neural Networks (SNNs) are a promising research direction for building power-efficient information processing systems, especially for temporal tasks such as speech recognition. In SNNs, delays refer to the time needed for one spike to travel from one neuron to another. These delays matter because they influence the spike arrival times, and it is well-known that spiking neurons respond more strongly to coincident input spikes. More formally, it has been shown theoretically that plastic delays greatly increase the expressivity in SNNs. Yet, efficient algorithms to learn these delays have been lacking. Here, we propose a new discrete-time algorithm that addresses this issue in deep feedforward SNNs using backpropagation, in an offline manner. To simulate delays between consecutive layers, we use 1D convolutions across time. The kernels contain only a few non-zero weights - one per synapse - whose positions correspond to the delays. These positions are learned together with the weights using the recently proposed Dilated Convolution with Learnable Spacings (DCLS). We evaluated our method on three datasets: the Spiking Heidelberg Dataset (SHD), the Spiking Speech Commands (SSC) and its non-spiking version Google Speech Commands v0.02 (GSC) benchmarks, which require detecting temporal patterns. We used feedforward SNNs with two or three hidden fully connected layers, and vanilla leaky integrate-and-fire neurons. We showed that fixed random delays help and that learning them helps even more. Furthermore, our method outperformed the state-of-the-art in the three datasets without using recurrent connections and with substantially fewer parameters. Our work demonstrates the potential of delay learning in developing accurate and precise models for temporal data processing. Our code is based on PyTorch / SpikingJelly and available at: https://github.com/Thvnvtos/SNN-delays

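Motivation and Objectives

  • Motivate and exploit delay learning in spiking neural networks to improve temporal pattern processing.
  • Propose a differentiable method that learns synaptic delays jointly with weights in deep SNNs.
  • Show the equivalence between 1D temporal convolutions and connection delays, and use DCLS to learn the delays.
  • Demonstrate higher accuracy with fewer parameters on SHD, SSC, and GSC-35, without recurrent connections.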

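Proposed Method

  • Formulate the delays as 1D temporal convolution kernels that contain a single non-zero element per synapse.
  • Learn the delay positions with Gaussian-interpolated DCLS kernels, progressively shrinking the Gaussian width (sigma) during training.
  • Represent each delay by a learnable position d_ij and a shared sigma, so that gradients can be backpropagated through the delay parameters.
  • Convert the learned continuous kernels into discrete delays for inference, giving sparse, hardware-friendly connectivity.
  • Train offline with surrogate-gradient backpropagation, using leaky integrate-and-fire (LIF) neurons in a feedforward architecture.
  • Compare against state-of-the-art methods on SHD, SSC, and GSC-35 across network depths and parameter counts.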
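The kernel construction described in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the simplified indexing (kernel index equals delay), and the geometric sigma schedule with its numeric values are all assumptions made for the example. The source only states that sigma shrinks progressively during training.

```python
import math

def gaussian_delay_kernel(weight, delay, sigma, kernel_size):
    """Spread one synaptic weight over a 1D kernel as a normalized
    Gaussian centred on the (continuous) delay position."""
    raw = [math.exp(-0.5 * ((n - delay) / sigma) ** 2) for n in range(kernel_size)]
    norm = sum(raw) + 1e-7  # small constant avoids division by zero
    return [weight * r / norm for r in raw]

def discretize_delay_kernel(weight, delay, kernel_size):
    """Inference-time kernel: the whole weight sits at the rounded delay."""
    pos = min(max(round(delay), 0), kernel_size - 1)
    kernel = [0.0] * kernel_size
    kernel[pos] = weight
    return kernel

def sigma_schedule(sigma_start, sigma_end, decay, epoch):
    """Hypothetical schedule: geometric decay, clipped at sigma_end."""
    return max(sigma_end, sigma_start * decay ** epoch)

# Wide Gaussian early in training, narrow one later, hard delay at inference.
early = gaussian_delay_kernel(0.8, 3.4, sigma=2.0, kernel_size=10)
late = gaussian_delay_kernel(0.8, 3.4, sigma=0.3, kernel_size=10)
final = discretize_delay_kernel(0.8, 3.4, kernel_size=10)
```

Because the Gaussian is normalized, the kernel always sums to roughly the synaptic weight. As sigma shrinks, the mass concentrates on the tap nearest the learned position, so the rounded kernel used at inference stays close to the one used at the end of training.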
Figure 1: Coincidence detection: we consider two neurons $N_{1}$ and $N_{2}$ with the same positive synaptic weight values. $N_{2}$ has a delayed synaptic connection denoted $d_{21}$ of $8$ ms, thus both spikes from spike train $S_{1}$ and $S_{2}$ will reach $N_{2}$ quasi-simultaneously. As a result
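The coincidence-detection idea in Figure 1 can be illustrated with a toy LIF neuron. The sketch below is an assumption-laden example rather than the paper's setup: the leak factor, threshold, weights, spike times, hard reset, and the 1 ms time step are all hypothetical values chosen so that a single input spike stays below threshold while two coincident spikes cross it.

```python
def lif_spikes(inputs, weights, delays, beta=0.8, threshold=1.0):
    """Simulate one LIF neuron with per-synapse integer delays (in time steps)."""
    num_steps = len(inputs[0])
    v = 0.0
    out = []
    for t in range(num_steps):
        current = 0.0
        for train, w, d in zip(inputs, weights, delays):
            if t - d >= 0:
                current += w * train[t - d]  # spike emitted at t - d arrives at t
        v = beta * v + current  # leaky integration
        if v >= threshold:
            out.append(1)
            v = 0.0  # hard reset after a spike
        else:
            out.append(0)
    return out

num_steps = 20
s1 = [0] * num_steps
s1[2] = 1   # S1 spikes at t = 2
s2 = [0] * num_steps
s2[10] = 1  # S2 spikes 8 steps later, at t = 10

no_delay = lif_spikes([s1, s2], [0.6, 0.6], [0, 0])
with_delay = lif_spikes([s1, s2], [0.6, 0.6], [8, 0])
```

Without the delay, the first input has mostly leaked away by the time the second arrives, and the neuron stays silent. With an 8-step delay on the first synapse both inputs arrive at t = 10 and the neuron fires once.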

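Experimental Results

Research Questions

  • RQ1: Can delays be learned together with synaptic weights via backpropagation in deep feedforward SNNs?
  • RQ2: On temporal spiking-pattern benchmarks, does learning delays yield a significant accuracy gain over fixed or random delays?
  • RQ3: How does DCLS-Delays compare with a dense delay representation in accuracy and parameter efficiency?
  • RQ4: How does progressively tightening the Gaussian kernel (sigma) during training affect the learned delays and overall accuracy?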

Key Findings

| Dataset | Method | Rec. | Delays | #Params | Top-1 Acc. |
|---|---|---|---|---|---|
| SHD | EventProp-GeNN | | | N/a | 84.80 ± 1.5% |
| SHD | Cuba-LIF | | | 0.14M | 87.80 ± 1.1% |
| SHD | Adaptive SRNN | | | N/a | 90.40% |
| SHD | SNN+Delays | | | 0.1M | 90.43% |
| SHD | TA-SNN | | | N/a | 91.08% |
| SHD | STSC-SNN | | | 2.1M | 92.36% |
| SHD | Adaptive Delays | | | 0.1M | 92.45% |
| SHD | DL128-SNN-Dloss | | | 0.14M | 92.56% |
| SHD | Dense Conv Delays (ours) | | | 2.7M | 93.44% |
| SHD | RadLIF | | | 3.9M | 94.62% |
| SHD | DCLS-Delays (2L-1KC) | | | 0.2M | 95.07 ± 0.24% |
| SSC | Recurrent SNN | | | N/a | 50.90 ± 1.1% |
| SSC | Heter. RSNN | | | N/a | 57.30% |
| SSC | SNN-CNN | | | N/a | 72.03% |
| SSC | Adaptive SRNN | | | N/a | 74.20% |
| SSC | SpikGRU | | | 0.28M | 77.00 ± 0.4% |
| SSC | RadLIF | | | 3.9M | 77.40% |
| SSC | Dense Conv Delays 2L | | | 10.9M | 77.86% |
| SSC | Dense Conv Delays 3L | | | 19M | 78.44% |
| SSC | DCLS-Delays (2L-1KC) | | | 0.7M | 79.77 ± 0.09% |
| SSC | DCLS-Delays (2L-2KC) | | | 1.4M | 80.16 ± 0.09% |
| SSC | DCLS-Delays (3L-1KC) | | | 1.2M | 80.29 ± 0.06% |
| SSC | DCLS-Delays (3L-2KC) | | | 2.5M | 80.69 ± 0.21% |
| GSC-35 | MSAT | | | N/a | 87.33% |
| GSC-35 | Dense Conv Delays 2L | | | 10.9M | 92.97% |
| GSC-35 | Dense Conv Delays 3L | | | 19M | 93.19% |
| GSC-35 | RadLIF | | | 1.2M | 94.51% |
| GSC-35 | DCLS-Delays (2L-1KC) | | | 0.7M | 94.91 ± 0.09% |
| GSC-35 | DCLS-Delays (2L-2KC) | | | 1.4M | 95.00 ± 0.06% |
| GSC-35 | DCLS-Delays (3L-1KC) | | | 1.2M | 95.29 ± 0.11% |
| GSC-35 | DCLS-Delays (3L-2KC) | | | 2.5M | 95.35 ± 0.04% |
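  • With 2 or 3 hidden layers, DCLS-Delays reaches up to 95.07% on SHD, 79.77–80.69% on SSC, and 94.91–95.35% on GSC-35, depending on the configuration.
  • Learned delays outperform fixed random delays, especially with sparse connectivity.
  • DCLS-Delays, whose kernels hold one weight per synapse at a learnable position, is more accurate than the Dense Conv Delays baseline while using fewer parameters (for example, 95.07% with 0.2M parameters versus 93.44% with 2.7M on SHD).
  • The method achieves state-of-the-art results on SHD, SSC, and GSC-35 without recurrent connections, other than the implicit recurrence of the LIF membrane dynamics.
  • Ablation studies show that jointly learning weights and delays while decreasing sigma improves performance over a constant sigma or fixed delays.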
Figure 2: Example of one neuron with 2 afferent synaptic connections, convolving $K1$ and $K2$ with the zero left-padded $S_{1}$ and $S_{2}$ is equivalent to following Equation 6
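The equivalence stated in the Figure 2 caption can be checked numerically. The sketch below assumes the sliding-window (cross-correlation) convention used by typical deep-learning convolution layers, and a kernel of size max_delay + 1 whose single non-zero tap sits at index max_delay - delay. The function names and the example values are hypothetical.

```python
def delayed_input(spikes, weight, delay):
    """Reference: weight times the spike train shifted right by `delay` steps."""
    return [weight * spikes[t - delay] if t - delay >= 0 else 0.0
            for t in range(len(spikes))]

def conv_with_delay_kernel(spikes, weight, delay, max_delay):
    """Same signal obtained by sliding a one-hot kernel over the
    zero left-padded spike train."""
    kernel_size = max_delay + 1
    kernel = [0.0] * kernel_size
    kernel[kernel_size - 1 - delay] = weight      # position encodes the delay
    padded = [0] * (kernel_size - 1) + list(spikes)  # zero left-padding
    return [sum(kernel[k] * padded[t + k] for k in range(kernel_size))
            for t in range(len(spikes))]

spikes = [0, 1, 0, 0, 1, 0, 0, 0]
shifted = delayed_input(spikes, weight=0.5, delay=2)
convolved = conv_with_delay_kernel(spikes, weight=0.5, delay=2, max_delay=4)
```

Both lists are identical: the spikes at t = 1 and t = 4 contribute 0.5 at t = 3 and t = 6. Moving the non-zero tap inside the kernel therefore changes the delay, which is what makes the delay learnable as a kernel position.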


This review was generated by AI and reviewed by human editors.