QUICK REVIEW

[Paper Review] Traveling Waves Encode the Recent Past and Enhance Sequence Learning

T. Anderson Keller, Lyle Muller|arXiv (Cornell University)|Sep 3, 2023
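Neural dynamics and brain function | Cited by 10
One-Sentence Summary

This paper introduces the Wave-RNN (wRNN), a minimal RNN whose hidden state supports traveling waves that encode the recent past. On long-sequence tasks it learns faster and performs better than wave-free RNNs, and it is competitive with LSTMs and GRUs. The benefit of waves is validated on synthetic memory tasks and sequential image classification benchmarks.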

ABSTRACT

Traveling waves of neural activity have been observed throughout the brain at a diversity of regions and scales; however, their precise computational role is still debated. One physically inspired hypothesis suggests that the cortical sheet may act like a wave-propagating system capable of invertibly storing a short-term memory of sequential stimuli through induced waves traveling across the cortical surface, and indeed many experimental results from neuroscience correlate wave activity with memory tasks. To date, however, the computational implications of this idea have remained hypothetical due to the lack of a simple recurrent neural network architecture capable of exhibiting such waves. In this work, we introduce a model to fill this gap, which we denote the Wave-RNN (wRNN), and demonstrate how such an architecture indeed efficiently encodes the recent past through a suite of synthetic memory tasks where wRNNs learn faster and reach significantly lower error than wave-free counterparts. We further explore the implications of this memory storage system on more complex sequence modeling tasks such as sequential image classification and find that wave-based models not only again outperform comparable wave-free RNNs while using significantly fewer parameters, but additionally perform comparably to more complex gated architectures such as LSTMs and GRUs.

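Motivation and Goals

  • Advance and test the hypothesis that traveling waves can store recent sequential information in a wave-field memory.
  • Develop a minimal RNN architecture that naturally exhibits traveling waves.
  • Demonstrate the memory and sequence-learning benefits of wave dynamics on synthetic tasks and standard long-sequence benchmarks.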

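Proposed Method

  • Define the Wave-RNN (wRNN) by shaping the recurrence into a discretized one-dimensional wave equation over a circular hidden-state layout.
  • Use a convolutional recurrent operator (u) that emulates a shift (Sigma), producing traveling waves within the hidden channels.
  • Adopt ReLU activations, a multi-channel hidden state, and specific initializations to enable wave-driven memory: a u-shift initialization (a Toeplitz/aligned shift) for u and a sparse identity initialization for V.
  • Compare against an iRNN baseline with minimal wave dynamics to isolate the effect of traveling waves.
  • Analyze the emergence of waves with a 2D Fourier transform of the hidden activations to verify the traveling-wave structure.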
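
The recurrence described in the bullets above can be illustrated with a small NumPy sketch. It assumes an update of the form h_{t+1} = ReLU(u ⋆ h_t + V x_t + b), where ⋆ is a circular convolution over a ring of hidden units. The exact form, the single hidden channel, the kernel size of 3, and the names `circular_conv`, `wrnn_step`, and `run_wave_demo` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def circular_conv(h, u):
    """Circular cross-correlation of a ring-shaped hidden state h (n,) with an odd-length kernel u."""
    half = len(u) // 2
    out = np.zeros_like(h)
    for j, w in enumerate(u):
        # np.roll(h, s)[i] == h[i - s], so this accumulates u[j] * h[i - half + j]
        out += w * np.roll(h, half - j)
    return out

def wrnn_step(h, x, u, V, b):
    """One assumed wRNN update: ReLU(u * h + V x + b)."""
    return np.maximum(0.0, circular_conv(h, u) + V @ x + b)

def run_wave_demo(n=16, steps=8):
    # Assumed shift-style initialization: kernel [1, 0, 0] moves activity one unit per step.
    u = np.array([1.0, 0.0, 0.0])
    # Assumed sparse identity-like V: the single input writes to hidden unit 0 only.
    V = np.zeros((n, 1))
    V[0, 0] = 1.0
    b = np.zeros(n)
    h = np.zeros(n)
    states = []
    for t in range(steps):
        x = np.array([1.0 if t == 0 else 0.0])  # one input spike at t = 0
        h = wrnn_step(h, x, u, V, b)
        states.append(h)
    return np.stack(states)  # shape (steps, n)

states = run_wave_demo()
peaks = states.argmax(axis=1)  # position of the wave front after each step
```

In this toy setting the position of the peak in any single snapshot equals the number of steps since the spike arrived, which is the sense in which a wave-field snapshot encodes the recent past.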
Figure 1: Illustration of three input signals (top) and a corresponding wave-field with induced traveling waves (bottom). From an instantaneous snapshot of the wave-field at each timestep we are able to decode both the time of onset and input channel of each input spike. Furthermore, subsequent spikes

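Experimental Results

Research Questions

  • RQ1: Do traveling-wave dynamics in a minimal RNN encode the recent past more effectively than wave-free recurrent structures?
  • RQ2: Do wave-based models generalize to longer sequences and more complex tasks than the synthetic memory tests?
  • RQ3: How does the Wave-RNN compare with standard gated architectures (LSTM/GRU) on long-sequence benchmarks?
  • RQ4: Which key architectural components (convolutional recurrence, initialization) make wave propagation more robust?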

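Key Findings

  • On the Copy task, the wRNN reaches a loss more than 5 orders of magnitude lower than the matched wave-free baseline across sequence lengths T={0,30,80}.
  • On the Long Sequence Addition task, the wRNN converges faster and solves longer sequences (up to 1000 steps), outperforming the iRNN.
  • On sequential image tasks (sMNIST, psMNIST, nsCIFAR10), the wRNN trains faster and outperforms wave-free models, and it is competitive with or better than LSTMs, GRUs, and other gated architectures.
  • Ablations show that the u-shift initialization contributes most to long-range wave memory, while the V initialization mainly accelerates convergence.
  • Visualizations confirm traveling-wave patterns in the wRNN hidden state, whereas they are absent in the iRNN baseline.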
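
The 2D Fourier check mentioned in the findings above can be sketched as follows. The idea is that a traveling wave concentrates spectral power at a component with nonzero temporal and spatial frequency, whose ratio gives the wave's velocity, while a pattern that does not move has its power at temporal frequency zero. The synthetic cosine inputs and the function name `dominant_wave_velocity` are illustrative assumptions; this is not the paper's analysis code or data.

```python
import numpy as np

def dominant_wave_velocity(states):
    """Estimate phase velocity (hidden units per step) of the strongest
    Fourier component of a (T, n) array of hidden states over time."""
    T, n = states.shape
    power = np.abs(np.fft.fft2(states - states.mean())) ** 2
    kt, kx = np.unravel_index(np.argmax(power), power.shape)
    ft = np.fft.fftfreq(T)[kt]  # signed temporal frequency, cycles per step
    fx = np.fft.fftfreq(n)[kx]  # signed spatial frequency, cycles per unit
    if fx == 0:
        return float("nan")     # no spatial structure, velocity undefined
    # A component cos(2*pi*(fx*i + ft*t)) keeps constant phase along i = -(ft/fx) * t
    return -ft / fx

T, n = 32, 32
t = np.arange(T)[:, None]
i = np.arange(n)[None, :]

# Synthetic traveling wave: 3 spatial cycles around the ring, 2 temporal cycles.
traveling = np.cos(2 * np.pi * (3 * i / n - 2 * t / T))
# Synthetic static pattern: same spatial structure, no motion over time.
static = np.cos(2 * np.pi * 3 * i / n) * np.ones((T, 1))

v_traveling = dominant_wave_velocity(traveling)
v_static = dominant_wave_velocity(static)
```

Here `v_traveling` comes out at roughly 2/3 of a hidden unit per step and `v_static` at zero, which mirrors the contrast between diagonal bands and no flow described in the Figure 2 caption.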
Figure 2: Visualization of hidden state (top) and associated 2D Fourier transform (bottom) for a wRNN (left) and iRNN (right) operating on the sMNIST task. We see the Wave-RNN exhibits a clear flow of activity across the hidden state (diagonal bands) while the iRNN does not.


This review was generated by AI and reviewed by a human editor.