QUICK REVIEW

[Paper Digest] TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks

Ruijie Zhu, Malu Zhang|arXiv (Cornell University)|Jun 21, 2022

Advanced Memory and Neural Computing|Cited by 26

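One-Sentence Summary

TCJA-SNN adds a temporal-channel joint attention module to LIF-based SNNs, using Temporal-wise Local Attention, Channel-wise Local Attention, and Cross Convolutional Fusion to jointly model temporal and channel dependencies, which improves classification performance and enables high-quality spike-based image generation.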

ABSTRACT

Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency, and powerful spatio-temporal information representation ability. Given the critical role of attention mechanisms in enhancing neural network performance, the integration of SNNs and attention mechanisms exhibits potential to deliver energy-efficient and high-performance computing paradigms. We present a novel Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN. The proposed TCJA-SNN framework can effectively assess the significance of spike sequence from both spatial and temporal dimensions. More specifically, our essential technical contribution lies in: 1) We employ the squeeze operation to compress the spike stream into an average matrix. Then, we leverage two local attention mechanisms based on efficient 1D convolutions to facilitate comprehensive feature extraction at the temporal and channel levels independently. 2) We introduce the Cross Convolutional Fusion (CCF) layer as a novel approach to model the inter-dependencies between the temporal and channel scopes. This layer breaks the independence of these two dimensions and enables the interaction between features. Experimental results demonstrate that the proposed TCJA-SNN outperforms SOTA by up to 15.7% accuracy on standard static and neuromorphic datasets, including Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture. Furthermore, we apply the TCJA-SNN framework to image generation tasks by leveraging a variational autoencoder. To the best of our knowledge, this study is the first instance where the SNN-attention mechanism has been employed for image classification and generation tasks. Notably, our approach has achieved SOTA performance in both domains, establishing a significant advancement in the field. Codes are available at https://github.com/ridgerchu/TCJA.

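Motivation and Objectives

Motivate and exploit joint temporal and channel information in LIF-based SNNs to improve representation learning and accuracy.
Develop a lightweight attention module that can be plugged into existing SNNs without extensive retraining.
Propose a mechanism that fuses temporal and channel cues with low parameter overhead.
Demonstrate effectiveness on classification and generation tasks, using neuromorphic datasets and Fashion-MNIST.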

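Proposed Method

Compress the spike stream into an average matrix Z of size C × T to capture temporal-channel correlations.
Introduce Temporal-wise Local Attention (TLA), which applies a 1D convolution to Z along the time axis.
Introduce Channel-wise Local Attention (CLA), which applies a 1D convolution to Z along the channel axis.
Fuse temporal and channel saliency through Cross Convolutional Fusion (CCF), computing F = sigmoid(T ∘ C), where T and C are the outputs of TLA and CLA respectively (here T and C denote these output matrices, not the dimensions of Z).
Use ATan and triangle-shaped surrogate functions for backpropagation in spike-based training.
Optimize classification with the Spike Mean-Square-Error (SMSE) and Temporal Efficient Training (TET) losses.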
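
The squeeze, TLA, CLA, and CCF steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract): the [T, C, H, W] input layout, the kernel size of 3, the zero "same" padding, the random weights, the helper names conv1d_same and tcja_attention, and the final re-weighting of the input by F are all choices made for this example.

```python
import numpy as np

def conv1d_same(x, w):
    """Dense 1D convolution (cross-correlation) with zero 'same' padding.

    x: [in_ch, length]; w: [out_ch, in_ch, k] with odd k. Returns [out_ch, length].
    """
    out_ch, in_ch, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    length = x.shape[1]
    out = np.zeros((out_ch, length))
    for m in range(k):
        # w[:, :, m] mixes the conv channels; the slice shifts along the conv axis.
        out += w[:, :, m] @ xp[:, m:m + length]
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def tcja_attention(spikes, w_t, w_c):
    """spikes: [T, C, H, W]. Returns (re-weighted spikes, fusion map F of shape [C, T])."""
    # Squeeze: average over the spatial dimensions, then lay out as C x T.
    z = spikes.mean(axis=(2, 3)).T            # [C, T]
    # TLA: channels act as conv channels, the kernel slides along time.
    t_out = conv1d_same(z, w_t)               # [C, T]
    # CLA: time steps act as conv channels, the kernel slides along channels.
    c_out = conv1d_same(z.T, w_c).T           # [C, T]
    # CCF: element-wise product followed by a sigmoid.
    f = sigmoid(t_out * c_out)                # [C, T]
    # Assumption: apply F by scaling each (time step, channel) slice of the input.
    weighted = spikes * f.T[:, :, None, None]
    return weighted, f

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
T, C, H, W = 4, 8, 5, 5
spikes = (rng.random((T, C, H, W)) < 0.3).astype(np.float64)
w_t = rng.normal(scale=0.1, size=(C, C, 3))   # TLA weights: C -> C, kernel 3
w_c = rng.normal(scale=0.1, size=(T, T, 3))   # CLA weights: T -> T, kernel 3
weighted, f = tcja_attention(spikes, w_t, w_c)
print(f.shape, weighted.shape)
```

Because F comes from a product of the two branches rather than a sum, a (channel, time step) entry is scored by both branches together, which is one way to read the summary's point that CCF couples the temporal and channel dimensions.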

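Experimental Results

Research Questions

RQ1: Can a joint temporal-channel attention mechanism improve feature discrimination in spiking neural networks beyond temporal-only attention?
RQ2: Does the proposed TCJA module offer a parameter-efficient way to model spatio-temporal dependencies in SNNs?
RQ3: Can TCJA-SNN reach state-of-the-art accuracy on neuromorphic and static datasets with both binary and non-binary spikes?
RQ4: Is TCJA effective for both high-level classification and low-level generation tasks in SNNs?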

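Key Findings

For binary-spike classification on Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture, TCJA-SNN outperforms prior methods on both static and neuromorphic datasets, with accuracy gains of up to 15.7%.
On DVS128 Gesture, TCJA-SNN reaches 99.0% accuracy with 20 time steps, outperforming TA-SNN while using fewer time steps.
On N-Caltech 101, TCJA-SNN reaches 78.5% accuracy with 14 time steps, a significant improvement over the previous best result.
TCJA is also applied to a fully spiking variational autoencoder (FSVAE) for image generation, achieving competitive Inception Scores and favorable FID/FAD metrics relative to the baseline.
Ablation studies show that CLA contributes significantly and that Cross Convolutional Fusion (CCF) is essential to the joint temporal-channel gain.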

This digest was generated by AI and reviewed by human editors.