QUICK REVIEW

[Paper Review] TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks

Ruijie Zhu, Malu Zhang | arXiv (Cornell University) | 2022-06-21

Advanced Memory and Neural Computing | Citations: 26

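One-line summary

TCJA-SNN introduces a joint temporal-channel attention module for LIF-based SNNs. It uses Temporal-wise Local Attention, Channel-wise Local Attention, and Cross Convolutional Fusion to model temporal and channel dependencies jointly, which improves classification accuracy and enables high-quality spike-based generation.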

ABSTRACT

Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency, and powerful spatio-temporal information representation ability. Given the critical role of attention mechanisms in enhancing neural network performance, the integration of SNNs and attention mechanisms exhibits potential to deliver energy-efficient and high-performance computing paradigms. We present a novel Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN. The proposed TCJA-SNN framework can effectively assess the significance of spike sequence from both spatial and temporal dimensions. More specifically, our essential technical contribution lies on: 1) We employ the squeeze operation to compress the spike stream into an average matrix. Then, we leverage two local attention mechanisms based on efficient 1D convolutions to facilitate comprehensive feature extraction at the temporal and channel levels independently. 2) We introduce the Cross Convolutional Fusion (CCF) layer as a novel approach to model the inter-dependencies between the temporal and channel scopes. This layer breaks the independence of these two dimensions and enables the interaction between features. Experimental results demonstrate that the proposed TCJA-SNN outperforms SOTA by up to 15.7% accuracy on standard static and neuromorphic datasets, including Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture. Furthermore, we apply the TCJA-SNN framework to image generation tasks by leveraging a variation autoencoder. To the best of our knowledge, this study is the first instance where the SNN-attention mechanism has been employed for image classification and generation tasks. Notably, our approach has achieved SOTA performance in both domains, establishing a significant advancement in the field. Codes are available at https://github.com/ridgerchu/TCJA.

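Motivation and objectives

Combine temporal and channel information in LIF-based SNNs to improve representation learning and accuracy.
Develop a lightweight attention module that can be plugged into existing SNNs without heavy retraining.
Propose a mechanism that fuses temporal and channel signals with low parameter overhead.
Demonstrate its effectiveness on classification and generation tasks using neuromorphic datasets and Fashion-MNIST.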

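Proposed method

Compress the spike stream into an average matrix Z of size C x T to capture temporal-channel correlations.
Introduce Temporal-wise Local Attention (TLA), a 1-D convolution over Z along the time axis.
Introduce Channel-wise Local Attention (CLA), a 1-D convolution over Z along the channel axis.
Fuse temporal and channel attention with Cross Convolutional Fusion (CCF), which computes F = sigmoid(T ∘ C) from the TLA output T and the CLA output C (here T and C denote the attention maps, not the dimensions of Z).
Use ATan and triangle-shaped surrogate functions for backpropagation in spike-based training.
Train with Spike Mean-Square-Error (SMSE) and Temporal Efficient Training (TET) losses for classification.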
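
The squeeze, TLA, CLA, and CCF steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). The function names `conv1d_same` and `tcja_attention`, the zero "same" padding, the kernel size of 3, and the choice to treat the non-sliding axis as the convolution's channel dimension are all assumptions made for this example. The product inside the sigmoid is taken element-wise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_same(z, weight):
    """1-D cross-correlation with zero 'same' padding (odd kernel size assumed).

    z:      (in_channels, length)
    weight: (out_channels, in_channels, kernel)
    returns (out_channels, length)
    """
    out_ch, in_ch, k = weight.shape
    pad = k // 2
    zp = np.pad(z, ((0, 0), (pad, pad)), mode="constant")
    length = z.shape[1]
    out = np.zeros((out_ch, length))
    for m in range(k):
        # Kernel tap m mixes input channels and reads the input shifted by m.
        out += weight[:, :, m] @ zp[:, m:m + length]
    return out

def tcja_attention(spikes, w_tla, w_cla):
    """Hypothetical TCJA-style re-weighting of a spike tensor.

    spikes: (T, C, H, W) spike stream
    w_tla:  (C, C, k) weights, convolution slides along time
    w_cla:  (T, T, k) weights, convolution slides along channels
    returns (re-weighted tensor of shape (T, C, H, W), attention map F of shape (C, T))
    """
    z = spikes.mean(axis=(2, 3)).T        # squeeze: average matrix Z, shape (C, T)
    t_map = conv1d_same(z, w_tla)         # TLA output, shape (C, T)
    c_map = conv1d_same(z.T, w_cla).T     # CLA output, shape (C, T)
    f = sigmoid(t_map * c_map)            # CCF: F = sigmoid(T ∘ C), element-wise
    out = spikes * f.T[:, :, None, None]  # broadcast F over the spatial axes
    return out, f

# Usage with random binary spikes and small random weights (illustrative values only).
rng = np.random.default_rng(0)
T, C, H, W = 4, 3, 5, 5
spikes = (rng.random((T, C, H, W)) < 0.3).astype(np.float64)
w_tla = rng.normal(scale=0.1, size=(C, C, 3))
w_cla = rng.normal(scale=0.1, size=(T, T, 3))
out, f = tcja_attention(spikes, w_tla, w_cla)
print(out.shape, f.shape)  # (4, 3, 5, 5) (3, 4)
```

In this sketch the two attention branches add C·C·k + T·T·k weights in total, regardless of the spatial size H x W, which matches the low-overhead goal described above. Note that the re-weighted output is no longer binary.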

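Experimental results

Research questions

RQ1: Can a joint temporal-channel attention mechanism improve feature discrimination in SNNs compared with temporal-only attention?
RQ2: Does the proposed TCJA module offer a parameter-efficient way to model spatio-temporal dependencies in SNNs?
RQ3: Can TCJA-SNN achieve state-of-the-art accuracy on neuromorphic and static datasets using binary and non-binary spikes?
RQ4: Is TCJA effective for both high-level classification and low-level generation tasks in SNNs?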

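Key results

Using binary spikes, TCJA-SNN outperforms prior methods by up to 15.7 percentage points in classification accuracy on static and neuromorphic datasets: Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture.
On DVS128 Gesture, TCJA-SNN reaches 99.0% accuracy with 20 time steps, outperforming TA-SNN with fewer steps.
On N-Caltech 101, TCJA-SNN achieves 78.5% accuracy with 14 time steps, a large improvement over the previous best.
TCJA can also be applied to a fully spiking variational autoencoder (FSVAE) for image generation, where it achieves competitive Inception Scores and better FID/FAD metrics than the baseline.
Component analysis shows that CLA contributes substantially and that Cross Convolutional Fusion (CCF) is key to obtaining the joint temporal-channel gains.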

This review was generated by AI and checked by a human editor.