QUICK REVIEW

[Paper Review] Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips

Man Yao, Jiakui Hu|arXiv (Cornell University)|Feb 15, 2024

Advanced Memory and Neural Computing | Citations: 19

One-line summary

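This paper introduces Meta-SpikeFormer, a spike-driven, meta Transformer-based SNN that reaches 80.0% top-1 accuracy on ImageNet-1K with 55M parameters and supports classification, detection, and segmentation with a single directly trained SNN backbone. It also discusses the implications for neuromorphic chip design.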

ABSTRACT

Neuromorphic computing, which exploits Spiking Neural Networks (SNNs) on neuromorphic chips, is a promising energy-efficient alternative to traditional AI. CNN-based SNNs are the current mainstream of neuromorphic computing. By contrast, no neuromorphic chips are designed especially for Transformer-based SNNs, which have just emerged, and their performance is only on par with CNN-based SNNs, offering no distinct advantage. In this work, we propose a general Transformer-based SNN architecture, termed "Meta-SpikeFormer", whose goals are: 1) Lower-power, supports the spike-driven paradigm that there is only sparse addition in the network; 2) Versatility, handles various vision tasks; 3) High-performance, shows overwhelming performance advantages over CNN-based SNNs; 4) Meta-architecture, provides inspiration for future next-generation Transformer-based neuromorphic chip designs. Specifically, we extend the Spike-driven Transformer (Yao et al., 2023) into a meta architecture, and explore the impact of structure, spike-driven self-attention, and skip connection on its performance. On ImageNet-1K, Meta-SpikeFormer achieves 80.0% top-1 accuracy (55M), surpassing the current state-of-the-art (SOTA) SNN baselines (66M) by 3.7%. This is the first direct training SNN backbone that can simultaneously support classification, detection, and segmentation, obtaining SOTA results in SNNs. Finally, we discuss the inspiration of the meta SNN architecture for neuromorphic chip design. Source code and models are available at https://github.com/BICLab/Spike-Driven-Transformer-V2.

Research Motivation and Objectives

Motivate energy-efficient neuromorphic computation by leveraging spike-driven SNNs for Transformer-based vision models.
Develop a meta Transformer architecture that combines Conv-based SNN blocks with Transformer-based SNN blocks under spike-driven constraints.
Improve performance and versatility of SNNs to surpass CNN-based SNNs on standard vision benchmarks.
Provide architectural guidance to inspire future Transformer-based neuromorphic chip design.

Proposed Method

Extend Spike-driven Transformer into a meta-architecture (Meta-SpikeFormer) that uses Conv-based SNN blocks in early stages and Pyramid Transformer-based SNN stages in later ones.
Introduce spike-driven self-attention (SDSA) operators that use no softmax or scaling and have linear or near-linear computational complexity in N, the number of tokens.
Design micro-level blocks: Conv-based SNN blocks using SepConv and ChannelConv; Transformer-based SNN blocks generating Q_S, K_S, V_S via RepConv-based encoding and applying SDSA.
Employ a four-stage architecture with a pyramid structure and a higher-channel stage (Stage 4) to control parameters.
Compare three shortcut variants (Vanilla, SEW, and Membrane Shortcut, abbreviated MS), with MS delivering strong accuracy.
Train directly with surrogate gradients, enabling end-to-end learning of SNN backbones on static and event-based data.
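The linear-complexity claim for softmax-free attention can be illustrated with a small sketch. It is not the paper's exact SDSA operator: it only shows the generic idea that, once softmax and scaling are removed, (Q Kᵀ) V can be regrouped as Q (Kᵀ V), replacing the N x N intermediate with a D x D one. The sizes N and D, the spike rate, the firing threshold, and all helper names are assumptions made for illustration, and plain Python lists are used to keep the example dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def random_spikes(rows, cols, rate, rng):
    """Binary matrix whose entries are 1 with probability `rate` (toy spike tensor)."""
    return [[1 if rng.random() < rate else 0 for _ in range(cols)] for _ in range(rows)]


def fire(m, threshold):
    """Toy spiking nonlinearity: emit 1 where the accumulated value reaches the threshold."""
    return [[1 if x >= threshold else 0 for x in row] for row in m]


rng = random.Random(0)
N, D = 16, 4  # hypothetical token count and channel width

q = random_spikes(N, D, 0.3, rng)
k = random_spikes(N, D, 0.3, rng)
v = random_spikes(N, D, 0.3, rng)

# Quadratic grouping: the intermediate Q K^T is N x N, so cost grows with N^2.
quadratic = matmul(matmul(q, transpose(k)), v)

# Linear grouping: the intermediate K^T V is D x D, so cost grows with N.
linear = matmul(q, matmul(transpose(k), v))

# Without softmax, both groupings give the same integer result (associativity).
assert quadratic == linear

# A spiking neuron layer would turn the accumulated integers back into binary spikes.
out_spikes = fire(linear, threshold=2)

# Multiply-accumulate counts for each grouping.
macs_quadratic = 2 * N * N * D
macs_linear = 2 * N * D * D
print(macs_quadratic, macs_linear)
```

Because Q, K, and V are binary here, every product in the matrix multiplications is either 0 or a pass-through of the other operand, which is why spike-driven attention can be carried out with sparse additions rather than multiplications.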

Experimental Results

Research Questions

RQ1: Can a meta Transformer-based SNN architecture unify high performance with low power under spike-driven constraints?
RQ2: How do architectural choices (Conv vs Transformer blocks, SDSA variants, and shortcuts) affect accuracy, parameters, and power in spike-driven SNNs?
RQ3: Can a direct-training SNN backbone handle classification, detection, and segmentation tasks concurrently?
RQ4: What design principles from Meta-SpikeFormer can guide future Transformer-based neuromorphic chip development?

Key Findings

Meta-SpikeFormer achieves 80.0% top-1 accuracy on ImageNet-1K with 55M parameters (reported at T=4 time steps after distillation pretraining).
It surpasses the current SOTA SNN baselines by 3.7 percentage points with 17% fewer parameters (55M vs 66M).
It is the first direct-training SNN backbone able to handle classification, detection, and segmentation simultaneously and achieves SOTA results in SNNs on tested datasets.
Across the evaluated tasks, Meta-SpikeFormer demonstrates state-of-the-art performance in the SNN domain, and the analyses show advantages over Conv-based SNNs in both accuracy and versatility.
A meta-architecture design with Conv-based and Transformer-based SNN blocks, SDSA, and Membrane Shortcut provides practical guidance for future neuromorphic chip design.
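The parameter and accuracy comparison above can be checked with a few lines of arithmetic. The inputs are the figures quoted in this review; the implied baseline accuracy is derived here under the assumption that the 3.7 gain is in absolute percentage points, and the variable names are illustrative.

```python
ours_params_m = 55       # Meta-SpikeFormer parameters, in millions (from the review)
baseline_params_m = 66   # prior SOTA SNN baseline parameters, in millions (from the review)
ours_top1 = 80.0         # Meta-SpikeFormer top-1 accuracy on ImageNet-1K, in percent
gain_points = 3.7        # reported gain, assumed to be absolute percentage points

# Relative parameter reduction with respect to the baseline.
param_reduction = (baseline_params_m - ours_params_m) / baseline_params_m

# Baseline accuracy implied by the reported gain.
implied_baseline_top1 = ours_top1 - gain_points

print(f"parameter reduction: {param_reduction:.1%}")
print(f"implied baseline top-1: {implied_baseline_top1:.1f}%")
```

The reduction works out to roughly 16.7%, which the review rounds to 17%.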


This review was generated by AI and checked by a human editor.