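[Paper Summary] SpiNNaker2: A Large-Scale Neuromorphic System for Event-Based and Asynchronous Machine Learning
SpiNNaker2 provides a scalable digital neuromorphic platform for event-based and asynchronous machine learning. It supports large-scale SNN, ANN, and hybrid models across thousands of chips, up to a 5-million-core system.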
The joint progress of artificial neural networks (ANNs) and domain specific hardware accelerators such as GPUs and TPUs took over many domains of machine learning research. This development is accompanied by a rapid growth of the required computational demands for larger models and more data. Concurrently, emerging properties of foundation models such as in-context learning drive new opportunities for machine learning applications. However, the computational cost of such applications is a limiting factor of the technology in data centers, and more importantly in mobile devices and edge systems. To mediate the energy footprint and non-trivial latency of contemporary systems, neuromorphic computing systems deeply integrate computational principles of neurobiological systems by leveraging low-power analog and digital technologies. SpiNNaker2 is a digital neuromorphic chip developed for scalable machine learning. The event-based and asynchronous design of SpiNNaker2 allows the composition of large-scale systems involving thousands of chips. This work features the operating principles of SpiNNaker2 systems, outlining the prototype of novel machine learning applications. These applications range from ANNs over bio-inspired spiking neural networks to generalized event-based neural networks. With the successful development and deployment of SpiNNaker2, we aim to facilitate the advancement of event-based and asynchronous algorithms for future generations of machine learning systems.
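Research Motivation and Objectives
- Motivate an energy-efficient, scalable alternative to dense, GPU-centric ML paradigms for large models and edge/edge-cloud applications.
- Describe the architecture, interconnect, and core capabilities of SpiNNaker2 for event-driven processing and asynchronous computation.
- Demonstrate how ANNs, SNNs, and event-based hybrid networks map onto SpiNNaker2, and outline preliminary algorithmic approaches (training and inference).
- Present algorithmic and architectural strategies that reduce data movement and support dynamic voltage/frequency scaling.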
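Proposed Method
- Describe the SpiNNaker2 chip architecture: 152 ARM Cortex-M4F cores per chip, with 2 GB of DRAM per node.
- Explain the network-on-chip (NoC), packet routing, and the 6-neighbour links that enable scalable, asynchronous event communication.
- Discuss core-level acceleration (e.g., 8/16-bit matrix operations and random number generation) and energy-saving dynamic voltage/clock scaling.
- Propose a scheduling approach for ANN execution that combines a single-chip scheduler driven by a finite-state machine with decentralized data flow.
- Introduce deep rewiring for sparse-to-sparse training, enabling memory-efficient learning.
- Outline the workflow for mapping and executing SNNs through event-driven processing, including the use of NIR as an exchange format for SNNs.
- Describe event-based learning rules (EventProp, e-prop) and their implementation on SpiNNaker2, including multicast routing and batch-parallel training.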
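To make the idea of event-driven SNN execution concrete, here is a minimal sketch of a leaky integrate-and-fire (LIF) neuron that does work only when an input spike arrives, applying the decay since the previous event in closed form. It is an illustration under stated assumptions, not the py-spinnaker2 API and not necessarily how SpiNNaker2 cores update neurons: the class `EventDrivenLIF`, the helper `run`, and the parameters `tau` and `threshold` are all hypothetical names chosen for this example.

```python
import math


class EventDrivenLIF:
    """Hypothetical LIF neuron whose state is updated only on input events."""

    def __init__(self, tau=20.0, threshold=1.0):
        self.tau = tau              # membrane time constant (assumed, in ms)
        self.threshold = threshold  # firing threshold (assumed)
        self.v = 0.0                # membrane potential
        self.t_last = 0.0           # time of the most recent input event

    def receive(self, t, weight):
        """Process one input spike at time t; return True if the neuron fires."""
        # Apply the exponential leak accumulated since the last event.
        self.v *= math.exp(-(t - self.t_last) / self.tau)
        self.t_last = t
        # Integrate the incoming spike.
        self.v += weight
        if self.v >= self.threshold:
            self.v = 0.0  # reset after firing
            return True
        return False


def run(events, **params):
    """Feed (time, weight) events in time order; return the output spike times."""
    neuron = EventDrivenLIF(**params)
    return [t for t, w in sorted(events) if neuron.receive(t, w)]
```

With the default parameters, two input spikes close together push the neuron over threshold, while the same two spikes far apart do not, because the potential leaks away in between. No computation happens during silent periods, which is the property that event-based hardware exploits.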
Experimental Results
Research Questions
- RQ1: Can SpiNNaker2 provide competitive energy efficiency for training and inference of event-based and asynchronous ML models compared to traditional dense accelerators?
- RQ2: How can ANNs, SNNs, and hybrid event-based networks be mapped and executed at scale on a SpiNNaker2-based fabric?
- RQ3: What learning algorithms (e.g., EventProp, e-prop, deep rewiring) are viable on SpiNNaker2 hardware, and what are their performance/energy profiles?
- RQ4: What architectural and software strategies enable scalable, real-time operation for large-scale neuromorphic workloads?
Key Findings
- SpiNNaker2 enables large-scale event-based and asynchronous ML on a system designed to scale to about 5 million processing elements.
- An energy-efficient 2D/6-neighbour chip interconnect supports scalable, real-time operation with dynamic voltage and frequency scaling at the core level.
- Memory-efficient sparse-to-sparse training (deep rewiring) reaches notable MNIST accuracy under a tight memory budget (64 kB) and low connectivity, with on-chip energy savings.
- Event-based backpropagation (EventProp) can be implemented on SpiNNaker2 using multicast routing to perform gradient-based learning for multi-layer SNNs.
- E-prop (online learning for SRNNs) on SpiNNaker2 achieves competitive accuracy (e.g., 91.12%) under real-time conditions with reduced memory usage.
- The system supports neuromorphic intermediate representation (NIR) and tools (py-spinnaker2) to train and deploy SNNs on SpiNNaker2 across multiple cores.
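The sparse-to-sparse training finding rests on keeping the number of active connections constant, so that the weight memory never grows. A minimal sketch of one deep-rewiring-style update is given below, assuming the common formulation in which each connection has a fixed sign and a parameter `theta`, is active only while `theta > 0`, and is replaced by a randomly chosen dormant connection once `theta` drops to zero or below. The function names, the learning rate, the L1 term, the simplified noise term, and `theta_init` are illustrative assumptions, not values or code from the paper.

```python
import random


def effective_weights(theta, sign):
    """Weight is sign * theta for active connections, zero for dormant ones."""
    return [s * t if t > 0 else 0.0 for s, t in zip(sign, theta)]


def deep_rewiring_step(theta, grad, lr=0.05, l1=1e-4, noise_std=0.0,
                       theta_init=1e-3, rng=None):
    """One rewiring step over a flat list of connection parameters.

    grad[i] is the loss gradient with respect to theta[i] (assumed given).
    Returns a new list with the same number of active connections as before.
    """
    rng = rng or random.Random(0)
    n_target = sum(1 for t in theta if t > 0)  # connectivity to preserve

    new_theta = list(theta)
    for i, t in enumerate(theta):
        if t > 0:
            # Gradient step with L1 shrinkage and optional exploration noise;
            # dormant connections are not updated.
            new_theta[i] = t - lr * (grad[i] + l1) + noise_std * rng.gauss(0.0, 1.0)

    # Connections whose parameter is no longer positive are dormant.
    dormant = [i for i, t in enumerate(new_theta) if t <= 0]
    n_active = len(new_theta) - len(dormant)

    # Reactivate randomly chosen dormant connections to restore the budget.
    for i in rng.sample(dormant, n_target - n_active):
        new_theta[i] = theta_init
    return new_theta
```

Because pruning and regrowth are balanced in every step, only the active connections need to be stored, which is what makes this style of training attractive for cores with a small local memory.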
This summary was generated by AI and reviewed by human editors.