QUICK REVIEW

[Paper Review] Speck: A Smart event-based Vision Sensor with a low latency 327K Neuron Convolutional Neuronal Network Processing Pipeline

Ole Richter, Yannan Xing|arXiv (Cornell University)|Apr 13, 2023

Advanced Memory and Neural Computing | References: 49 | Citations: 8

One-line summary

Speck1 is an asynchronous event-based vision sensor with an on-chip 327K-neuron sCNN pipeline. It achieves a latency of 3.36 µs per event and high throughput for edge vision tasks.

ABSTRACT

Edge computing solutions that enable the extraction of high-level information from a variety of sensors are in increasingly high demand. This is due to the increasing number of smart devices that require sensory processing for their application on the edge. To tackle this problem, we present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip. By combining both sensor and processing on a single die, we can lower unit production costs significantly. Moreover, the simple end-to-end nature of the SoC facilitates small stand-alone applications as well as functioning as an edge node in larger systems. The event-driven nature of the vision sensor delivers high-speed signals in a sparse data stream. This is reflected in the processing pipeline, which focuses on optimising highly sparse computation and minimising latency for 9 sCNN layers to 3.36 µs for an incoming event. Overall, this results in an extremely low-latency visual processing pipeline deployed on a small form factor with a low energy budget and sensor cost. We present the asynchronous architecture, the individual blocks, and the sCNN processing principle and benchmark against other sCNN capable processors.

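Research motivation and objectives

Address the need for fast, low-power sensory processing at the edge.
Demonstrate a single-chip system that integrates an event-based camera with a low-power asynchronous sCNN processor.
Achieve ultra-low latency and highly sparsity-driven computation for real-time vision tasks.
Benchmark Speck1 against other sCNN processors in terms of latency, throughput, and energy efficiency.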

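Proposed method

Design and implement a 128x128 event-based vision pixel sensor with temporal contrast encoding on a single ASIC.
Develop a 9-layer asynchronous spiking CNN (sCNN) pipeline using in-memory computation and quasi-delay-insensitive (QDI) dual-rail (DR) encoding with a 4-phase handshake.
Integrate a Network on Chip (NoC) with a star topology that routes AER events to up to two destinations with minimal contention.
Build a sensor event pre-processing block for ROI selection, pooling, rotation/flipping, polarity filtering, and source mapping.
Implement convolution cores with LIF-like dynamics, comprising a kernel anchor, an address sweep, and in-memory neuron compute units.
Provide a readout core that reports multi-channel spike/class counts and time-based statistics through an assignable asynchronous-to-synchronous interface.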
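
The pre-processing and convolution steps listed above can be illustrated with a minimal software sketch. It is not the chip's implementation: the function and class names (`preprocess`, `EventConvLayer`), the order of the pre-processing stages, the leak-free integrate-and-fire neuron, and the reset-by-subtraction rule are all assumptions made for illustration. The point it shows is that each incoming event updates only the neurons inside the kernel footprint around its address, so work scales with the number of events rather than with frame size.

```python
import numpy as np


def preprocess(event, roi=(0, 0, 128, 128), flip_x=False, pool=1,
               keep_polarity=None, width=128):
    """Filter and remap one sensor event (x, y, polarity).

    Returns the transformed event, or None if the event is dropped.
    The order of the stages is an assumption of this sketch.
    """
    x, y, p = event
    x0, y0, x1, y1 = roi
    if not (x0 <= x < x1 and y0 <= y < y1):
        return None                      # outside the region of interest
    if keep_polarity is not None and p != keep_polarity:
        return None                      # polarity filter
    if flip_x:
        x = width - 1 - x                # horizontal mirror
    return (x // pool, y // pool, p)     # pooling merges pool x pool pixels


class EventConvLayer:
    """Event-driven convolution with integrate-and-fire neurons (no leak)."""

    def __init__(self, kernel, height, width, threshold=1.0):
        # kernel shape: (out_channels, in_channels, k, k), with k odd
        self.kernel = np.asarray(kernel, dtype=float)
        self.v = np.zeros((self.kernel.shape[0], height, width))
        self.threshold = threshold

    def process(self, x, y, channel):
        """Apply one input event; return output spikes as (x, y, out_channel)."""
        out_ch, _, k, _ = self.kernel.shape
        _, h, w = self.v.shape
        r = k // 2
        spikes = []
        # Sweep the neighbourhood anchored at the event address.
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w):
                    continue             # neuron would lie outside the layer
                for o in range(out_ch):
                    # Weight linking input (x, y) to output neuron (nx, ny).
                    self.v[o, ny, nx] += self.kernel[o, channel, r - dy, r - dx]
                    if self.v[o, ny, nx] >= self.threshold:
                        self.v[o, ny, nx] -= self.threshold  # reset by subtraction
                        spikes.append((nx, ny, o))
        return spikes
```

With a single 3x3 kernel whose weights are all 0.5 and a threshold of 1.0, the first event at an interior address produces no output spikes, and a second event at the same address makes all nine neighbouring neurons fire.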

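Experimental results

Research questions

RQ1: Can an integrated event-based sensor with a dedicated sCNN pipeline achieve ultra-low latency while maintaining practical accuracy on real-time vision tasks?
RQ2: How does the asynchronous, event-driven architecture compare with frame-based CNN accelerators in latency, throughput, and energy efficiency for edge vision applications?
RQ3: What advantages, in area and power, do on-chip synaptic memory and real-time kernel computation for sparse event processing offer over other large-scale SNN processors (e.g., Loihi1/2)?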

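Key findings

Per-event latency through the 9-layer conv+pooling sCNN, measured at the ASIC input and output edges, is 3.36 µs.
The architecture supports a throughput of about 30 M events/s per neuron compute unit, enabling high parallelism for sparse event streams.
On the spike-converted NMNIST benchmark, Speck1 achieves an on-chip accuracy of 86.17% with ANN2SNN and 98.56% under the offline BPTT-CNN training regime.
Compared with frame-based CNN accelerators, Speck1 achieves far lower latency on event-based input and processes events in real time without full-frame batch processing.
Synaptic memory and on-the-fly kernel computation increase synapse utilization at a comparable or lower area/energy budget than some large SNN processors (e.g., Loihi1/2).
The reported energy per inference for Speck1 is about 141 µJ (ANN2SNN) and 180 µJ (BPTT-CNN) in the evaluated configurations.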
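
To put the figures above in perspective, the following back-of-envelope arithmetic derives a few secondary numbers from them. The inputs are the values reported in this review; the derived quantities are not reported in the source. In particular, splitting the 3.36 µs latency evenly across the 9 layers is an assumption of this sketch, and the name `derived_figures` is hypothetical.

```python
# Values quoted in the findings above.
LATENCY_US = 3.36                    # per-event latency through 9 sCNN layers
LAYERS = 9
THROUGHPUT_EVENTS_PER_S = 30e6       # per neuron compute unit
ENERGY_UJ = {"ANN2SNN": 141.0, "BPTT-CNN": 180.0}


def derived_figures():
    """Return rough secondary figures; none of them appear in the source."""
    return {
        # Assumes the latency is split evenly across layers.
        "avg_latency_per_layer_ns": LATENCY_US * 1e3 / LAYERS,
        # Shortest sustained spacing between events for one compute unit.
        "min_event_interval_ns": 1e9 / THROUGHPUT_EVENTS_PER_S,
        # Relative energy cost of the BPTT-CNN configuration.
        "energy_ratio_bptt_to_ann2snn": ENERGY_UJ["BPTT-CNN"] / ENERGY_UJ["ANN2SNN"],
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, each layer adds about 373 ns on average, one compute unit handles an event roughly every 33 ns, and the BPTT-CNN configuration uses about 1.28 times the energy per inference of the ANN2SNN one.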

This review was generated by AI and checked by a human editor.