QUICK REVIEW

[Paper Review] Bullying10K: A Large-Scale Neuromorphic Dataset towards Privacy-Preserving Bullying Recognition

Yiting Dong, Yang Li|arXiv (Cornell University)|Jun 20, 2023

Adversarial Robustness in Machine Learning | Cited by 9

One-line summary

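Bullying10K is a large-scale DVS-based dataset for privacy-preserving recognition of bullying and violent behavior. It provides benchmarks for action recognition, temporal action localization, and pose estimation, with 10,000 event segments totaling 12 billion events.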

ABSTRACT

The prevalence of violence in daily life poses significant threats to individuals' physical and mental well-being. Using surveillance cameras in public spaces has proven effective in proactively deterring and preventing such incidents. However, concerns regarding privacy invasion have emerged due to their widespread deployment. To address the problem, we leverage Dynamic Vision Sensor (DVS) cameras to detect violent incidents while preserving privacy, since they capture pixel brightness variations instead of static imagery. We introduce the Bullying10K dataset, encompassing various actions, complex movements, and occlusions from real-life scenarios. It provides three benchmarks for evaluating different tasks: action recognition, temporal action localization, and pose estimation. With 10,000 event segments, totaling 12 billion events and 255 GB of data, Bullying10K contributes significantly by balancing violence detection and personal privacy preservation. It also poses a challenge to the neuromorphic dataset. It will serve as a valuable resource for training and developing privacy-protecting video systems. The Bullying10K opens new possibilities for innovative approaches in these domains.

Motivation and objectives

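Motivate privacy-preserving violence detection in public surveillance settings.
Provide a large-scale neuromorphic dataset captured with Dynamic Vision Sensors (DVS).
Enable evaluation of action recognition, temporal action localization, and pose estimation on event-based data.
Provide benchmarks that advance privacy-preserving video systems in real-world scenarios.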

Proposed method

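Use two Davis346 DVS cameras to collect multi-view event streams of violent and friendly actions.
Annotate the dataset with action category, camera position, lighting, and pose keypoints derived from an RGB-aligned pose estimation tool.
Convert the raw event streams into frames with a 10 ms time unit for use as model input.
Evaluate multiple action recognition, temporal action localization, and pose estimation models on the DVS data.
Compare privacy-preserving approaches on RGB versus DVS data to assess the trade-off between privacy and performance.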
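
The event-to-frame conversion step can be illustrated with a minimal sketch. This is not the authors' preprocessing code. It assumes that each event is a tuple (timestamp, x, y, polarity), that timestamps are in microseconds, and that a frame stores per-pixel event counts in two polarity channels. The function name events_to_frames and its parameters are hypothetical. The default resolution of 346 x 260 pixels is that of the Davis346 sensor.

```python
import numpy as np

WIDTH, HEIGHT = 346, 260   # Davis346 sensor resolution
BIN_US = 10_000            # 10 ms, assuming timestamps are in microseconds


def events_to_frames(t, x, y, p, bin_us=BIN_US, width=WIDTH, height=HEIGHT):
    """Accumulate events into 2-channel count frames, one frame per time bin.

    t: timestamps (assumed microseconds); x, y: pixel coordinates;
    p: polarity (1 = brightness increase, 0 = brightness decrease).
    Returns an int32 array of shape (num_bins, 2, height, width).
    """
    t = np.asarray(t, dtype=np.int64)
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    p = np.asarray(p, dtype=np.int64)
    if t.size == 0:
        return np.zeros((0, 2, height, width), dtype=np.int32)

    # Index of the 10 ms bin each event falls into, relative to the first event.
    bins = (t - t.min()) // bin_us
    num_bins = int(bins.max()) + 1

    frames = np.zeros((num_bins, 2, height, width), dtype=np.int32)
    # np.add.at accumulates correctly when the same pixel fires repeatedly.
    np.add.at(frames, (bins, p, y, x), 1)
    return frames
```

For example, four events spread over 25 ms yield three frames, and two positive events at the same pixel within the first 10 ms give a count of 2 in that frame's positive channel. Counting is only one possible representation; binary frames or signed accumulation would be equally plausible readings of the description above.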

Figure 1: Visualization of the Bullying10K dataset. For each example, the right section illustrates the stream of events captured by a Dynamic Visual Sensor (DVS) camera, showcasing the dynamic changes in brightness at each pixel. The left section demonstrates the related event frame transformed fro

Experimental results

Research questions

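RQ1: Can a large-scale event-based dataset capture complex, fast, and occluded violent actions while preserving privacy?
RQ2: How do state-of-the-art action recognition, temporal action localization, and pose estimation methods perform on neuromorphic DVS data for violence detection?
RQ3: What are the privacy implications and performance trade-offs of applying privacy-preserving techniques to the RGB and DVS modalities in violent-scene recognition?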

Key findings

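Bullying10K contains 10,000 event segments totaling 12 billion events and 255 GB of data.
The dataset covers 10 actions (6 violent, 4 friendly) and provides three benchmarks: action recognition, temporal action localization, and pose estimation.
DVS-based action recognition is robust to lighting changes and motion blur, and often outperforms RGB-derived baselines under privacy-preserving settings.
Temporal action localization and pose estimation on Bullying10K remain highly challenging, reflecting the complexity of the dataset and the need for models specialized for event data.
The analysis includes keypoint motion, event polarity distribution, and IoU distribution, which characterize event dynamics and occlusion.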
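
Two of the analysis statistics named above can be sketched in a few lines. The review does not say how either is computed, so this is an illustration under stated assumptions: polarity is encoded as 0/1, and the IoU is taken between the axis-aligned bounding boxes of two subjects as a rough proxy for occlusion. The helper names polarity_ratio and box_iou are hypothetical.

```python
import numpy as np


def polarity_ratio(p):
    """Fraction of positive (brightness-increase) events, assuming 0/1 polarity."""
    p = np.asarray(p)
    return float(p.mean()) if p.size else 0.0


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Under this reading, a higher IoU between the two subjects' boxes in a frame suggests more overlap and therefore more likely occlusion. If the IoU in the analysis instead refers to temporal overlap between predicted and ground-truth action intervals, the same formula applies in one dimension.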

Figure 2: The flow of the data acquisition process. We employed two DVS cameras, positioned on the left and right sides, respectively. Following the recording, the DVS outputs an event stream for pre-processing. This processed data was then employed for three distinct tasks: action recognition, temp

This review was generated by AI and checked by a human editor.