QUICK REVIEW

[Paper Review] Bullying10K: A Large-Scale Neuromorphic Dataset towards Privacy-Preserving Bullying Recognition

Yiting Dong, Yang Li | arXiv (Cornell University) | June 20, 2023

Adversarial Robustness in Machine Learning | Citations: 9

One-Line Summary

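Bullying10K introduces a large-scale DVS-based dataset for privacy-preserving recognition of bullying and violent behavior. It comprises 10,000 event segments and 12 billion events, and provides benchmarks for action recognition, temporal action localization, and pose estimation.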

ABSTRACT

The prevalence of violence in daily life poses significant threats to individuals' physical and mental well-being. Using surveillance cameras in public spaces has proven effective in proactively deterring and preventing such incidents. However, concerns regarding privacy invasion have emerged due to their widespread deployment. To address the problem, we leverage Dynamic Vision Sensor (DVS) cameras to detect violent incidents and preserve privacy, since they capture pixel brightness variations instead of static imagery. We introduce the Bullying10K dataset, encompassing various actions, complex movements, and occlusions from real-life scenarios. It provides three benchmarks for evaluating different tasks: action recognition, temporal action localization, and pose estimation. With 10,000 event segments, totaling 12 billion events and 255 GB of data, Bullying10K contributes significantly by balancing violence detection and personal privacy preservation. It also poses a new challenge for neuromorphic datasets. It will serve as a valuable resource for training and developing privacy-protecting video systems. Bullying10K opens new possibilities for innovative approaches in these domains.

Motivation and Objectives

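Motivate the need for privacy-preserving violence detection in public surveillance.
Provide a large-scale neuromorphic dataset captured with Dynamic Vision Sensors (DVS).
Enable evaluation of action recognition, temporal action localization, and pose estimation on event-based data.
Provide benchmarks for advancing privacy-preserving video systems in real-world scenarios.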

Proposed Method

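Collect multi-view event streams of violent and friendly actions using two Davis346 DVS cameras.
Annotate the dataset with action categories, camera positions, lighting conditions, and pose keypoints derived from an RGB-aligned pose estimation tool.
Convert raw event streams into frames using 10 ms time bins for model input.
Evaluate several action recognition, temporal localization, and pose estimation models on the DVS data.
Compare privacy-protection approaches on RGB and DVS data to assess the privacy-performance trade-off.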
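
The 10 ms binning step can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the `Event` record, the microsecond timestamps, the two-channel (ON/OFF) count representation, and the function name `events_to_frames` are all assumptions made for illustration. Only the 10 ms bin width comes from the review.

```python
from collections import namedtuple

# Hypothetical event record (an assumption, not the dataset's file format):
# timestamp in microseconds, pixel coordinates, polarity (+1 = ON, -1 = OFF).
Event = namedtuple("Event", ["t_us", "x", "y", "polarity"])


def events_to_frames(events, width, height, bin_us=10_000):
    """Accumulate events into per-bin count frames (10 ms bins by default).

    Returns a list of frames. Each frame is [on_counts, off_counts], and each
    channel is a height x width nested list of integer event counts.
    """
    if not events:
        return []
    t0 = min(e.t_us for e in events)
    t_last = max(e.t_us for e in events)
    n_bins = (t_last - t0) // bin_us + 1
    frames = [
        [[[0] * width for _ in range(height)] for _ in range(2)]
        for _ in range(n_bins)
    ]
    for e in events:
        b = (e.t_us - t0) // bin_us          # index of the time bin
        channel = 0 if e.polarity > 0 else 1  # ON events -> 0, OFF events -> 1
        frames[b][channel][e.y][e.x] += 1
    return frames


# Toy usage: two ON events in the first 10 ms, one OFF event in the second.
toy_events = [
    Event(0, 1, 0, +1),
    Event(4_000, 1, 0, +1),
    Event(12_000, 2, 1, -1),
]
toy_frames = events_to_frames(toy_events, width=4, height=3)
```

Keeping ON and OFF counts in separate channels is one common choice for event frames. Whether the paper uses counts, binary maps, or another encoding is not stated in this review.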

Figure 1: Visualization of the Bullying10K dataset. For each example, the right section illustrates the stream of events captured by a Dynamic Visual Sensor (DVS) camera, showcasing the dynamic changes in brightness at each pixel. The left section demonstrates the related event frame transformed fro

Experimental Results

Research Questions

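RQ1: Can a large-scale event-based dataset capture complex, fast, and occluded violent actions while preserving privacy?
RQ2: How well do state-of-the-art action recognition, localization, and pose estimation methods perform for violence detection on neuromorphic DVS data?
RQ3: What are the privacy implications and performance trade-offs of applying privacy-protection techniques to the RGB versus DVS modalities in violent-scene recognition?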

Key Findings

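Bullying10K contains 10,000 event segments, totaling 12 billion events and 255 GB of data.
The dataset covers 10 actions (6 violent, 4 friendly) and provides three benchmarks: action recognition, temporal action localization, and pose estimation.
DVS-based action recognition is robust to lighting changes and motion blur, and often outperforms RGB-derived baselines in privacy-preserving settings.
Temporal action localization and pose estimation on Bullying10K remain highly challenging, reflecting the dataset's complexity and the need for specialized event-based models.
The analysis covers keypoint motion, event polarity distribution, and IoU distribution to characterize event dynamics and occlusion.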
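
This review does not say how the IoU distribution is computed. One plausible reading, stated here as an assumption, is that IoU measures the overlap between the bounding boxes of the two people in a frame, with each box derived from that person's pose keypoints. The sketch below follows that reading. The helper names `box_from_keypoints` and `box_iou` are hypothetical.

```python
def box_from_keypoints(keypoints):
    """Tight axis-aligned box (x_min, y_min, x_max, y_max) around (x, y) keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Toy usage: two people whose boxes overlap by half of each box's width.
person_a = box_from_keypoints([(0, 0), (2, 4)])
person_b = box_from_keypoints([(1, 0), (3, 4)])
overlap = box_iou(person_a, person_b)  # 4 / 12, i.e. one third
```

Under this assumption, a higher IoU between the two boxes suggests closer physical interaction and a higher chance that one person occludes the other.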

Figure 2: The flow of the data acquisition process. We employed two DVS cameras, positioned on the left and right sides, respectively. Following the recording, the DVS outputs an event stream for pre-processing. This processed data was then employed for three distinct tasks: action recognition, temp


This review was generated by AI and reviewed by a human editor.