QUICK REVIEW

[Paper Review] Efficient Event Camera Volume System

Juan Camilo Soto, Ian Wilfred Noronha | arXiv (Cornell University) | March 16, 2026

Topic: Advanced Memory and Neural Computing | Citations: 0

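One-Line Summary

EECVS models event streams as continuous-time Dirac impulse trains, adaptively selects among DCT, DTFT, and DWT according to event density, and prunes coefficients per transform, achieving artifact-free, density-aware compression and real-time deployment while generalizing across datasets.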

ABSTRACT

Event cameras promise low latency and high dynamic range, yet their sparse output challenges integration into standard robotic pipelines. We introduce EECVS (Efficient Event Camera Volume System), a novel framework that models event streams as continuous-time Dirac impulse trains, enabling artifact-free compression through direct transform evaluation at event timestamps. Our key innovation combines density-driven adaptive selection among DCT, DTFT, and DWT transforms with transform-specific coefficient pruning strategies tailored to each domain's sparsity characteristics. The framework eliminates temporal binning artifacts while automatically adapting compression strategies based on real-time event density analysis. On EHPT-XC and MVSEC datasets, our framework achieves superior reconstruction fidelity with DTFT delivering the lowest earth mover distance. In downstream segmentation tasks, EECVS demonstrates robust generalization. Notably, our approach demonstrates exceptional cross-dataset generalization: when evaluated with EventSAM segmentation, EECVS achieves mean IoU 0.87 on MVSEC versus 0.44 for voxel grids at 24 channels, while remaining competitive on EHPT-XC. Our ROS2 implementation provides real-time deployment with DCT processing achieving 1.5 ms latency and 2.7× higher throughput than alternative transforms, establishing the first adaptive event compression framework that maintains both computational efficiency and superior generalization across diverse robotic scenarios.

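Motivation and Objectives

Motivate robust perception with event cameras in dynamic, high-contrast environments.
Develop a transform-based compression framework that adapts to varying scene densities.
Model events in continuous time to eliminate temporal binning artifacts.
Enable real-time deployment and evaluation across multiple robotics datasets.
Evaluate how well the compressed representation generalizes to downstream tasks.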

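Proposed Method

Model event streams as continuous-time Dirac impulse trains to avoid temporal binning artifacts.
Introduce density-driven selection among DCT, DTFT, and DWT based on windowed event density.
Compute coefficients as inner products with the transform basis functions under the Dirac impulse model: c_{w,k} = Σ_i p_i φ_k(t_i).
Apply transform-specific pruning that retains a fixed budget of M coefficients per window (DCT: keep the lowest frequencies; DTFT/DWT: keep the largest-magnitude coefficients).
Pack the retained coefficients into a dense representation for standard perception pipelines.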
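
The coefficient and pruning steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes p_i is the event polarity (+1 or -1), that timestamps t_i are normalized to [0, 1) within the window, and that the DCT basis is φ_k(t) = cos(π k t). All function names are hypothetical.

```python
import math

def dct_coefficients(timestamps, polarities, num_coeffs):
    """c_k = sum_i p_i * phi_k(t_i), with the assumed basis phi_k(t) = cos(pi * k * t).

    The basis is evaluated at the exact event times (assumed normalized to
    [0, 1) within one window), so no temporal binning is involved.
    """
    return [
        sum(p * math.cos(math.pi * k * t) for t, p in zip(timestamps, polarities))
        for k in range(num_coeffs)
    ]

def prune_low_frequency(coeffs, budget):
    """DCT-style pruning: keep the first `budget` (lowest-frequency) coefficients."""
    return coeffs[:budget]

def prune_top_magnitude(coeffs, budget):
    """DTFT/DWT-style pruning: keep the `budget` largest-magnitude coefficients.

    Returns (index, value) pairs sorted by index, so the position of each
    retained coefficient is preserved.
    """
    ranked = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    return [(k, coeffs[k]) for k in sorted(ranked[:budget])]

# Toy example: two events at one pixel inside a single window.
timestamps = [0.0, 0.5]
polarities = [1, -1]
coeffs = dct_coefficients(timestamps, polarities, num_coeffs=4)
low_freq = prune_low_frequency(coeffs, budget=2)
strongest = prune_top_magnitude(coeffs, budget=1)
```

With a fixed budget M per window, either pruning rule yields a fixed-size output per pixel, which is what makes packing into a dense tensor straightforward.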

Figure 1: Event-to-dense representation in EECVS. Incoming event streams are processed within the framework and converted into compact dense representations through the application of DCT, DTFT, or DWT.

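Experimental Results

Research Questions

RQ1: Can adaptive transform selection based on real-time event density improve the compression quality and efficiency of event camera streams?
RQ2: How do DCT, DTFT, and DWT compare in temporal accuracy and spatial detail preservation under sparse, medium, and dense event conditions?
RQ3: Do density-driven compression and the resulting representations generalize well across downstream tasks and datasets?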

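Key Results

DTFT achieves the lowest Earth Mover's Distance (EMD) in most reconstruction scenarios (eight of nine).
DCT processing has the lowest latency (1.5 ms) and about 2.7× higher throughput than DTFT or DWT at M = 8 coefficients.
On MVSEC at 24 channels, EECVS reaches a mean IoU of 0.87 versus 0.44 for voxel grids, showing strong cross-dataset generalization.
On EHPT-XC, EECVS remains competitive, staying within 7 IoU points of the voxel representation while offering computational advantages.
DTFT provides robust temporal fidelity across diverse scenes, while DWT is preferred for sparse patterns and DCT for dense activity, where efficiency matters.
DTFT selection yields the lowest EMD in eight experiments, and overall IoU stays stable across channel budgets (0.82 at both 16 and 24 channels).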
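
For readers unfamiliar with the two metrics reported here, the sketch below shows one simple way to compute each. It is an illustration only: this review does not state how the paper formulates EMD, so the example assumes two equal-size 1-D samples with uniform weights, and it treats IoU as the standard intersection over union of binary masks. Function names are hypothetical.

```python
def emd_1d(sample_a, sample_b):
    """Earth mover's distance between two equal-size 1-D samples, uniform weights.

    For this special case the optimal transport plan matches sorted values
    pairwise, so the distance is the mean absolute difference after sorting.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("this sketch assumes non-empty, equal-size samples")
    a, b = sorted(sample_a), sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def iou(mask_pred, mask_true):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, t in zip(mask_pred, mask_true) if p and t)
    union = sum(1 for p, t in zip(mask_pred, mask_true) if p or t)
    return intersection / union if union else 1.0

# Toy values: original vs. reconstructed timestamps, and two 4-pixel masks.
distance = emd_1d([0.0, 1.0], [0.5, 1.5])
overlap = iou([1, 1, 0, 0], [1, 0, 1, 0])
```

A lower EMD means the reconstructed event distribution sits closer to the original, while a higher IoU means the segmentation agrees more closely with the ground truth.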

Figure 2: Compression process for a single event window. Events are aggregated, transformed with a basis selected according to activity density, pruned by either low-frequency retention (DCT) or magnitude selection (DTFT/DWT), and packed into dense representations.
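
The density-driven selection step in this pipeline can be sketched as a simple threshold rule. This is a hedged illustration rather than the paper's rule. The density definition, the threshold values, and the function names are all hypothetical. The mapping follows the reported preference for DWT on sparse patterns and DCT on dense activity, and assigning DTFT to the middle range is an assumption.

```python
def event_density(num_events, window_seconds, num_pixels):
    """Events per pixel per second inside one window (an assumed definition)."""
    return num_events / (window_seconds * num_pixels)

def select_transform(density, sparse_max=0.5, dense_min=5.0):
    """Pick a transform name from the windowed event density.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    if density < sparse_max:
        return "DWT"   # sparse patterns
    if density >= dense_min:
        return "DCT"   # dense activity, where low cost matters most
    return "DTFT"      # medium density (assumed default)

# Toy example: 1,000 events in a 10 ms window on a 100,000-pixel sensor.
density = event_density(num_events=1000, window_seconds=0.01, num_pixels=100_000)
choice = select_transform(density)
```

In a real system the thresholds would need tuning per sensor and scene, since event rates vary widely with resolution and contrast settings.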


This review was generated by AI and reviewed by a human editor.