QUICK REVIEW

[Paper Review] RC-GeoCP: Geometric Consensus for Radar-Camera Collaborative Perception

Xiaokai Bai, Lianqing Zheng|arXiv (Cornell University)|2026. 02. 28.

Adversarial Robustness in Machine Learning | Citations: 0

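One-Line Summary

RC-GeoCP brings radar-anchored geometric consensus to radar-camera collaborative perception, combining Geometric Structure Rectification, Uncertainty-Aware Communication, and a Consensus-Driven Assembler to achieve state-of-the-art performance on the V2X-Radar and V2X-R benchmarks with reduced communication.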

ABSTRACT

Collaborative perception (CP) enhances scene understanding through multi-agent information sharing. While LiDAR-centric systems offer precise geometry, high costs and performance degradation in adverse weather necessitate multi-modal alternatives. Despite dense visual semantics and robust spatial measurements, the synergy between cameras and 4D radar remains underexplored in collaborative settings. This work introduces RC-GeoCP, the first framework to explore the fusion of 4D radar and images in CP. To resolve misalignment caused by depth ambiguity and spatial dispersion across agents, RC-GeoCP establishes a radar-anchored geometric consensus. Specifically, Geometric Structure Rectification (GSR) aligns visual semantics with geometry derived from radar to generate spatially grounded, geometry-consistent representations. Uncertainty-Aware Communication (UAC) formulates selective transmission as a conditional entropy reduction process to prioritize informative features based on inter-agent disagreement. Finally, the Consensus-Driven Assembler (CDA) aggregates multi-agent information via shared geometric anchors to form a globally coherent representation. We establish the first unified radar-camera CP benchmark on V2X-Radar and V2X-R, demonstrating state-of-the-art performance with significantly reduced communication overhead. Code will be released soon.

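Motivation and Objectives

Motivate robust multi-modal collaborative perception that goes beyond LiDAR-centric setups.
Use radar-derived geometry to ground camera semantics and reduce misalignment caused by depth ambiguity.
Develop selective, uncertainty-aware communication that maximizes information gain under bandwidth limits.
Propose consensus-based aggregation over shared radar anchors to achieve globally consistent fusion.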

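Proposed Method

Geometric Structure Rectification (GSR) aligns camera BEV features to radar geometry through deformable cross-attention guided by sparse radar cues.
Uncertainty-Aware Communication (UAC) computes ego-centric demand maps and selects the top-K most informative tokens to reduce bandwidth usage.
Learnable agent-wise tokens cross-attend to the unselected features, preserving residual context and mitigating information loss.
Consensus-Driven Assembler (CDA) injects radar-derived geometric consensus into the attention logits, enforcing physically grounded fusion across agents.
Multi-scale fusion gathers the rectified features together with the transmitted tokens at multiple scales to produce a coherent collaborative BEV representation.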
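
The two steps above that are easiest to make concrete are the top-K token selection in UAC and the logit injection in CDA. The following is a minimal NumPy sketch of both ideas, not the authors' implementation: this review does not give the paper's formulas, so the function names (`select_top_k_tokens`, `consensus_biased_attention`), the per-token `demand_map` scores, and the per-agent `consensus_bias` values are all hypothetical stand-ins, and plain scaled dot-product attention is assumed in place of whatever attention variant the paper uses.

```python
import numpy as np


def select_top_k_tokens(features, demand_map, k):
    """Pick the k tokens with the highest demand score (UAC-style selection).

    features:   (N, C) flattened BEV tokens held by the sending agent
    demand_map: (N,)   hypothetical per-token demand score from the ego agent
    Returns (indices, selected_features), ordered by descending demand.
    """
    k = min(k, demand_map.shape[0])
    # argsort is ascending: keep the last k entries, then reverse them
    idx = np.argsort(demand_map, kind="stable")[-k:][::-1]
    return idx, features[idx]


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def consensus_biased_attention(query, keys, values, consensus_bias):
    """Scaled dot-product attention with an additive bias on the logits.

    query:          (C,)   ego token at one BEV location
    keys, values:   (M, C) tokens from M agents at the same location
    consensus_bias: (M,)   hypothetical radar-derived agreement score per agent
    Returns (fused_token, attention_weights).
    """
    logits = keys @ query / np.sqrt(query.shape[0])
    # The bias is added before the softmax, so agents with higher
    # geometric agreement receive proportionally more attention weight.
    weights = softmax(logits + consensus_bias)
    return weights @ values, weights


# Usage: six tokens with two channels each; transmit only the three most demanded.
features = np.arange(12, dtype=float).reshape(6, 2)
demand = np.array([0.1, 0.9, 0.3, 0.8, 0.05, 0.5])
idx, sent = select_top_k_tokens(features, demand, k=3)

# Usage: two agents with identical keys, so only the bias separates them.
query = np.ones(4)
keys = np.zeros((2, 4))
values = np.array([[0.0] * 4, [4.0] * 4])
fused, weights = consensus_biased_attention(
    query, keys, values, consensus_bias=np.array([0.0, np.log(3.0)])
)
```

In the second usage example the content logits are equal for both agents, so the attention weights are determined entirely by the bias term. This is the sense in which an additive logit bias can steer fusion toward agents whose geometry agrees, independent of feature similarity.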

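Experimental Results

Research Questions

RQ1: Can radar-derived geometric cues serve as stable anchors that mitigate depth ambiguity and cross-view misalignment in radar-camera CP?
RQ2: Does uncertainty-aware, demand-driven communication improve performance under realistic bandwidth constraints?
RQ3: Can radar-based geometric consensus improve multi-agent token aggregation to yield globally consistent perception?
RQ4: How does RC-GeoCP perform on the unified radar-camera CP benchmark relative to existing radar-only, camera-only, and radar-camera baselines?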

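Key Results

RC-GeoCP achieves state-of-the-art performance on V2X-Radar val: AP@0.5 = 44.55, AP@0.7 = 25.92.
On V2X-Radar test: AP@0.5 = 42.61, AP@0.7 = 18.77.
On V2X-R val: AP@0.5 = 81.90, AP@0.7 = 65.09.
It reaches competitive accuracy with a communication volume of only 2.39 units, suggesting a strong efficiency gain.
RC-GeoCP consistently outperforms comparable radar-camera fusion methods across several backbones, with especially large gains at mid-range (30–50 m).
It remains robust under inter-frame misalignment (asynchronous settings) while still delivering substantial performance gains.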


This review was generated by AI and reviewed by a human editor.