QUICK REVIEW

[Paper Review] Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving

Mina Alibeigi, William Ljungbergh | arXiv (Cornell University) | 2023-05-03

Advanced Neural Network Applications | Citations: 11

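One-Line Summary

Zenseact Open Dataset (ZOD) is a large-scale multimodal autonomous driving dataset collected in Europe. It combines high-resolution sensors with rich annotations for long-range perception, is organized into Frames, Sequences, and Drives to support a variety of tasks, and is released under the permissive CC BY-SA 4.0 license.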

ABSTRACT

Existing datasets for autonomous driving (AD) often lack diversity and long-range capabilities, focusing instead on 360° perception and temporal reasoning. To address this gap, we introduce Zenseact Open Dataset (ZOD), a large-scale and diverse multimodal dataset collected over two years in various European countries, covering an area 9x that of existing datasets. ZOD boasts the highest range and resolution sensors among comparable datasets, coupled with detailed keyframe annotations for 2D and 3D objects (up to 245m), road instance/semantic segmentation, traffic sign recognition, and road classification. We believe that this unique combination will facilitate breakthroughs in long-range perception and multi-task learning. The dataset is composed of Frames, Sequences, and Drives, designed to encompass both data diversity and support for spatio-temporal learning, sensor fusion, localization, and mapping. Frames consist of 100k curated camera images with two seconds of other supporting sensor data, while the 1473 Sequences and 29 Drives include the entire sensor suite for 20 seconds and a few minutes, respectively. ZOD is the only large-scale AD dataset released under a permissive license, allowing for both research and commercial use. More information, and an extensive devkit, can be found at https://zod.zenseact.com

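Motivation and Objectives

Address the need for diverse, long-range autonomous driving data that goes beyond what 360° perception datasets offer.
Provide multimodal sensor data and extensive annotations for a variety of perception tasks.
Enable multi-task learning and domain adaptation in AD through the Frames, Sequences, and Drives subsets.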

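Proposed Method

Describes the sensor setup and the data collection campaign, which ran for two years across Europe.
Defines three data categories (Frames, Sequences, and Drives), each intended for different tasks.
Provides manual, hierarchical annotations for semantic/instance segmentation, 2D/3D bounding boxes, and road and traffic-sign labels.
Anonymizes faces and license plates with two methods (blurring and DNAT) and releases both versions so that the impact of anonymization can be studied.
Releases the dataset under CC BY-SA 4.0 together with an extensive development kit to speed up experimentation.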

Figure 1: Geographical coverage comparison with other AD datasets using the diversity area metric defined in [27] (top left), and geographical distribution of ZOD Frames overlaid on the map. The numbers in the quantized regions represent the number of annotated frames in that geographical region.
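As a rough illustration of how an area-based coverage figure like the one in Figure 1 could be computed, the sketch below rasterizes the union of discs around ego poses onto a grid and sums the covered cells. It assumes the metric is the area lying within a fixed radius (75 m here) of any ego pose. The function name `diversity_area_km2`, the grid resolution, and the planar metre coordinates are illustrative assumptions, not the procedure from [27] or anything from the ZOD devkit.

```python
import math


def diversity_area_km2(poses_m, radius_m=75.0, cell_m=5.0):
    """Approximate the area (km^2) within radius_m of any ego pose.

    poses_m: iterable of (x, y) positions in metres in a local planar frame
             (an assumption; real GPS tracks would need projecting first).
    The union of discs is approximated by counting grid cells of size
    cell_m whose centre lies inside at least one disc.
    """
    covered = set()
    reach = int(math.ceil(radius_m / cell_m))
    for x, y in poses_m:
        cx = int(math.floor(x / cell_m))
        cy = int(math.floor(y / cell_m))
        for i in range(cx - reach, cx + reach + 1):
            for j in range(cy - reach, cy + reach + 1):
                # Centre of grid cell (i, j) in metres.
                px = (i + 0.5) * cell_m
                py = (j + 0.5) * cell_m
                if (px - x) ** 2 + (py - y) ** 2 <= radius_m ** 2:
                    covered.add((i, j))
    # Each covered cell contributes cell_m^2 square metres.
    return len(covered) * cell_m ** 2 / 1e6


if __name__ == "__main__":
    # Hypothetical track: a 2 km straight drive sampled every 10 m.
    track = [(float(s), 0.0) for s in range(0, 2001, 10)]
    print(f"approximate diversity area: {diversity_area_km2(track):.3f} km^2")
```

Because overlapping discs are counted once, repeatedly driving the same road adds nothing to the total, which is why such a metric rewards geographic spread rather than raw mileage.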

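Experimental Results

Research Questions

RQ1: How diverse and geographically extensive is ZOD compared with existing AD datasets?
RQ2: Do high-resolution sensors and long-range annotations enable robust long-range perception and multi-task learning?
RQ3: How do anonymization techniques affect downstream vision models trained on ZOD?
RQ4: How do the Frames, Sequences, and Drives subsets support different AD tasks such as perception, localization, mapping, and planning?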

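Key Findings

ZOD Frames covers 14 European countries and, with ego poses dilated by a 75 m radius, spans a diversity area of about 705 km^2, which indicates strong geographical diversity.
Objects are annotated up to 245 m away, and the combination of an 8 MP front camera with roof-mounted LiDAR provides a long perception range that is rare among datasets of comparable size.
The traffic sign taxonomy has 156 classes and 446k annotated instances, the largest number of sign instances among comparable datasets.
In the anonymization experiments, Faster R-CNN and YOLOv7 detectors trained on anonymized data show no statistically significant performance drop when evaluated on the original images.
3D object detection performance depends strongly on distance: CenterPoint's mAP falls from 0.25 at 0–150 m to 0.01 at 150–250 m, which highlights how hard long-range perception remains (Table 4).
Traffic sign recognition performs well on common signs but shows a clear long-tail problem on rare signs (macro F1 is lower than micro F1).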
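To make the macro versus micro F1 gap concrete, here is a minimal sketch for single-label classification. The class names and counts are invented toy data, not ZOD results, and `micro_macro_f1` is a hypothetical helper written for this illustration.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    if not classes:
        return 0.0, 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool counts over all classes, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: average per-class F1, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro


if __name__ == "__main__":
    # Toy long-tail setup: one common sign class, one rare sign class.
    y_true = ["speed_limit"] * 95 + ["rare_sign"] * 5
    y_pred = ["speed_limit"] * 95 + ["rare_sign"] + ["speed_limit"] * 4
    micro, macro = micro_macro_f1(y_true, y_pred)
    print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the common class is almost always right, so the pooled micro score stays high, while the rare class is mostly missed and pulls the macro average down. That is the pattern the long-tail finding describes.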

Figure 2: Placement of sensors used by data collection vehicles in ZOD and their corresponding coordinate systems.


This review was generated by AI and reviewed by a human editor.