QUICK REVIEW

[Paper Review] KITTI-CARLA: a KITTI-like dataset generated by CARLA Simulator

Jean‐Emmanuel Deschaud | arXiv (Cornell University) | Aug 17, 2021

Advanced Vision and Imaging | References: 2 | Cited by: 34

One-line summary

Seven sequences of 5000 frames each were generated, one per CARLA map (Town01–Town07).

ABSTRACT

KITTI-CARLA is a dataset built from the CARLA v0.9.10 simulator using a vehicle with sensors identical to the KITTI dataset. The vehicle thus has a Velodyne HDL64 LiDAR positioned in the middle of the roof and two color cameras similar to Point Grey Flea 2. The positions of the LiDAR and cameras are the same as the setup used in KITTI. The objective of this dataset is to test approaches of semantic segmentation LiDAR and/or images, odometry LiDAR and/or image in synthetic data and to compare with the results obtained on real data like KITTI. This dataset thus makes it possible to improve transfer learning methods from a synthetic dataset to a real dataset. We created 7 sequences with 5000 frames in each sequence in the 7 maps of CARLA providing different environments (city, suburban area, mountain, rural area, highway...). The dataset is available at: http://npm3d.fr

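Motivation and objectives

Create a realistic synthetic dataset with ground truth for semantic segmentation, instance segmentation, and odometry poses.
Provide a KITTI-like sensor layout (LiDAR and cameras) in CARLA to ease comparison with real KITTI data.
Enable evaluation of synthetic-to-real transfer learning for autonomous driving tasks.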

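Proposed method

Simulate, using CARLA v0.9.10, a vehicle carrying a Velodyne HDL-64E LiDAR and two color cameras arranged to match the KITTI sensor configuration.
Run the simulation with a fixed time step of 0.001 s so that data is acquired at 1000 Hz, which produces a rolling-shutter-like effect in the LiDAR scans.
Record seven sequences across the seven CARLA maps (Town01–Town07), with 5000 frames per sequence, including all LiDAR transforms and timestamps.
Provide ground-truth annotations for semantic and instance segmentation along with odometry poses, plus a Python tool that computes the world coordinates of points from the per-step LiDAR data.
Document the camera intrinsics, extrinsics, and LiDAR-camera calibration in ASCII files to support use of the data.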
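The per-step world-coordinate computation mentioned above can be sketched as follows. This is a minimal illustration, not the dataset's own Python tool: the pose layout (4x4 row-major LiDAR-to-world matrices), the helper names `pose_from_yaw_translation`, `transform_point`, and `points_to_world`, and the 10 m/s vehicle speed are all assumptions made for the example. Only the 0.001 s time step comes from the review.

```python
import math

# Assumption: each simulation step has its own 4x4 LiDAR-to-world pose and
# its own small batch of LiDAR points expressed in the sensor frame.
FIXED_DELTA_SECONDS = 0.001  # fixed simulation time step stated in the review


def pose_from_yaw_translation(yaw_rad, tx, ty, tz):
    """Build a hypothetical 4x4 pose: rotation about z, then translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(pose, point):
    """Apply a 4x4 homogeneous transform to one (x, y, z) point."""
    x, y, z = point
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )


def points_to_world(steps):
    """Map every point to world coordinates using the pose of its own step.

    `steps` is a list of (pose, points) pairs, one pair per simulation step.
    """
    world_points = []
    for pose, points in steps:
        world_points.extend(transform_point(pose, p) for p in points)
    return world_points


# Two consecutive steps of a vehicle moving along x (illustrative values).
speed_m_per_s = 10.0
steps = [
    (
        pose_from_yaw_translation(0.0, speed_m_per_s * k * FIXED_DELTA_SECONDS, 0.0, 0.0),
        [(1.0, 0.0, 0.0)],
    )
    for k in range(2)
]
world = points_to_world(steps)
```

Because each batch of points is transformed with the pose of the step in which it was captured, the motion of the vehicle during a scan is accounted for point by point. The actual file formats and conventions should be taken from the dataset's own documentation.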

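Experimental results

Research questions

RQ1: Can synthetic data generated in CARLA with KITTI-like sensors approximate real KITTI data on semantic segmentation and odometry tasks?
RQ2: How well do transfer learning methods trained on KITTI-CARLA transfer to real KITTI data?
RQ3: How do the different CARLA environments (city, suburban, mountain, rural, highway) affect perception tasks?
RQ4: Which ground-truth modalities (semantic/instance segmentation, poses) are provided, and how useful are they for benchmarking?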

Key findings

Seven sequences of 5000 frames each were generated, one per CARLA map (Town01–Town07).
The dataset provides full LiDAR poses at 1000 Hz and the corresponding camera timestamps, enabling precise ground-truth reconstruction.
Two cameras (1392×1024, 72° FOV) and a vertically mounted HDL-64E-like LiDAR with 64 channels were used to mirror the KITTI sensor setup.
Ground-truth annotations for semantic segmentation and instance segmentation are included, along with odometry poses for each frame.
Calibration data between the LiDAR and the cameras are saved with the dataset so that the two modalities can be used together.
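As a rough illustration of what the camera figures above imply, the sketch below derives ideal pinhole intrinsics from the 1392×1024 resolution and the 72° field of view. It assumes that the 72° value is the horizontal FOV, that pixels are square, and that the principal point sits at the image center. The function names `pinhole_intrinsics` and `project` are hypothetical, and the calibration files shipped with the dataset should be preferred over these derived values.

```python
import math


def pinhole_intrinsics(width_px, height_px, horizontal_fov_deg):
    """Return (fx, fy, cx, cy) for an ideal pinhole camera with square pixels."""
    fx = width_px / (2.0 * math.tan(math.radians(horizontal_fov_deg) / 2.0))
    fy = fx  # assumption: square pixels
    cx = width_px / 2.0  # assumption: principal point at the image center
    cy = height_px / 2.0
    return fx, fy, cx, cy


def project(point_cam, intrinsics):
    """Project a camera-frame point (x right, y down, z forward) to pixels."""
    fx, fy, cx, cy = intrinsics
    x, y, z = point_cam
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    return fx * x / z + cx, fy * y / z + cy


K = pinhole_intrinsics(1392, 1024, 72.0)
u, v = project((0.0, 0.0, 10.0), K)  # a point on the optical axis
```

Under these assumptions the focal length comes out at roughly 958 pixels, and a point on the optical axis projects to the image center.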


This review was generated by AI and checked by a human editor.