QUICK REVIEW

[Paper Review] DublinCity: Annotated LiDAR Point Cloud and its Applications

S. M. Iman Zolanvari, Susana Ruano|arXiv (Cornell University)|Sep 6, 2019

Remote Sensing and LiDAR Applications | References: 34 | Citations: 50

One-sentence summary

A densely annotated, city-scale ALS LiDAR dataset for Dublin with 260 million labeled points across 13 classes, used to train CNNs for 3D object classification and to evaluate image-based 3D reconstruction against LiDAR ground truth.

ABSTRACT

Scene understanding of full-scale 3D models of an urban area remains a challenging task. While advanced computer vision techniques offer cost-effective approaches to analyse 3D urban elements, a precise and densely labelled dataset is quintessential. The paper presents the first-ever labelled dataset for a highly dense Aerial Laser Scanning (ALS) point cloud at city-scale. This work introduces a novel benchmark dataset that includes a manually annotated point cloud for over 260 million laser scanning points into 100'000 (approx.) assets from Dublin LiDAR point cloud [12] in 2015. Objects are labelled into 13 classes using hierarchical levels of detail from large (i.e., building, vegetation and ground) to refined (i.e., window, door and tree) elements. To validate the performance of our dataset, two different applications are showcased. Firstly, the labelled point cloud is employed for training Convolutional Neural Networks (CNNs) to classify urban elements. The dataset is tested on the well-known state-of-the-art CNNs (i.e., PointNet, PointNet++ and So-Net). Secondly, the complete ALS dataset is applied as detailed ground truth for city-scale image-based 3D reconstruction.

Motivation and objectives

Create a manually annotated, city-scale LiDAR point cloud for Dublin that provides dense, hierarchical labels across urban elements.
Enable rigorous evaluation of 3D object classification using state-of-the-art CNNs on real-world, outdoor ALS data.
Provide a ground truth reference for assessing image-based 3D reconstruction against dense LiDAR measurements.

Proposed method

Manually label over 260 million points from a 1.4 billion-point Dublin ALS dataset into ~100,000 assets across 13 classes with three hierarchical levels.
Use CloudCompare to segment and label the data, refining from coarse classes (building, ground, vegetation, undefined) to fine classes (roof, façade, doors, windows).
Train and evaluate three CNN-based models (PointNet, PointNet++, SO-Net) on 3982 labeled objects across 5 classes (doors, windows, façades, roofs, trees).
Apply COLMAP to generate image-based reconstructions from two image sets (top-view and oblique aerial images) and register them to LiDAR via GPS priors and ICP refinement.
Compare image-based reconstructions to the LiDAR ground truth using precision, recall, and F-score metrics per tile.
Provide the dataset publicly for community use and future per-class evaluation in tasks like segmentation, GIS analysis, and urban modeling.
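
The per-tile comparison in the steps above can be illustrated with a small sketch. It assumes the common threshold-based definitions: precision is the share of reconstructed points that lie within a distance threshold of some LiDAR point, recall is the share of LiDAR points that lie within the threshold of some reconstructed point, and the F-score is their harmonic mean. The function names and the threshold value are hypothetical and are not taken from the paper, and both clouds are assumed to be already registered in the same coordinate frame.

```python
import numpy as np


def nearest_distances(src, dst):
    """Distance from every point in src to its nearest neighbour in dst.

    Brute force over all pairs, so only suitable for small samples;
    a spatial index (for example a KD-tree) would be needed per tile at city scale.
    """
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def precision_recall_fscore(reconstruction, ground_truth, threshold):
    """Threshold-based precision, recall and F-score between two point clouds.

    reconstruction: (N, 3) array of image-based reconstruction points.
    ground_truth:   (M, 3) array of LiDAR points in the same frame.
    threshold:      matching distance in coordinate units (an assumed value).
    """
    rec = np.asarray(reconstruction, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)

    # Precision: reconstructed points that have a LiDAR point nearby.
    precision = float(np.mean(nearest_distances(rec, gt) < threshold))
    # Recall: LiDAR points that are covered by the reconstruction.
    recall = float(np.mean(nearest_distances(gt, rec) < threshold))

    total = precision + recall
    fscore = 2.0 * precision * recall / total if total > 0 else 0.0
    return precision, recall, fscore
```

Calling `precision_recall_fscore(tile_reconstruction, tile_lidar, threshold=0.1)` once per tile would give one (precision, recall, F-score) triple per tile, which matches the tile-wise reporting described in this review.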

Experimental results

Research questions

RQ1: How dense and accurate can a city-scale manually labeled ALS LiDAR dataset be made for urban elements?
RQ2: How effective are modern CNNs (PointNet, PointNet++, SO-Net) at classifying urban elements using real, dense LiDAR data?
RQ3: How well can image-based 3D reconstructions approximate dense LiDAR ground truth across city-scale scenes?

Key findings

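Table 1: Classification accuracy (%) by number of input points per object
Points	PointNet avg. class	PointNet overall	PointNet++ avg. class	PointNet++ overall	SO-Net avg. class	SO-Net overall
512	24.17	35.17	39.47	45.56	41.89	48.74
1024	38.84	50.13	44.65	62.91	45.73	63.54
2048	46.77	59.68	49.23	63.42	49.34	64.55
4096	48.77	60.68	51.23	64.42	50.34	65.55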
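
Each row of the table corresponds to a fixed number of input points per object. This review does not say how objects were brought to that size, so the following is only a sketch of one common approach: draw points at random, without replacement when the object has enough points and with replacement otherwise. The function name `sample_fixed_size` is hypothetical.

```python
import numpy as np


def sample_fixed_size(points, n_points, rng=None):
    """Return exactly n_points rows drawn at random from a (N, 3) point array.

    Assumed strategy, not confirmed by the source: sample without replacement
    when N >= n_points, otherwise repeat points by sampling with replacement.
    """
    rng = np.random.default_rng() if rng is None else rng
    points = np.asarray(points, dtype=float)
    replace = len(points) < n_points
    idx = rng.choice(len(points), size=n_points, replace=replace)
    return points[idx]
```

With a helper like this, the same labeled object can be fed to a network at 512, 1024, 2048, or 4096 points, which is the axis along which the table compares the models.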

The DublinCity dataset contains approximately 260 million labeled points across 100,000 assets in 13 hierarchical classes with an average density of about 348.43 points/m^2.
Of the three tested models (PointNet, PointNet++, SO-Net), SO-Net achieved the highest overall accuracy at every point count reported in Table 1.
Classification scores improve with more input points per object (512–4096), with SO-Net reaching an Overall accuracy of 65.55% at 4096 points.
Image-based reconstructions (top-view and oblique) produce dense point clouds, but the LiDAR dataset remains more than four times denser; in most tiles, reconstructions from oblique images lie closer on average to the LiDAR ground truth.
Precision, recall, and F-score analyses show tile-dependent performance for image-based reconstructions, with oblique imagery often closer to ground truth than top-view.
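
The classification results are reported as two figures per model, an average class score and an overall score. This review does not define them, so the sketch below assumes the usual meanings: overall accuracy is the fraction of all objects classified correctly, and average class accuracy is the unweighted mean of the per-class accuracies. Under that assumption, a gap between the two usually points to class imbalance, because a frequent class dominates the overall figure. The function name is hypothetical.

```python
import numpy as np


def overall_and_mean_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean per-class accuracy) for label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = y_true == y_pred

    # Overall accuracy: every object counts equally.
    overall = float(correct.mean())
    # Mean class accuracy: every class counts equally, whatever its size.
    per_class = [float(correct[y_true == c].mean()) for c in np.unique(y_true)]
    return overall, float(np.mean(per_class))


# Toy example with an imbalanced label set (8 windows, 2 doors).
y_true = ["window"] * 8 + ["door"] * 2
y_pred = ["window"] * 8 + ["window", "door"]
overall, mean_class = overall_and_mean_class_accuracy(y_true, y_pred)
print(overall, mean_class)
```

In the toy example one of the two doors is misclassified, which costs little in overall accuracy (9 of 10 correct) but halves the accuracy of the door class, so the mean class accuracy drops further than the overall figure.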

This review was generated by AI and checked by a human editor.