[Paper Review] STPLS3D: A Large-Scale Synthetic and Real Aerial Photogrammetry 3D Point Cloud Dataset
This paper presents STPLS3D, a large-scale dataset that combines synthetic and real aerial photogrammetry 3D point clouds. The synthetic portion is generated by a fully automatic pipeline that uses open geospatial data to produce richly annotated point clouds, and the experiments show that it benefits real-world 3D semantic segmentation and generalization.
Although various 3D datasets with different functions and scales have been proposed recently, it remains challenging for individuals to complete the whole pipeline of large-scale data collection, sanitization, and annotation. Moreover, the created datasets usually suffer from extremely imbalanced class distribution or partial low-quality data samples. Motivated by this, we explore the procedurally synthetic 3D data generation paradigm to equip individuals with the full capability of creating large-scale annotated photogrammetry point clouds. Specifically, we introduce a synthetic aerial photogrammetry point cloud generation pipeline that takes full advantage of open geospatial data sources and off-the-shelf commercial packages. Unlike generating synthetic data in virtual games, where the simulated data usually have limited gaming environments created by artists, the proposed pipeline simulates the reconstruction process of the real environment by following the same UAV flight pattern on different synthetic terrain shapes and building densities, which ensures quality, noise patterns, and diversity similar to those of real data. In addition, the precise semantic and instance annotations can be generated fully automatically, avoiding the expensive and time-consuming manual annotation. Based on the proposed pipeline, we present a richly-annotated synthetic 3D aerial photogrammetry point cloud dataset, termed STPLS3D, with more than 16 km² of landscapes and up to 18 fine-grained semantic categories. For verification purposes, we also provide a parallel dataset collected from four areas in the real environment. Extensive experiments conducted on our datasets demonstrate the effectiveness and quality of the proposed synthetic dataset.
Motivation & Objective
- Enable individuals to carry out the end-to-end creation of large-scale annotated photogrammetry 3D point clouds.
- Provide a fully automatic, controllable pipeline to generate photorealistic synthetic data aligned with real-world UAV workflows.
- Balance class distribution and enable automatic semantic and instance annotations.
- Demonstrate the usefulness of synthetic data for improving real-world 3D semantic segmentation and generalization.
Proposed method
- Procedural generation of 3D city blocks using GIS data (OSM footprints, road networks, DSM) and CGA-based CityEngine models for diverse building styles.
- Rendering 2D images in Unreal Engine 4 with weather effects to simulate realistic photogrammetry inputs, followed by reconstruction with ContextCapture.
- Automatic transfer of 2D labels to 3D photogrammetry points via a proxy ray-casted point cloud and nearest-neighbor labeling with ground-connected components to improve alignment.
- Automatic semantic and instance annotations generated as byproducts of the rendering and reconstruction pipeline.
- Large-scale synthetic datasets (SyntheticV1, SyntheticV2, SyntheticV3) covering ~16 km², with up to 18 semantic classes and 14 instance classes, plus four real-world sites (USC, WMSC, OCCC, RA).
- Evaluation framework integrating state-of-the-art 3D semantic segmentation and instance segmentation baselines on STPLS3D.
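The nearest-neighbor label transfer mentioned in the list above can be illustrated with a minimal sketch. It assumes the proxy point cloud is available as labeled (x, y, z) tuples and uses a brute-force search for clarity. The function name `transfer_labels`, the `max_dist` cutoff, and the `unlabeled` value are hypothetical and do not come from the paper, and the ground-connected-component refinement is not reproduced here.

```python
import math

def transfer_labels(proxy_points, proxy_labels, target_points,
                    max_dist=None, unlabeled=-1):
    """Give each target point the label of its nearest proxy point.

    proxy_points / target_points: sequences of (x, y, z) tuples.
    max_dist: hypothetical cutoff; targets farther than this from every
    proxy point receive the `unlabeled` value instead of a label.
    """
    if len(proxy_points) != len(proxy_labels):
        raise ValueError("proxy_points and proxy_labels must have equal length")
    if not proxy_points:
        raise ValueError("proxy point cloud is empty")

    labels = []
    for target in target_points:
        # Brute-force search; a real pipeline would use a spatial index
        # such as a k-d tree to handle millions of points.
        nearest = min(range(len(proxy_points)),
                      key=lambda i: math.dist(target, proxy_points[i]))
        distance = math.dist(target, proxy_points[nearest])
        if max_dist is not None and distance > max_dist:
            labels.append(unlabeled)
        else:
            labels.append(proxy_labels[nearest])
    return labels
```

In this sketch the cutoff is one simple way to avoid labeling reconstructed points that have no nearby proxy point; whether the authors apply such a threshold is not stated in this review.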
Experimental results
Research questions
- RQ1: Can a fully automatic synthetic data generation pipeline produce photorealistic, domain-appropriate 3D aerial photogrammetry data with automatic annotations?
- RQ2: Do synthetic STPLS3D data improve real-world 3D semantic segmentation performance and generalization when used for training?
- RQ3: What are the limitations and domain gaps between synthetic and real outdoor photogrammetry data, and how do they affect object-level tasks?
- RQ4: How does mixing real and synthetic data affect performance for large-scale outdoor scenes?
- RQ5: What is the relative annotation quality and cost savings when using automatic annotations versus manual labeling?
Key findings
- The STPLS3D dataset combines >16 km² of synthetic terrain with a 1.27 km² real subset across 4 real sites, with up to 18 semantic and 14 instance classes.
- All tested baselines improve when trained on synthetic data compared to real data alone, indicating synthetic data provides beneficial diversity and scale.
- Training with real + synthetic data yields the best mIoU across baselines (e.g., KPConv gains nearly 8 percentage points in mIoU).
- Synthetic data enhances generalization: PointTransformer’s mIoU improves by about 13% when augmented with synthetic data on FDc cross-dataset evaluation.
- Instance segmentation on the synthetic subset shows HAIS outperforms PointGroup but remains challenging for outdoor-scale scenes, highlighting outdoor-domain gaps versus indoor datasets.
- The proposed pipeline enables automatic, scalable generation of richly annotated 3D photogrammetry data with comparable noise characteristics to real data, reducing annotation costs.
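The findings above are reported in mIoU (mean intersection over union). As a reminder of how that metric is commonly computed for point-wise semantic labels, here is a minimal sketch. The function names `per_class_iou` and `mean_iou` are hypothetical, and skipping classes that appear in neither the ground truth nor the predictions is an assumption; the paper's evaluation code may handle that case differently.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """Return a list of IoU values, one per class (None if the class is absent)."""
    tp = [0] * num_classes  # points correctly assigned to the class
    fp = [0] * num_classes  # points wrongly assigned to the class
    fn = [0] * num_classes  # points of the class that were missed
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1
            fp[p] += 1
    ious = []
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / union if union else None)
    return ious

def mean_iou(y_true, y_pred, num_classes):
    """Average IoU over the classes present in labels or predictions."""
    valid = [v for v in per_class_iou(y_true, y_pred, num_classes) if v is not None]
    if not valid:
        raise ValueError("no class appears in labels or predictions")
    return sum(valid) / len(valid)
```

Because mIoU is itself a percentage, a gain is unambiguous only when stated in percentage points; for example, a change from 0.50 to 0.58 mIoU is 8 percentage points but a 16% relative improvement.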
This review was created by AI and reviewed by human editors.