QUICK REVIEW

[Paper Review] So2Sat LCZ42: A Benchmark Dataset for Global Local Climate Zones Classification

Xiao Xiang Zhu, Jingliang Hu | arXiv (Cornell University) | December 19, 2019

Land Use and Ecosystem Services | References: 42 | Citations: 53

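One-Line Summary

This paper presents So2Sat LCZ42, a globally distributed open benchmark dataset of about 400k Sentinel-1/2 image patches labeled with 17 Local Climate Zone (LCZ) classes, released together with a rigorous labeling workflow and baseline classification results.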

ABSTRACT

Access to labeled reference data is one of the grand challenges in supervised machine learning endeavors. This is especially true for an automated analysis of remote sensing images on a global scale, which enables us to address global challenges such as urbanization and climate change using state-of-the-art machine learning techniques. To meet these pressing needs, especially in urban research, we provide open access to a valuable benchmark dataset named "So2Sat LCZ42," which consists of local climate zone (LCZ) labels of about half a million Sentinel-1 and Sentinel-2 image patches in 42 urban agglomerations (plus 10 additional smaller areas) across the globe. This dataset was labeled by 15 domain experts following a carefully designed labeling work flow and evaluation process over a period of six months. As rarely done in other labeled remote sensing dataset, we conducted rigorous quality assessment by domain experts. The dataset achieved an overall confidence of 85%. We believe this LCZ dataset is a first step towards an unbiased globally distributed dataset for urban growth monitoring using machine learning methods, because LCZ provide a rather objective measure other than many other semantic land use and land cover classifications. It provides measures of the morphology, compactness, and height of urban areas, which are less dependent on human and culture. This dataset can be accessed from http://doi.org/10.14459/2018mp1483140.

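Research Motivation and Objectives

Provide a globally distributed, high-quality LCZ-labeled dataset that enables transferable urban LCZ classification models.
Establish a rigorous labeling workflow and quality assessment to obtain reliable labels.
Provide open access to co-registered SAR/optical image patches (Sentinel-1/2) for machine learning experiments.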

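Proposed Method

Curate 42 urban agglomerations and 10 smaller cities across the continents, and manually label LCZ polygons.
Co-register the LCZ labels with Sentinel-1 SAR and Sentinel-2 multispectral patches (320 m x 320 m per patch).
Implement a four-stage labeling workflow: learning, labeling, visual validation, and quantitative validation.
Verify labels at the polygon and pixel level by expert majority vote to estimate label confidence (reported as 85% overall).
Before rasterizing for ML use, balance the class samples through post-processing (polygon shrinking and class balancing).
Provide baseline classifiers (RF, SVM, ResNeXt-CBAM) on Sentinel-2 features.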
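
The majority-vote validation step in the list above can be illustrated with a small sketch. This is not the paper's exact procedure: the function names, the string labels, and the rule that confidence equals the share of polygons whose original label matches the voted label are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Return (winning_label, agreement_fraction) for one polygon's expert votes.

    Ties are resolved in favor of the label that was seen first, which is an
    arbitrary choice for this sketch.
    """
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


def label_confidence(original_labels, votes_per_polygon):
    """Fraction of polygons whose original label agrees with the majority vote."""
    matches = 0
    for original, votes in zip(original_labels, votes_per_polygon):
        winner, _ = majority_vote(votes)
        matches += int(winner == original)
    return matches / len(original_labels)


# Hypothetical toy data: four polygons, three expert votes each.
original = ["LCZ 2", "LCZ 6", "LCZ A", "LCZ 8"]
votes = [
    ["LCZ 2", "LCZ 2", "LCZ 3"],   # majority agrees with the original label
    ["LCZ 6", "LCZ 5", "LCZ 5"],   # majority disagrees
    ["LCZ A", "LCZ A", "LCZ A"],   # unanimous agreement
    ["LCZ 8", "LCZ 8", "LCZ 10"],  # majority agrees
]
print(label_confidence(original, votes))  # 0.75
```

In this toy case three of the four polygons keep their original label after voting, so the estimated confidence is 0.75. The 85% figure reported for the dataset is an aggregate of this general kind, computed over the experts' actual votes.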

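Experimental Results

Research Questions

RQ1: Can a globally distributed, expert-labeled LCZ dataset be created that allows generalizable LCZ classifiers to be trained?
RQ2: What are the label quality and uncertainty of LCZ annotations when they are verified by multiple experts?
RQ3: What baseline classification performance can be achieved on the LCZ42 dataset with common ML methods?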

Key Results

Classifier	OA	WA	AA	Kappa
RF	0.51	0.87	0.31	0.46
SVM	0.54	0.88	0.36	0.49
ResNeXt-CBAM	0.61	0.92	0.51	0.58

The LCZ42 dataset consists of 400,673 LCZ-labeled Sentinel-1/2 patch pairs from 52 cities.
After majority-vote refinement, the human labeling confidence is about 85%.
Baseline overall accuracy (OA) on Sentinel-2 features is 0.51 for RF, 0.54 for SVM, and 0.61 for ResNeXt-CBAM.
Weighted accuracy (WA) is 0.87 (RF), 0.88 (SVM), and 0.92 (ResNeXt-CBAM).
Average accuracy (AA) is 0.31 (RF), 0.36 (SVM), and 0.51 (ResNeXt-CBAM).
The kappa coefficient is 0.46 (RF), 0.49 (SVM), and 0.58 (ResNeXt-CBAM).
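
To make the reported metrics concrete, the sketch below computes OA, AA, and kappa from a confusion matrix. It assumes the standard definitions: OA is the fraction of correctly classified samples, AA is the mean of the per-class accuracies, and kappa is Cohen's kappa. WA is left out because this review does not describe its weighting scheme. The function name and the toy matrix are hypothetical.

```python
def classification_metrics(cm):
    """Compute (OA, AA, kappa) from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    # Overall accuracy: share of all samples on the diagonal.
    oa = correct / total

    # Average accuracy: unweighted mean of per-class accuracy (recall),
    # skipping classes that have no reference samples.
    row_totals = [sum(row) for row in cm]
    recalls = [cm[i][i] / row_totals[i] for i in range(n) if row_totals[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's kappa: agreement beyond what the row/column marginals would
    # produce by chance.
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa


# Hypothetical two-class example.
toy_cm = [[8, 2],
          [1, 9]]
oa, aa, kappa = classification_metrics(toy_cm)
print(round(oa, 2), round(aa, 2), round(kappa, 2))  # 0.85 0.85 0.7
```

Because AA weights every class equally while OA weights every sample equally, an AA well below OA, as in all three baselines, suggests that some of the less frequent classes are classified poorly.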


This review was generated by AI and reviewed by a human editor.