QUICK REVIEW

[Paper Review] Poverty Mapping Using Convolutional Neural Networks Trained on High and Medium Resolution Satellite Images, With an Application in Mexico

Boris Babenko, Jonathan Hersh | arXiv (Cornell University) | Nov 16, 2017

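One-Sentence Summary

This paper trains convolutional neural networks (CNNs) on high- and medium-resolution satellite imagery to estimate poverty at the municipality level in Mexico. When CNN predictions are combined with land-use classifications derived from Planet imagery, the best model explains up to 57% of the variation in poverty in a validation sample of 10% of MCS-ENIGH municipalities, which supports the feasibility of end-to-end poverty mapping from satellite data.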

ABSTRACT

Mapping the spatial distribution of poverty in developing countries remains an important and costly challenge. These "poverty maps" are key inputs for poverty targeting, public goods provision, political accountability, and impact evaluation, that are all the more important given the geographic dispersion of the remaining bottom billion severely poor individuals. In this paper we train Convolutional Neural Networks (CNNs) to estimate poverty directly from high and medium resolution satellite images. We use both Planet and Digital Globe imagery with spatial resolutions of 3-5 sq. m. and 50 sq. cm. respectively, covering all 2 million sq. km. of Mexico. Benchmark poverty estimates come from the 2014 MCS-ENIGH combined with the 2015 Intercensus and are used to estimate poverty rates for 2,456 Mexican municipalities. CNNs are trained using the 896 municipalities in the 2014 MCS-ENIGH. We experiment with several architectures (GoogleNet, VGG) and use GoogleNet as a final architecture where weights are fine-tuned from ImageNet. We find that 1) the best models, which incorporate satellite-estimated land use as a predictor, explain approximately 57% of the variation in poverty in a validation sample of 10 percent of MCS-ENIGH municipalities; 2) Across all MCS-ENIGH municipalities explanatory power reduces to 44% in a CNN prediction and landcover model; 3) Predicted poverty from the CNN predictions alone explains 47% of the variation in poverty in the validation sample, and 37% over all MCS-ENIGH municipalities; 4) In urban areas we see slight improvements from using Digital Globe versus Planet imagery, which explain 61% and 54% of poverty variation respectively. We conclude that CNNs can be trained end-to-end on satellite imagery to estimate poverty, although there is much work to be done to understand how the training process influences out of sample validation.

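Research Motivation and Goals

Develop a scalable, cost-effective method for producing high-resolution poverty maps from satellite imagery and deep learning.
Assess whether convolutional neural networks can estimate poverty directly from satellite images, without relying on external socioeconomic indicators.
Compare how different satellite data sources (Planet, 3–5 m resolution; Digital Globe, 50 cm resolution) perform in predicting poverty.
Assess the effect of adding land-use classifications extracted from satellite imagery as extra features in the poverty prediction model.
Study how well the model generalizes beyond the training municipalities, in particular to non-MCS-ENIGH areas.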

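Proposed Method

Train deep convolutional neural networks (CNNs) on medium-resolution (3–5 m) Planet and high-resolution (50 cm) Digital Globe satellite imagery covering all 2 million sq. km. of Mexico.
Apply transfer learning by fine-tuning ImageNet-pretrained weights on the GoogleNet architecture; the near-infrared band is excluded because of domain shift.
Use land-use classifications extracted from Planet imagery as an auxiliary input to improve model performance.
Train the models on the 896 municipalities in the 2014 MCS-ENIGH survey, validate on a 10% held-out sample, and also run a full-sample evaluation over all 2,456 municipalities.
Measure model performance with the R² between predicted poverty rates and benchmark poverty rates derived from the 2014 MCS-ENIGH survey combined with the 2015 Intercensus.
Compare several network architectures (GoogleNet and VGG variants) and data modalities (RGB with and without the near-infrared band), and select the best configuration by performance on an internal development set.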
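The evaluation protocol described above (hold out 10% of the 896 MCS-ENIGH municipalities, then score predictions with R²) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `split_holdout` and `r_squared`, the random seed, and the use of integer ids for municipalities are all assumptions, and R² is computed here as 1 − SS_res / SS_tot, which may differ from the exact definition used in the paper.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_holdout(ids, holdout_frac=0.10, seed=0):
    """Shuffle municipality ids and hold out a fraction for validation."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * holdout_frac)
    # First n_val shuffled ids form the validation set, the rest are for training.
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical integer ids standing in for the 896 MCS-ENIGH municipalities.
train_ids, val_ids = split_holdout(range(896))
```

Splitting at the municipality level, rather than at the image-tile level, keeps every tile from a held-out municipality out of training, which is what a municipality-level validation R² presupposes.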

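Experimental Results

Research Questions

RQ1: Can convolutional neural networks trained end-to-end on satellite imagery accurately predict municipality-level poverty rates in Mexico?
RQ2: How does adding land-use classifications extracted from satellite imagery affect the predictive performance of CNN-based poverty models?
RQ3: Do the resolution and coverage of the satellite imagery (e.g., Planet vs. Digital Globe) significantly affect the accuracy of poverty estimates?
RQ4: Why does performance drop sharply when the model is applied to municipalities outside the training set (non-MCS-ENIGH areas)?
RQ5: How well do the CNNs generalize across urban and rural areas, and how does performance differ between the two?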

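Key Findings

The best-performing model combines CNN predictions with land-use classifications from Planet imagery and explains 57% of the variation in poverty rates in the 10% validation sample of MCS-ENIGH municipalities.
When evaluated across all 896 MCS-ENIGH municipalities, the explanatory power of this model falls to 44%, a marked drop relative to the validation sample.
CNN predictions alone explain 47% of the variation in poverty rates in the 10% validation sample and 37% across all MCS-ENIGH municipalities.
In urban areas, Digital Globe imagery outperforms Planet imagery (R² = 0.61 vs. R² = 0.54), suggesting that higher resolution helps urban poverty estimation.
In non-MCS-ENIGH municipalities performance is considerably lower, with the overall R² falling to 0.28, which indicates weak out-of-sample generalization.
Including the near-infrared band during training did not improve performance, so it was excluded, possibly because of a domain shift between the three-channel RGB images of ImageNet and the actual satellite data.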
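To make the "CNN prediction plus land cover" comparison concrete, the sketch below fits two ordinary least squares models on synthetic data: one with the CNN prediction alone and one that adds a single land-use covariate. Everything here is an assumption for illustration: the data are randomly generated and are not from the paper, `urban_share` is a hypothetical land-use feature, and the source does not state which regression specification the authors used.

```python
import random


def ols_fit(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r2(X, y, beta):
    """In-sample R^2 of the linear fit X @ beta."""
    pred = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (NOT data from the paper).
rng = random.Random(0)
n = 200
cnn_pred = [rng.random() for _ in range(n)]      # hypothetical CNN poverty score
urban_share = [rng.random() for _ in range(n)]   # hypothetical land-use feature
poverty = [0.4 + 0.5 * c - 0.3 * u + rng.gauss(0, 0.05)
           for c, u in zip(cnn_pred, urban_share)]

X_cnn = [[1.0, c] for c in cnn_pred]                          # intercept + CNN
X_both = [[1.0, c, u] for c, u in zip(cnn_pred, urban_share)]  # + land use
r2_cnn = r2(X_cnn, poverty, ols_fit(X_cnn, poverty))
r2_both = r2(X_both, poverty, ols_fit(X_both, poverty))
```

In-sample, adding a covariate to an OLS model with an intercept cannot lower R², so `r2_both >= r2_cnn` holds by construction. That is why the gains reported above are measured on a held-out validation sample rather than on the training municipalities.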

This review was generated by AI and reviewed by a human editor.