QUICK REVIEW

[Paper Review] No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World

Shreya Shankar, Yoni Halpern|arXiv (Cornell University)|Nov 22, 2017

Advanced Image and Video Retrieval Techniques | References: 5 | Cited by: 175

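One-Sentence Summary

This paper analyzes the geo-diversity of ImageNet and Open Images, revealing an amerocentric and eurocentric bias and its effect on classifier performance across regions. It argues for geographically representative data sets for applications in the developing world.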

ABSTRACT

Modern machine learning systems such as image classifiers rely heavily on large scale data sets for training. Such data sets are costly to create, thus in practice a small number of freely available, open source data sets are widely used. We suggest that examining the geo-diversity of open data sets is critical before adopting a data set for use cases in the developing world. We analyze two large, publicly available image data sets to assess geo-diversity and find that these data sets appear to exhibit an observable amerocentric and eurocentric representation bias. Further, we analyze classifiers trained on these data sets to assess the impact of these training distributions and find strong differences in the relative performance on images from different locales. These results emphasize the need to ensure geo-representation when constructing data sets for use in the developing world.

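Research Motivation and Objectives

Assess the geo-diversity of two large open image data sets (ImageNet and Open Images).
Assess how training on these data sets affects classifier performance on images from different geographic locations.
Show that widely used data sets exhibit an amerocentric and eurocentric representation bias.
Discuss the implications for constructing data sets for the developing world.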

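Proposed Method

Use country-level geolocation proxies to estimate the geographic distribution of ImageNet and Open Images.
Analyze the distribution of images across countries and identify representation imbalances.
Evaluate pretrained Inception V3 models trained on the two data sets to compare their performance on geographically localized images.
Collect stress-test data through crowdsourcing and geolocated web images to assess classifier behavior across regions.
Use saliency maps (SmoothGrad) to examine which image regions drive misclassifications.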
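
The first two steps above (estimating a per-country distribution and spotting imbalance) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each image has already been assigned a country code by some geolocation proxy, and the function names `country_shares` and `top_k_share` as well as the toy data are hypothetical.

```python
from collections import Counter


def country_shares(country_codes):
    """Return (country, share) pairs sorted by descending share.

    `country_codes` holds one entry per image; None marks an image
    whose country could not be inferred, and such images are skipped.
    """
    counts = Counter(code for code in country_codes if code is not None)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(country, n / total) for country, n in counts.most_common()]


def top_k_share(shares, k):
    """Fraction of geolocated images that come from the k most-represented countries."""
    return sum(share for _, share in shares[:k])


# Made-up toy data for illustration only; these are NOT figures from the paper.
sample = ["US"] * 6 + ["GB"] * 2 + ["IN", "CN", None]
shares = country_shares(sample)
print(shares)
print(top_k_share(shares, 2))
```

A top-k share close to 1 for a small k is one simple signal that a handful of countries dominate the data set.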

Experimental Results

Research Questions

RQ1: Do ImageNet and Open Images exhibit geo-representation biases across countries?
RQ2: How does geographic bias in training data affect classifier performance on non-US images?
RQ3: Are misclassifications influenced more by attire or context in region-specific images?
RQ4: Do classifiers trained on these datasets perform consistently across different geographic locales?

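Key Findings

Open Images and ImageNet show a marked skew toward the United States and Europe, while China and India are underrepresented.
A large share of the samples comes from the six most-represented countries, all in North America and Europe.
Classifiers trained on these data sets misclassify region-specific images more often and assign lower confidence to non-US images.
Saliency maps indicate that, in some misclassifications, the model relies on facial regions rather than attire.
Stress-test images from Hyderabad tend to receive lower probabilities under both models, indicating a regional performance gap.
Performance differences between countries indicate that representation is not uniform across image classes.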
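
The locale comparison behind these findings can be sketched as a simple group-by over classifier outputs. This is an illustrative sketch under stated assumptions, not the paper's evaluation code: it assumes each record is a (locale, probability the model assigned to the correct label) pair, and the helper names and toy values are hypothetical.

```python
from collections import defaultdict


def mean_confidence_by_locale(records):
    """Average the correct-label probability separately for each locale."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for locale, prob in records:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
        totals[locale] += prob
        counts[locale] += 1
    return {locale: totals[locale] / counts[locale] for locale in totals}


def gaps_vs_reference(means, reference):
    """Difference in mean confidence between each locale and a reference locale."""
    base = means[reference]
    return {loc: m - base for loc, m in means.items() if loc != reference}


# Made-up toy values for illustration only; these are NOT results from the paper.
toy_records = [("US", 0.9), ("US", 0.7), ("Hyderabad", 0.5), ("Hyderabad", 0.3)]
means = mean_confidence_by_locale(toy_records)
gaps = gaps_vs_reference(means, "US")
print(means)
print(gaps)
```

A negative gap for a locale means the model is, on average, less confident on correctly labeled images from that locale than on images from the reference locale.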


This review was generated by AI and checked by a human editor.