QUICK REVIEW

[Paper Review] Assessing Dataset Bias in Computer Vision

Athiya Deviyani|arXiv (Cornell University)|Jan 1, 2021

Domain Adaptation and Few-Shot Learning | Cited by 6

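One-Sentence Summary

This study examines how well data augmentation techniques mitigate bias in computer vision datasets, focusing on the uneven distribution of gender, age, and ethnicity in the UTKFace dataset. It evaluates undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs), and finds that StarGAN-based augmentation performs best on the UTKFace test set (91.75% accuracy for gender), with consistent accuracy across classes and stronger generalization to external datasets.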

ABSTRACT

A biased dataset is a dataset that generally has attributes with an uneven class distribution. These biases have the tendency to propagate to the models that train on them, often leading to a poor performance in the minority class. In this project, we will explore the extent to which various data augmentation methods alleviate intrinsic biases within the dataset. We will apply several augmentation techniques on a sample of the UTKFace dataset, such as undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs). We then trained a classifier for each of the augmented datasets and evaluated their performance on the native test set and on external facial recognition datasets. We have also compared their performance to the state-of-the-art attribute classifier trained on the FairFace dataset. Through experimentation, we were able to find that training the model on StarGAN-generated images led to the best overall performance. We also found that training on geometrically transformed images led to a similar performance with a much quicker training time. Additionally, the best performing models also exhibit a uniform performance across the classes within each attribute. This signifies that the model was also able to mitigate the biases present in the baseline model that was trained on the original training set. Finally, we were able to show that our model has a better overall performance and consistency on age and ethnicity classification on multiple datasets when compared with the FairFace model. Our final model has an accuracy on the UTKFace test set of 91.75%, 91.30%, and 87.20% for the gender, age, and ethnicity attributes, respectively, with a standard deviation of less than 0.1 between the accuracies of the classes of each attribute.

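Research Motivation and Objectives

Address dataset bias in computer vision caused by uneven class distributions in popular datasets such as UTKFace.
Assess whether data augmentation can reduce model bias and improve performance on minority classes.
Compare the effectiveness of undersampling, geometric transformations, VAEs, and GANs at mitigating bias.
Evaluate how well the models generalize to external facial recognition datasets (LFWA+ and CelebA).
Benchmark performance against the state-of-the-art FairFace attribute classifier.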

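Proposed Method

Applies four data augmentation techniques: undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs), including StarGAN.
Uses the ResNet-18 architecture for every classifier so that evaluation is consistent across the augmented datasets.
Trains models on the augmented UTKFace samples and evaluates them on the original UTKFace test set and on external datasets.
Reports standard metrics: accuracy, the standard deviation of per-class accuracies, and cross-dataset generalization performance.
Compares against the state-of-the-art model trained on the FairFace dataset to assess relative performance.
Implements the pipeline in PyTorch and uses the consistency of per-class accuracies as the measure of model robustness.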
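Of the four techniques listed above, undersampling is the simplest to illustrate. The following is a minimal sketch, assuming random undersampling down to the size of the rarest class; the summary does not state which undersampling variant the paper used. The function name `undersample`, the `label_of` callback, and the toy file names are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def undersample(samples, label_of, seed=0):
    """Randomly drop samples so every class keeps as many items as the rarest class.

    Assumption: plain random undersampling; not necessarily the paper's variant.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample in samples:
        by_class[label_of(sample)].append(sample)

    # Every class is cut down to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    return balanced


# Toy records: (hypothetical file name, class label), 8 of class "a" and 3 of class "b".
toy = [(f"img_{i}.jpg", "a") for i in range(8)]
toy += [(f"img_{i}.jpg", "b") for i in range(8, 11)]

balanced = undersample(toy, label_of=lambda record: record[1])
```

The trade-off is visible even in the toy data: balancing is achieved by discarding five of the eight majority-class records, which is why generative or geometric augmentation, which adds minority samples instead, is evaluated as an alternative.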

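Experimental Results

Research Questions

RQ1: How do different data augmentation techniques affect model performance on the native UTKFace test set?
RQ2: How well do models trained on augmented data generalize to external facial recognition datasets such as LFWA+ and CelebA?
RQ3: How does the best-performing model compare with the state-of-the-art FairFace model in accuracy and bias mitigation?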

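Key Findings

Training on StarGAN-generated images gives the highest overall accuracy on the UTKFace test set: 91.75% for gender, 91.30% for age, and 87.20% for ethnicity classification.
For the model trained on StarGAN-generated data, the standard deviation of per-class accuracies is below 0.1 for every attribute, indicating uniform performance and effective bias mitigation.
Geometric transformations perform comparably to StarGAN with a significantly shorter training time, making them a practical alternative.
The best-performing model generalizes across datasets better than the FairFace model on LFWA+ and CelebA, with better overall performance and consistency on age and ethnicity classification.
The results indicate that GAN-based generative augmentation is more effective than undersampling or VAEs at reducing bias and improving fairness.
The results show that data augmentation can effectively reduce performance disparities across minority classes in facial attribute classification.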
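The uniformity claim in the findings above rests on the standard deviation of per-class accuracies. The following is a minimal sketch of that metric under stated assumptions: it uses the population standard deviation and accuracies expressed as fractions between 0 and 1, neither of which is confirmed by the summary. The names `per_class_accuracy` and `accuracy_spread` and the toy labels are hypothetical.

```python
from collections import defaultdict
from statistics import pstdev


def per_class_accuracy(y_true, y_pred):
    """Return {class label: fraction of that class's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {label: correct[label] / total[label] for label in total}


def accuracy_spread(y_true, y_pred):
    """Population standard deviation of the per-class accuracies.

    Assumption: population (not sample) standard deviation on a 0-1 scale.
    """
    return pstdev(list(per_class_accuracy(y_true, y_pred).values()))


# Toy labels: class "x" has 3 of 4 correct, class "y" has 1 of 2 correct.
y_true = ["x", "x", "x", "x", "y", "y"]
y_pred = ["x", "x", "x", "y", "y", "x"]

accuracies = per_class_accuracy(y_true, y_pred)
spread = accuracy_spread(y_true, y_pred)
```

A spread near zero means every class is classified about equally well, whereas a high overall accuracy alone can hide a poorly served minority class.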


This review was generated by AI and reviewed by human editors.