QUICK REVIEW

[论文解读] Inferring Gender from Names on the Web: A Comparative Evaluation of Gender Detection Methods

Fariba Karimi, Claudia Wagner|arXiv (Cornell University)|Mar 14, 2016

Authorship Attribution and Profiling参考文献 8被引用 85

一句话总结

本文在人工标注的科学家数据集上评估并比较了基于姓名与基于图像的性别检测方法，提出一种结合姓名推理（如 Genderize）与面部识别（Face++）的混合方法，以提升准确率并减少国家特定偏差。该混合方法达到92%的准确率，显著优于单一方法，尤其在中文和韩文等代表性不足国家中表现更优。

ABSTRACT

Computational social scientists often harness the Web as a "societal observatory" where data about human social behavior is collected. This data enables novel investigations of psychological, anthropological and sociological research questions. However, in the absence of demographic information, such as gender, many relevant research questions cannot be addressed. To tackle this problem, researchers often rely on automated methods to infer gender from name information provided on the web. However, little is known about the accuracy of existing gender-detection methods and how biased they are against certain sub-populations. In this paper, we address this question by systematically comparing several gender detection methods on a random sample of scientists for whom we know their full name, their gender and the country of their workplace. We further suggest a novel method that employs web-based image retrieval and gender recognition in facial images in order to augment name-based approaches. Our findings show that the performance of name-based gender detection approaches can be biased towards countries of origin and such biases can be reduced by combining name-based an image-based gender detection methods.

研究动机与目标

评估现有基于姓名的性别检测方法在不同国籍背景下的准确率与偏差。
探究基于图像的面部分析性别识别是否能提升检测性能，特别是对代表性不足的人群。
开发并评估一种结合基于姓名与基于图像检测的混合方法，以减少国家特定的性能差异。
为计算社会科学领域研究人员提供性别推断工具可靠性的基准参考。
揭示基于姓名的方法在新兴国家中的局限性，并倡导采用多模态方法。

提出的方法

使用人工整理的数据集，包含1,416位科学家，其性别、全名及居住国信息均来自学术简历与机构官网并经核实。
评估五种基于姓名的方法：SSA、IPUMS、Sexmachine、Genderize，以及作为基线的基于图像的方法Face++。
提出两种混合方法：Mixed1（顺序处理：先使用Genderize，再对未分类姓名使用Face++）与Mixed2（基于置信度分数的加权平均）。
通过网络图像检索获取科学家的面部图像，利用搜索引擎收集后，再使用Face++进行性别预测。
采用精确率、召回率、F1值与准确率评估性能，并按国籍进行分层分析。
在Mixed2中采用置信度加权融合策略，以更有效地处理模糊姓名，优于二元决策规则。

实验结果

研究问题

RQ1传统基于姓名的性别检测方法在不同国籍背景下的准确率如何？
RQ2基于图像的面部识别在多大程度上能提升性别检测准确率，特别是对代表性不足的民族？
RQ3结合基于姓名与基于图像的方法能否减少性别推断中的国家特定偏差？
RQ4哪种组合策略——顺序处理或置信度加权融合——能获得更优的性能与鲁棒性？
RQ5精确率、召回率、F1值与准确率等性能指标在性别与民族子群体间如何变化？

主要发现

混合方法（Mixed1）达到92%的整体准确率，优于所有单一方法至少8个百分点。
对于中国，表现最好的基于姓名方法（Sexmachine）准确率为67%，基于图像的Face++为65%，而Mixed1提升至50%——虽仍偏低，但优于大多数基于姓名的工具。
在韩国，基于姓名的方法完全失效（最低准确率4%），而基于图像的Face++达到74%准确率，Mixed1则达到37%，显著优于仅使用姓名的方法。
对于英国、德国与意大利等国，基于姓名的方法表现良好（准确率90%以上），但Mixed1仍使性能提升2至4个百分点。
Mixed2方法在男性与女性类别中均达到93%的F1值，表明在性别间表现均衡。
研究发现，基于姓名的方法表现出强烈的国家特定偏差，准确率在非西方国家，特别是巴西与印度等新兴经济体中显著下降。

更好的研究，从现在开始

从论文设计到论文写作，大幅缩短您的研究时间。

无需绑定信用卡

本解读由 AI 生成，并经人工编辑审核。