QUICK REVIEW

[Paper Review] Exploring Artist Gender Bias in Music Recommendation

Dougal Shakespeare, Lorenzo Porcaro|arXiv (Cornell University)|Sep 3, 2020

Recommender Systems and Techniques | References: 39 | Cited by: 24

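One-Sentence Summary

This study uses collaborative filtering (CF) to investigate gender bias in music recommender systems on two Last.fm datasets, LFM-1b and LFM-360k, and finds that CF algorithms amplify the gender bias already present in user preferences; the model-based NMF shows lower bias disparity than the memory-based UserKNNAvg, suggesting that algorithm design has a substantial effect on gender representation in recommendations.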

ABSTRACT

Music Recommender Systems (mRS) are designed to give personalised and meaningful recommendations of items (i.e. songs, playlists or artists) to a user base, thereby reflecting and further complementing individual users' specific music preferences. Whilst accuracy metrics have been widely applied to evaluate recommendations in mRS literature, evaluating a user's item utility from other impact-oriented perspectives, including their potential for discrimination, is still a novel evaluation practice in the music domain. In this work, we center our attention on a specific phenomenon for which we want to estimate if mRS may exacerbate its impact: gender bias. Our work presents an exploratory study, analyzing the extent to which commonly deployed state of the art Collaborative Filtering (CF) algorithms may act to further increase or decrease artist gender bias. To assess group biases introduced by CF, we deploy a recently proposed metric of bias disparity on two listening event datasets: the LFM-1b dataset, and the earlier constructed Celma's dataset. Our work traces the causes of disparity to variations in input gender distributions and user-item preferences, highlighting the effect such configurations can have on user's gender bias after recommendation generation.

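Research Motivation and Objectives

Investigate whether collaborative filtering (CF) algorithms in music recommender systems (mRS) amplify the gender bias already present in user preferences.
Evaluate bias disparity in mRS using two large-scale Last.fm listening-event datasets (LFM-1b and LFM-360k).
Compare how different CF algorithms, the memory-based UserKNNAvg and the model-based NMF, affect the propagation of gender bias.
Examine whether extreme user preferences (for example, a strong preference for female artists) lead to amplified bias in recommendations.
Highlight the socio-technical impact of algorithmic bias in music recommendation, particularly on the under-representation of female artists.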

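Proposed Method

Apply a recently proposed bias disparity metric to quantify how far CF algorithms amplify or reduce gender bias in recommendations.
Use two publicly available Last.fm datasets, LFM-1b (1 billion interactions) and LFM-360k (360,000 users; called "Celma's dataset" in the abstract), to evaluate bias under different user preference distributions.
Classify artists and users into binary gender categories (male/female) to measure gender-based bias in the recommendation results.
Run two experiments, one with balanced user preferences and one with extreme preferences for female artists, to test for bias amplification.
Evaluate the memory-based (UserKNNAvg) and model-based (NMF) CF algorithms under identical conditions.
Measure bias disparity as the difference in bias between the input user preferences and the output recommendations, as a way to assess algorithmic fairness.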
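
The metric described in the steps above can be sketched in a few lines. This sketch assumes the "recently proposed" metric is the bias disparity of Tsintzou et al.: a user group's preference ratio for an item category is divided by that category's share of the catalogue to give a bias, and bias disparity is the relative change in that bias from input data to recommendations. The function names, the `(user, item)` pair format, and the toy data are illustrative assumptions, not the authors' code or results.

```python
from collections import Counter


def bias(interactions, user_group, item_category):
    """Bias B(G, C) = PR(G, C) / P(C) for every user group G and item category C.

    interactions  : list of (user_id, item_id) pairs (listening events or recommendations)
    user_group    : dict mapping user_id -> group label (assumed format)
    item_category : dict mapping item_id -> category label (assumed format)
    """
    # P(C): share of the catalogue that belongs to each category.
    category_size = Counter(item_category.values())
    n_items = len(item_category)

    pair_count = Counter()   # interactions per (group, category)
    group_total = Counter()  # interactions per group
    for user, item in interactions:
        group = user_group[user]
        pair_count[(group, item_category[item])] += 1
        group_total[group] += 1

    return {
        (g, c): (pair_count[(g, c)] / group_total[g]) / (category_size[c] / n_items)
        for g in group_total
        for c in category_size
    }


def bias_disparity(inputs, recommendations, user_group, item_category):
    """Relative change in bias from input preferences to recommendations."""
    b_in = bias(inputs, user_group, item_category)
    b_out = bias(recommendations, user_group, item_category)
    # Skip pairs with zero input bias: the relative change is undefined there.
    return {
        key: (b_out[key] - b_in[key]) / b_in[key]
        for key in b_in
        if b_in[key] > 0 and key in b_out
    }


# Toy data, invented purely to make the arithmetic easy to follow.
item_category = {"a1": "male", "a2": "male", "a3": "female", "a4": "female"}
user_group = {"u1": "M", "u2": "F"}
inputs = [("u1", "a1"), ("u1", "a2"), ("u1", "a3"),
          ("u2", "a1"), ("u2", "a3"), ("u2", "a4")]
recommendations = [("u1", "a1"), ("u1", "a2"),
                   ("u2", "a1"), ("u2", "a3")]

disparity = bias_disparity(inputs, recommendations, user_group, item_category)
for key, value in sorted(disparity.items()):
    print(key, round(value, 3))
```

In this toy example the male-artist entries come out positive and the female-artist entries negative for both user groups, meaning the recommendations lean further toward male artists than the inputs did. A value of 0 would mean the recommender preserved the input bias exactly.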

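Experimental Results

Research Questions

RQ1: To what extent do collaborative filtering algorithms in music recommender systems amplify the gender bias already present in user preferences?
RQ2: How does the choice of CF algorithm (memory-based vs. model-based) affect the propagation of gender bias in recommendations?
RQ3: Do extreme user preferences for female artists increase bias disparity in recommendations? If so, how is that disparity distributed across algorithms?
RQ4: How do the bias disparity results differ between LFM-1b and LFM-360k, and what does this imply about data scale and distribution?
RQ5: Can recommender systems be designed to reduce bias amplification, and which algorithmic approach shows the most promise for minimizing gender disparity?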

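Key Findings

CF algorithms amplify pre-existing gender bias in music recommendation, with bias disparity increasing markedly on both LFM-1b and LFM-360k.
In Experiment 1 (balanced preferences), NMF shows the smallest absolute increase in bias disparity, while UserKNNAvg amplifies bias the most, particularly on LFM-1b.
In Experiment 2 (extreme preference for female artists), bias disparity becomes positive for female artists and negative for male artists, indicating that bias propagation is not one-directional.
NMF outperforms UserKNNAvg on coverage and recommendation diversity, indicating that its recommendations are less concentrated by gender and more diverse.
LFM-1b shows stronger amplification of bias disparity than LFM-360k, suggesting that larger-scale data may exhibit stronger bias propagation.
No evidence was found that the recommendation process itself creates new bias; rather, the algorithms mainly amplify bias already present in the input data.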
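
The opposite-sign pattern reported for Experiment 2 can be checked with a little arithmetic. Assuming bias disparity is the relative change in bias (preference ratio divided by the category's catalogue share), the catalogue share cancels out, so only the input and output preference ratios matter. With just two artist categories those ratios sum to 1, so a gain for one category is necessarily a loss for the other. The ratios below are hypothetical and are not taken from the paper.

```python
def disparity_from_ratios(pr_in, pr_out):
    """Bias disparity per category from preference ratios alone.

    Since bias = ratio / P(C) and P(C) is the same before and after
    recommendation, (B_out - B_in) / B_in reduces to the relative change
    in the preference ratio.
    """
    return {c: (pr_out[c] - pr_in[c]) / pr_in[c] for c in pr_in}


# Hypothetical user group that strongly prefers female artists.
pr_in = {"female": 0.8, "male": 0.2}    # share of input listening events
pr_out = {"female": 0.9, "male": 0.1}   # share of recommended items

result = disparity_from_ratios(pr_in, pr_out)
print(result)
```

Here the already dominant female share grows by 12.5% while the male share is halved, which gives a positive disparity for female artists and a negative one for male artists.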

This review was generated by AI and reviewed by human editors.