QUICK REVIEW

[Paper Review] Using word embeddings to analyse audience effects and individual differences in parenting Subreddits

Melody Sepahpour‐Fard, Michael Quayle|arXiv (Cornell University)|Jan 1, 2023
Social Media and Politics | Cited by 1
One-sentence summary

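This study uses user-augmented word embeddings to examine how gender and audience context (single-gender versus mixed-gender Subreddits) shape language use in Reddit parenting communities. Mothers and fathers converge in topic diversity on r/Parenting but diverge in single-gender spaces: mothers focus on health, sleep, and feeding, while fathers attend more to appearance and education. High self-monitors adapt more closely to community norms in the mixed-gender setting.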

ABSTRACT

Human beings adapt their language to the audience they interact with. To study the impact of audience and gender in a natural setting, we choose a domain where gender plays a particularly salient role: parenting. We collect posts from the three popular parenting Subreddits (i.e., topical communities on Reddit) r/Daddit, r/Mommit, and r/Parenting. These three Subreddits gather different audiences, respectively, self-identifying as fathers and mothers (ostensibly single-gender), and parents (explicitly mixed-gender). By selecting a sample of users who have published on both a single-gender and a mixed-gender Subreddit, we are able to explore both audience and gender effects. We analyse posts with word embeddings by adding the username as a token in the corpus. This way, we are able to compare user-tokens to word-tokens and measure their similarity. We also investigate individual differences in this context by comparing users who exhibit significant changes in their behaviour (high self-monitors) with those who show less variation (low self-monitors). Results show that r/Parenting users generally discuss a great diversity of topics while fathers focus more on advising others on educational and family matters. Mothers in r/Mommit distinguish themselves from other groups by primarily discussing topics such as medical care, sleep and potty training, and food. Both mothers and fathers celebrate parenting events and describe or comment on the physical appearance of their children with a single-gender audience. In terms of individual differences, we find that, especially on r/Parenting, high self-monitors tend to conform more to the norms of the Subreddit by discussing more of the topics associated with the Subreddit. In conclusion, this study shows how mothers and fathers express different concerns and change their behaviour for different group-based audiences.

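Research motivation and objectives

  • Examine how audience composition (single-gender versus mixed-gender) affects language use in online parenting communities.
  • Examine the role of gender identity (mother, father, parent) in shaping topic choice and linguistic expression.
  • Analyse individual differences in self-monitoring behaviour and how they affect language adaptation across contexts.
  • Validate a novel user-augmented word embedding method for studying identity performance and audience effects at scale.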

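Proposed method

  • Collect Reddit posts from three parenting Subreddits: r/Mommit (self-identified mothers), r/Daddit (self-identified fathers), and r/Parenting (mixed-gender).
  • Select users who have posted in both a single-gender and the mixed-gender Subreddit, so that each individual can be compared across audience contexts.
  • Augment the text corpus by treating each username as a unique token, and train word embeddings with a skip-gram model with negative sampling.
  • Use the user embeddings to compare linguistic behaviour between users, and measure the similarity between user-tokens and word-tokens.
  • Cluster topics in the embedding space with LDA, and validate them through keyword analysis and comparison with the existing literature.
  • Classify users as high or low self-monitors according to how consistent their language is across contexts, and analyse the differences in their topic distributions.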
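
The user-token idea in the steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy posts, the `<user@Subreddit>` token format, and the choice to scope each user token to a Subreddit are assumptions made for illustration, and raw co-occurrence counts stand in for the skip-gram model with negative sampling so that the example runs with the standard library only.

```python
import math
from collections import Counter, defaultdict


def augment(posts):
    """Prepend a user token to each tokenised post.

    Assumption: the token is scoped to the Subreddit (hypothetical format
    "<user@Subreddit>") so the same user can be compared across audiences.
    """
    docs = []
    for user, subreddit, text in posts:
        user_token = f"<{user}@{subreddit}>"
        docs.append([user_token] + text.lower().split())
    return docs


def cooccurrence_vectors(docs):
    """Count-based stand-in for skip-gram: a token's vector is the bag of
    tokens that appear in the same post (the whole post is the window)."""
    vectors = defaultdict(Counter)
    for doc in docs:
        for i, token in enumerate(doc):
            for j, context in enumerate(doc):
                if i != j:
                    vectors[token][context] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data, invented for this example only.
posts = [
    ("alice", "Mommit", "sleep training nap sleep"),
    ("alice", "Parenting", "school homework advice"),
    ("bob", "Daddit", "school advice homework"),
]

vectors = cooccurrence_vectors(augment(posts))
alice_mommit = vectors["<alice@Mommit>"]
sim_sleep = cosine(alice_mommit, vectors["sleep"])
sim_school = cosine(alice_mommit, vectors["school"])
```

Because user tokens and word tokens live in the same vector space, `sim_sleep` and `sim_school` can be compared directly: the user token sits closer to the words it co-occurs with. In a realistic setting a trained embedding model (for example gensim's `Word2Vec` with `sg=1` and a positive `negative` value) would take the place of `cooccurrence_vectors`.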

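Experimental results

Research questions

  • RQ1: How does the language use of mothers and fathers differ between single-gender and mixed-gender online parenting communities?
  • RQ2: To what extent do individuals adjust their topic choice and language style according to audience composition?
  • RQ3: How do individual differences in self-monitoring (high vs. low) affect language adaptation across audience contexts?
  • RQ4: Which topics do mothers and fathers focus on most in the gendered Subreddits, and how do these topics change in the mixed-gender setting?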

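Key findings

  • In r/Mommit, mothers mainly discuss medical care, sleep, potty training, and food, indicating a focus on children's health and day-to-day care.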
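  • In r/Daddit, fathers place more emphasis on topics related to their children's physical appearance and, in the single-gender context, share photos more often than mothers do.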
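  • In r/Parenting, mothers and fathers converge in topic diversity and discuss a broader range of parenting issues, indicating audience-driven convergence in language use.
  • In r/Parenting, high self-monitors align more strongly with the Subreddit's dominant topics, indicating greater conformity to community norms.
  • Adding usernames as tokens to the word embeddings captured user-level language patterns and made meaningful comparisons between users and words possible.
  • Despite methodological limitations, the embedding space shows coherent topic clusters consistent with prior research, supporting the validity of the method.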
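
The high/low self-monitor distinction used in these findings can be sketched as follows. This is a hedged illustration only: the summary does not state the exact criterion, so the sketch assumes that behavioural change is measured as the cosine distance between a user's single-gender and mixed-gender embeddings and that users are split at the median. The function name `split_self_monitors` and the toy vectors are hypothetical.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def split_self_monitors(user_vectors):
    """Label each user "high" or "low" by how far their embedding moves
    between audiences.

    user_vectors maps a user to (single_gender_vec, mixed_gender_vec).
    Assumption: shift = 1 - cosine similarity, with a median split.
    """
    shift = {
        user: 1.0 - cosine(single, mixed)
        for user, (single, mixed) in user_vectors.items()
    }
    cutoff = median(shift.values())
    return {user: ("high" if s > cutoff else "low") for user, s in shift.items()}


# Toy two-dimensional embeddings, invented for this example only.
user_vectors = {
    "u1": ([1.0, 0.0], [1.0, 0.0]),   # identical across audiences
    "u2": ([1.0, 0.0], [0.0, 1.0]),   # orthogonal across audiences
    "u3": ([1.0, 0.0], [1.0, 1.0]),   # moderate change
    "u4": ([1.0, 1.0], [1.0, -1.0]),  # orthogonal across audiences
}

labels = split_self_monitors(user_vectors)
```

A median split keeps the two groups balanced in size, but it is only one possible threshold; a study could equally use a significance test on the change, which is closer to the abstract's wording of "significant changes in their behaviour".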


This review was generated by AI and reviewed by human editors.