QUICK REVIEW

[Paper Review] Investigating Affective Use and Emotional Well-being on ChatGPT

Jason Phang, Matthias Lampe | arXiv.org | Apr 4, 2025

Digital Mental Health Interventions | Citations: 7

One-Line Summary

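Two parallel studies, an on-platform analysis of over 3 million conversations and an IRB-approved RCT with close to 1,000 participants, examine how affective use of ChatGPT relates to emotional well-being. Very high usage correlates with self-reported dependence, and the effect of voice interaction depends on the user's initial emotional state and total usage duration.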

ABSTRACT

As AI chatbots see increased adoption and integration into everyday life, questions have been raised about the potential impact of human-like or anthropomorphic AI on users. In this work, we investigate the extent to which interactions with ChatGPT (with a focus on Advanced Voice Mode) may impact users' emotional well-being, behaviors and experiences through two parallel studies. To study the affective use of AI chatbots, we perform large-scale automated analysis of ChatGPT platform usage in a privacy-preserving manner, analyzing over 3 million conversations for affective cues and surveying over 4,000 users on their perceptions of ChatGPT. To investigate whether there is a relationship between model usage and emotional well-being, we conduct an Institutional Review Board (IRB)-approved randomized controlled trial (RCT) on close to 1,000 participants over 28 days, examining changes in their emotional well-being as they interact with ChatGPT under different experimental settings. In both on-platform data analysis and the RCT, we observe that very high usage correlates with increased self-reported indicators of dependence. From our RCT, we find the impact of voice-based interactions on emotional well-being to be highly nuanced and influenced by factors such as the user's initial emotional state and total usage duration. Overall, our analysis reveals that a small number of users are responsible for a disproportionate share of the most affective cues.

Motivation and Objectives

Assess how interactions with ChatGPT influence four psychosocial outcomes: loneliness, socialization, emotional dependence, and problematic use.
Analyze large-scale on-platform conversations to detect affective cues using automated classifiers while preserving user privacy.
Conduct an IRB-approved randomized controlled trial to study how model configurations affect user well-being over time.
Identify patterns showing that a small subset of users drive affective cues and that voice modalities have nuanced effects on well-being.

Proposed Method

Develop EmoClassifiersV1 (and EmoClassifiersV2) to detect affective cues with a two-tiered structure of top-level and sub-classifiers.
Perform on-platform analysis with power-user vs control-user cohorts on Advanced Voice Mode usage and conduct user surveys (over 4,000 respondents).
Run an IRB-approved randomized controlled trial with 981 completers across nine conditions, crossing three modalities (engaging voice, neutral voice, text) with three daily task types (personal, non-personal, open-ended), over 28 days.
Analyze 31,857 conversations from the RCT for relationships between user-model interactions and self-reported outcomes.
Treat classifiers as descriptive tools: they preserve privacy and correlate with survey responses, but they are not precise per-interaction labels.
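The two-tiered classifier structure described above can be pictured with a small sketch. This is not the authors' implementation: the label names, the keyword rules standing in for model-based classifiers, and the choice to run sub-classifiers only when their parent top-level classifier fires are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A classifier here is any function mapping conversation text to True/False.
# Keyword rules are toy stand-ins (assumption); a real system would likely
# call a model instead.
Classifier = Callable[[str], bool]


def keyword_rule(*keywords: str) -> Classifier:
    """Build a toy classifier that fires when any keyword appears (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return lambda text: any(k in text.lower() for k in lowered)


@dataclass
class TwoTierClassifier:
    # Top-level classifiers run on every conversation.
    top_level: Dict[str, Classifier]
    # Maps a top-level label to sub-classifiers that are gated by it (assumption).
    sub_level: Dict[str, Dict[str, Classifier]] = field(default_factory=dict)

    def classify(self, text: str) -> Dict[str, bool]:
        results: Dict[str, bool] = {}
        for top_name, top_fn in self.top_level.items():
            fired = top_fn(text)
            results[top_name] = fired
            for sub_name, sub_fn in self.sub_level.get(top_name, {}).items():
                # A sub-classifier is recorded as False unless its parent fired.
                results[sub_name] = fired and sub_fn(text)
        return results


# Hypothetical labels, invented for this example.
clf = TwoTierClassifier(
    top_level={"loneliness": keyword_rule("lonely", "alone")},
    sub_level={"loneliness": {"seeking_support": keyword_rule("help me", "talk to")}},
)

affective = clf.classify("I feel lonely and need someone to talk to.")
neutral = clf.classify("Please help me sort this list in Python.")
```

In this sketch the second message contains "help me" but is still labeled False for `seeking_support`, because the parent `loneliness` classifier did not fire. Gating of this kind keeps sub-labels from triggering on task-oriented conversations.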

Experimental Results

Research Questions

RQ1: Do engaging voice-based chatbot interactions differ from text or neutral voice in affecting loneliness, socialization, emotional dependence, and problematic use?
RQ2: Do personal conversation prompts lead to different well-being outcomes than non-personal or open-ended prompts when using ChatGPT?
RQ3: How do usage duration and initial emotional state modulate the impact of ChatGPT on well-being (from the RCT)?

Key Findings

Very high usage (top decile) is associated with increased self-reported emotional dependence and lower perceived socialization.
A small subset of power users contributes disproportionately to affective cues in conversations.
In the RCT, voice model use tended to relate to better emotional well-being when controlling for usage duration, but longer usage and higher initial loneliness predicted worse outcomes.
Automated affective classifiers generally align with self-reported survey responses, and the on-platform and RCT analyses complement each other methodologically.
Most conversations are neutral or task-oriented, but a tail of users exhibits frequent affective cues in their chats.
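The claim that a small subset of users contributes a disproportionate share of affective cues can be made concrete with a simple concentration measure. The following is a minimal sketch, not the paper's analysis: the function name `top_share` and the per-user counts are made up for illustration.

```python
import math
from typing import Sequence


def top_share(cue_counts: Sequence[int], fraction: float) -> float:
    """Share of all affective cues contributed by the top `fraction` of users."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(cue_counts)
    if total == 0:
        return 0.0
    # Rank users from most to fewest cues and take the top slice.
    ranked = sorted(cue_counts, reverse=True)
    k = math.ceil(len(ranked) * fraction)
    return sum(ranked[:k]) / total


# Synthetic per-user cue counts (illustrative only, not data from the paper).
counts = [40, 30, 5, 5, 4, 4, 3, 3, 3, 3]
share = top_share(counts, 0.2)  # top 2 of 10 users hold 70 of 100 cues
```

With these synthetic counts, the top 20% of users account for 70% of the cues. A value far above the fraction itself is what "disproportionate" means here.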

This review was generated by AI and checked by a human editor.