[Paper Review] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations
TL;DR: This paper introduces VAPT, a probe-based toolkit for studying how LLMs extract, embody, and explain users' values from casual conversations, evaluated through a month-long study with 20 participants followed by in-depth interviews.
Does AI understand human values? While this remains an open philosophical question, we take a pragmatic stance by introducing VAPT, the Value-Alignment Perception Toolkit, for studying how LLMs reflect people's values and how people judge those reflections. 20 participants texted a human-like chatbot over a month, then completed a 2-hour interview with our toolkit evaluating AI's ability to extract (pull details regarding), embody (make decisions guided by), and explain (provide proof of) human values. 13 participants left our study convinced that AI can understand human values. Participants found the experience insightful for self-reflection and found themselves getting persuaded by the AI's reasoning. Thus, we warn about "weaponized empathy": a potentially dangerous design pattern that may arise in value-aligned, yet welfare-misaligned AI. VAPT offers concrete artifacts and design implications to evaluate and responsibly build value-aligned conversational agents with transparency, consent, and safeguards as AI grows more capable and human-like into the future.
Research Motivation and Objectives
- Motivate and operationalize the study of AI value alignment from a user perspective using a practical toolkit (VAPT).
- Assess how AI can extract, embody, and explain individuals’ values from casual conversations.
- Examine user perceptions of AI-derived value models against self-reported values.
- Identify design implications for transparent, privacy-preserving, value-aligned conversational agents.
Proposed Method
- Introduce VAPT as a three-probe methodology to evaluate perceived value alignment in AI: extraction, embodiment, and explanation.
- Collect longitudinal casual chat data (approximately one month) from 20 participants, and use the Schwartz PVQ-RR questionnaire as a self-reported baseline of their values.
- Instantiate VAPT with a month-long chat with an AI, a PVQ-RR survey, and three interactive interview interfaces.
- Visualize extraction with Topic-Context Graphs showing topics and values with evidence trails.
- Evaluate embodiment by testing AI responses that speak as the user would, using different evidence bases.
- Assess explanation by comparing the AI’s reasoning and evidence against the participant’s self-understanding.
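
To make the idea of a Topic-Context Graph with evidence trails more concrete, here is a minimal sketch of one way such a structure could be represented. The summary does not describe the paper's actual data model, so every name here (`Evidence`, `TopicContextGraph`, `add_link`, `values_for_topic`, `evidence_trail`) is a hypothetical choice made for illustration, and the sample quotes are invented placeholders rather than study data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One chat excerpt supporting a topic-value link (hypothetical schema)."""
    message_id: int
    quote: str


@dataclass
class TopicContextGraph:
    """Illustrative only: maps (topic, value) links to their supporting evidence."""
    links: dict = field(default_factory=lambda: defaultdict(list))

    def add_link(self, topic: str, value: str, evidence: Evidence) -> None:
        # Each link accumulates the excerpts that justify it, in insertion order.
        self.links[(topic, value)].append(evidence)

    def values_for_topic(self, topic: str) -> list:
        # All values connected to a topic, sorted for stable display.
        return sorted({v for (t, v) in self.links if t == topic})

    def evidence_trail(self, topic: str, value: str) -> list:
        # Copy of the excerpts behind one link; empty if the link does not exist.
        return list(self.links.get((topic, value), []))


# Invented placeholder data, not taken from the study.
graph = TopicContextGraph()
graph.add_link("cooking", "self-direction", Evidence(12, "I like inventing my own recipes."))
graph.add_link("cooking", "tradition", Evidence(40, "I make my grandmother's soup every winter."))

for value in graph.values_for_topic("cooking"):
    for ev in graph.evidence_trail("cooking", value):
        print(f"cooking -> {value}: message {ev.message_id}: {ev.quote}")
```

Keeping the evidence attached to each link, rather than to the topic or the value alone, is what would let a participant trace an inferred value back to the specific messages behind it.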

Experimental Results
Research Questions
- RQ1: How well can an AI extract, embody, and explain a person's values from casual conversations?
- RQ2: How do participants perceive the AI's value inferences relative to their self-reported values?
- RQ3: What biases or failure modes emerge in extraction, embodiment, and explanation of values by AI?
- RQ4: What design implications arise for privacy, transparency, and user autonomy in value-aligned conversational agents?
Key Findings
- 13 of 20 participants were ultimately convinced that AI can understand human values.
- PVQ-RR-based value inferences by AI showed moderate alignment with self-reports at the aggregate level, with systematic biases observed (e.g., over-estimation of self-direction, under-estimation of tradition).
- Embodiment was better received when the AI matched how strongly a person would express a view, not just what they would say.
- Users praised the AI’s ability to surface new connections via value graphs but criticized overfitting, cultural nuance gaps, or archetype defaults.
- Risks identified include extraction of irrelevant topics, overconfident embodiment, and automation bias in explanations.
- The study proposes design implications centered on privacy, safeguards, and deliberate friction against misleading empathy, with the aim of preventing "weaponized empathy."
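
The systematic biases mentioned above (over-estimating some values, under-estimating others) can be illustrated with a simple per-value signed difference between AI-inferred and self-reported scores. This is only a sketch of one plausible way to quantify such bias; the summary does not state how the paper computed alignment, the function name `mean_signed_bias` is hypothetical, and the numbers below are invented placeholders, not study results.

```python
from statistics import mean


def mean_signed_bias(self_reports, ai_inferences):
    """Return {value: mean(ai_score - self_score)} across participants.

    Both arguments are lists with one dict per participant, mapping a value
    name to a score on the same numeric scale. A positive result means the AI
    tends to over-estimate that value; a negative result means it
    under-estimates it.
    """
    if len(self_reports) != len(ai_inferences):
        raise ValueError("Need one AI inference per self-report.")
    diffs = {}
    for own, inferred in zip(self_reports, ai_inferences):
        for value, score in own.items():
            diffs.setdefault(value, []).append(inferred[value] - score)
    return {value: mean(d) for value, d in diffs.items()}


# Invented placeholder scores for two hypothetical participants.
self_reports = [
    {"self-direction": 4, "tradition": 5},
    {"self-direction": 3, "tradition": 4},
]
ai_inferences = [
    {"self-direction": 5, "tradition": 3},
    {"self-direction": 5, "tradition": 3},
]

for value, bias in mean_signed_bias(self_reports, ai_inferences).items():
    direction = "over-estimated" if bias > 0 else "under-estimated" if bias < 0 else "unbiased"
    print(f"{value}: {bias:+.2f} ({direction})")
```

A signed mean is used here rather than an absolute error because the direction of the error is what distinguishes over-estimation from under-estimation.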

This review was generated by AI and checked by a human editor.