QUICK REVIEW

[Paper Review] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations

Bhada Yun, Renn Su|arXiv (Cornell University)|Jan 30, 2026
AI in Service Interactions | Citations: 0
One-Line Summary

tldr: This paper introduces VAPT, a probe-based toolkit to study how LLMs extract, embody, and explain users’ values from casual conversations, via a month-long study with 20 participants and in-depth interviews.

ABSTRACT

Does AI understand human values? While this remains an open philosophical question, we take a pragmatic stance by introducing VAPT, the Value-Alignment Perception Toolkit, for studying how LLMs reflect people's values and how people judge those reflections. 20 participants texted a human-like chatbot over a month, then completed a 2-hour interview with our toolkit evaluating AI's ability to extract (pull details regarding), embody (make decisions guided by), and explain (provide proof of) human values. 13 participants left our study convinced that AI can understand human values. Participants found the experience insightful for self-reflection and found themselves getting persuaded by the AI's reasoning. Thus, we warn about "weaponized empathy": a potentially dangerous design pattern that may arise in value-aligned, yet welfare-misaligned AI. VAPT offers concrete artifacts and design implications to evaluate and responsibly build value-aligned conversational agents with transparency, consent, and safeguards as AI grows more capable and human-like into the future.

Research Motivation and Objectives

  • Motivate and operationalize the study of AI value alignment from a user perspective using a practical toolkit (VAPT).
  • Assess how AI can extract, embody, and explain individuals’ values from casual conversations.
  • Examine user perceptions of AI-derived value models against self-reported values.
  • Identify design implications for transparent, privacy-preserving, value-aligned conversational agents.

Proposed Method

  • Introduce VAPT as a three-probe methodology to evaluate perceived value alignment in AI: extraction, embodiment, and explanation.
  • Collect longitudinal casual chat data (approximately one month) from 20 participants and use the Schwartz PVQ-RR survey as a self-reported baseline of each participant's values.
  • Instantiate VAPT with a month-long chat with an AI, a PVQ-RR survey, and three interactive interview interfaces.
  • Visualize extraction with Topic-Context Graphs showing topics and values with evidence trails.
  • Evaluate embodiment by testing AI responses that speak as the user would, using different evidence bases.
  • Assess explanation by comparing the AI’s reasoning and evidence against the participant’s self-understanding.
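One way to picture the Topic-Context Graph described above is as a set of topic nodes, each tied to a life context, a list of candidate values, and the chat excerpts that support it. The following is a minimal sketch under that assumption, not the toolkit's actual implementation: the class names, fields, and the `evidence_trail` helper are all hypothetical, and the sample topic and quote are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """A chat excerpt that supports a topic (hypothetical structure)."""
    message_id: int
    quote: str


@dataclass
class TopicNode:
    """A topic extracted from chat, tied to a life context and candidate values."""
    name: str
    context: str  # e.g. "People" or "Lifestyle"
    values: list = field(default_factory=list)
    evidence: list = field(default_factory=list)


class TopicContextGraph:
    """Toy container for topics; not the structure used in the paper."""

    def __init__(self):
        self.topics = {}

    def add_topic(self, name, context, values=()):
        # Reuse an existing node so repeated mentions accumulate values.
        node = self.topics.setdefault(name, TopicNode(name, context))
        for value in values:
            if value not in node.values:
                node.values.append(value)
        return node

    def add_evidence(self, topic, message_id, quote):
        self.topics[topic].evidence.append(Evidence(message_id, quote))

    def evidence_trail(self, value):
        """Return (topic, quote) pairs for every topic linked to `value`."""
        return [
            (node.name, ev.quote)
            for node in self.topics.values()
            if value in node.values
            for ev in node.evidence
        ]


# Invented example data, for illustration only.
graph = TopicContextGraph()
graph.add_topic("family dinners", "People", ["tradition"])
graph.add_evidence("family dinners", 12, "We always eat together on Sundays.")
trail = graph.evidence_trail("tradition")
```

Keeping the evidence attached to each topic node is what lets a participant trace an inferred value back to the messages behind it, which is the property the extraction probe relies on.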
Figure 1. We study how people experience AI’s attempts to understand their values through three capabilities. Left: AI extracts values from chat conversations, visualized as a Topic-Context Graph showing what matters to users with evidence trails. Middle: AI embodies values by attempting to respond

Experimental Results

Research Questions

  • RQ1: How well can an AI extract, embody, and explain a person’s values from casual conversations?
  • RQ2: How do participants perceive the AI’s value inferences relative to their self-reported values?
  • RQ3: What biases or failure modes emerge in extraction, embodiment, and explanation of values by AI?
  • RQ4: What design implications arise for privacy, transparency, and user autonomy in value-aligned conversational agents?

Key Findings

  • 13 of 20 participants were ultimately convinced that AI can understand human values.
  • PVQ-RR-based value inferences by AI showed moderate alignment with self-reports at the aggregate level, with systematic biases observed (e.g., over-estimation of self-direction, under-estimation of tradition).
  • Embodiment was better received when the AI matched how strongly a person would express a view, not just what they would say.
  • Users praised the AI’s ability to surface new connections via value graphs but criticized overfitting, cultural nuance gaps, or archetype defaults.
  • Risks identified include extraction of irrelevant topics, overconfident embodiment, and automation bias in explanations.
  • The study proposes design implications focused on privacy, safeguards, and friction against misleading empathy to prevent weaponized empathy.
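To make the alignment finding concrete, the sketch below shows one plausible way to compare AI-inferred value scores with self-reported ones: a signed per-value difference (positive means the AI over-estimates) and a Pearson correlation across values. This is an assumption about how such a comparison could be done, not the paper's analysis. The function names are hypothetical, and the scores are invented toy numbers on an assumed 1 to 6 scale, not study data.

```python
from math import sqrt


def signed_bias(self_report, ai_inferred):
    """Per-value difference: positive means the AI over-estimates the value."""
    return {value: ai_inferred[value] - self_report[value] for value in self_report}


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented toy scores for one hypothetical participant (NOT data from the study).
self_report = {"self-direction": 4.0, "tradition": 3.5, "benevolence": 5.0, "security": 4.5}
ai_inferred = {"self-direction": 5.0, "tradition": 2.5, "benevolence": 5.0, "security": 4.0}

bias = signed_bias(self_report, ai_inferred)
values = list(self_report)
alignment = pearson([self_report[v] for v in values], [ai_inferred[v] for v in values])
```

In this toy case the bias is positive for self-direction and negative for tradition, mirroring the direction of the systematic biases reported above, while the correlation summarizes overall agreement across values.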
Figure 2. (Topic-Context Graph from Stage 1) A sample from two anonymous participants who shared various things with Day, their chatbot, over the course of several weeks. Colored nodes represent topics extracted from chat histories, positioned near their associated life contexts (People, Lifestyle,

This review was generated by AI and checked by a human editor.