QUICK REVIEW

[Paper Digest] From Prompts to Worlds: How Users Iterate, Explore, and Make Sense of AI-Generated 3D Environments

Aung Pyae|arXiv (Cornell University)|Jan 24, 2026

Social Robot Interaction and HRI | Cited by 0

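One-Sentence Summary

This study empirically examines how users interact with a commercial text-to-3D platform, revealing a language-to-space gap, episodic presence, and iteration barriers that shape sensemaking in AI-generated 3D environments.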

ABSTRACT

Text-to-3D generative AI systems create navigable environments from natural language prompts, but unlike text-to-image generation, evaluation requires embodied exploration of spatial coherence, scale, and navigability. We present the first empirical study of a commercial text-to-3D platform, combining think-aloud protocols, behavioral observation, and validated measures of usability, presence, and engagement. We report three findings. First, asymmetric expressibility: users readily convey semantic intent (themes, atmosphere) but struggle to specify spatial structure (layout, scale), reflecting a language-to-space limitation rather than a skill deficit. Second, episodic presence: immersion arises when expectations align with outputs but does not accumulate into sustained place illusion. Third, structural iteration breakdowns: refinement fails due to interaction barriers - poor discoverability, opaque feedback, and high temporal costs - rather than user limitations. Together, these dynamics form a reinforcing cycle in which spatial mismatches persist, producing episodic presence and ongoing sensemaking. We reframe text-to-3D interaction as negotiated meaning-making rather than linear prompting, and argue that effective systems require hybrid input modalities, transparent feedback, and low-cost iteration.

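Research Motivation and Objectives

Understand how and why users translate natural-language prompts into navigable 3D spaces
Examine how users iterate on, explore, and make sense of AI-generated 3D environments through embodied tasks
Identify the cognitive and interaction barriers that affect usability, presence, and engagement in text-to-3D systems
Propose design implications (hybrid input modalities, transparent feedback, and low-cost iteration) to improve the user experience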

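Proposed Method

Combine think-aloud protocols with behavioral observation while participants interact with a commercial text-to-3D platform
Assess user experience with validated measures of usability, presence, and engagement
Analyze the relationship between expressing semantic intent and specifying spatial structure to identify language-to-space limitations
Characterize the episodic emergence of presence and its relationship to the alignment between expectations and outputs
Identify breakdown points during refinement and their underlying causes, such as poor discoverability and opaque feedback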

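Results

Research Questions

RQ1: When writing text-to-3D prompts, how do users express semantic intent and spatial structure?
RQ2: What patterns of presence and immersion arise in AI-generated 3D environments, and how do they relate to the alignment between expectations and outputs?
RQ3: Which interaction barriers hinder systematic refinement and iteration in text-to-3D tools?
RQ4: Which design changes could mitigate the language-to-space gap and support lower-cost, more transparent iteration?
RQ5: How should text-to-3D interaction be framed in terms of meaning-making, rather than as linear prompting?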

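Key Findings

Users readily convey semantic themes and atmosphere but struggle to specify spatial layout and scale
Immersion (episodic presence) arises when outputs align with expectations, but it does not accumulate into sustained place illusion
Refinement fails because of interaction barriers (poor discoverability, opaque feedback, and high temporal costs) rather than user limitations
These dynamics form a reinforcing cycle: spatial mismatches persist, producing episodic presence and ongoing sensemaking
The study argues for treating text-to-3D interaction as negotiated meaning-making and proposes hybrid input, transparent feedback, and low-cost iteration as design directions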

This digest was generated by AI and reviewed by human editors.