QUICK REVIEW

[Paper Review] Text-to-Image Generation: Perceptions and Realities

Jonas Oppenlaender, Aku Visuri|arXiv (Cornell University)|Mar 10, 2023

Virtual Reality Applications and Impacts | Cited by 15

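One-Sentence Summary

This paper reports a survey of how people perceive text-to-image generation. Participants were aware of the technology's risks but rarely saw it as a threat to themselves, and those with prior exposure to the technology expected it to matter less in the future.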

ABSTRACT

Generative AI is an emerging technology that will have a profound impact on society and individuals. Only a decade ago, it was thought that creative work would be among the last to be automated - yet today, we see AI encroaching on creative domains. In this paper, we present the key findings of a survey study on people's perceptions of text-to-image generation. We touch on participants' technical understanding of the emerging technology, their ideas for potential application areas, as well as concerns, risks, and dangers of text-to-image generation to society and the individual. The study found that participants were aware of the risks and dangers associated with the technology, but only few participants considered the technology to be a risk to themselves. Additionally, those who had tried the technology rated its future importance lower than those who had not.

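Research Motivation and Objectives

Assess how people understand text-to-image generation and whether they distinguish training from inference.
Identify the application areas the public considers promising.
Assess concerns about text-to-image generation and the perceived risks to society and to the individual.
Explore how important participants expect the technology to be for their professions in the future.
Examine attitudes toward disclosure, attribution, and copyright for AI-generated images.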

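Method

An online survey conducted in autumn 2022 at a "Researchers' Night" event, with 35 participants.
The questionnaire contained 26 items, 3 of which were open-ended.
Open-ended responses were analyzed qualitatively by coding the raw answers.
Because the amount of data was modest and the coding straightforward, inter-rater agreement was not assessed.
Participants were 19–50 years old (mean 33.7) and had diverse educational backgrounds.
About 34.3% reported having used text-to-image generation; commonly used tools included DALL-E Mini/Craiyon, DALL-E 2, Dream/Wombo, and Stable Diffusion.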
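The reported share of prior users can be turned back into a head count with simple arithmetic. The sketch below is a derivation from the two figures stated above (35 participants, about 34.3%), not a number quoted from the paper; the function names are illustrative.

```python
# Figures taken from the summary above.
N_PARTICIPANTS = 35
REPORTED_SHARE = 0.343  # "about 34.3%"


def implied_count(n: int, share: float) -> int:
    """Return the whole-number head count closest to share * n."""
    return round(share * n)


def share_percent(count: int, n: int) -> float:
    """Return count / n as a percentage rounded to one decimal place."""
    return round(100 * count / n, 1)


# Derived value (an inference from the reported percentage, not a quoted figure).
prior_users = implied_count(N_PARTICIPANTS, REPORTED_SHARE)

if __name__ == "__main__":
    print(prior_users, share_percent(prior_users, N_PARTICIPANTS))
    # Neighbouring head counts give clearly different percentages,
    # so the rounded share pins down a single whole number.
    for k in (prior_users - 1, prior_users + 1):
        print(k, share_percent(k, N_PARTICIPANTS))
```

With a sample this small, each participant accounts for nearly three percentage points, which is worth keeping in mind when reading the group comparisons in the findings.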

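Results

Research Questions

RQ1: What is the public's technical understanding of text-to-image generation, including the distinction between training and inference?
RQ2: Which application areas do people envision for text-to-image generation (art, media, education, and so on)?
RQ3: What are the perceived risks and societal impacts of text-to-image generation (misinformation, job loss, copyright, diversity, and so on)?
RQ4: How important is text-to-image generation to participants' current and future professional practice?
RQ5: Should AI-generated images be disclosed as such, and who should be credited for AI-generated works?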

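Key Findings

Most participants could not clearly explain how text-to-image generation works and often confused training with inference.
Envisioned applications included artwork, illustration, brainstorming, marketing, design, entertainment, and education; non-creative uses received less emphasis.
Most respondents considered the technology unimportant to their profession at present but thought it could become more important in the future.
Those who had previously tried text-to-image generation rated its future importance lower than those who had not.
Concerns centered on societal risks: misinformation through deepfakes, job loss, unclear copyright, a possible decline in appreciation for human creators, and cultural-diversity bias in synthetic images.
About half of the participants supported disclosing that an artwork is AI-generated; few thought AI-generated works should go unlabeled.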


This review was generated by AI and reviewed by human editors.