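[Paper Summary] Using Text-to-Image Generation for Architectural Design Ideation
This study investigates how text-to-image generators (Midjourney, Stable Diffusion, DALL-E) can support creativity in the early conceptual stage of architectural design. A laboratory study with architecture students reveals both potential benefits and challenges.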
The recent progress of text-to-image generation has been recognized in architectural design. Our study is the first to investigate the potential of text-to-image generators in supporting creativity during the early stages of the architectural design process. We conducted a laboratory study with 17 architecture students, who developed a concept for a culture center using three popular text-to-image generators: Midjourney, Stable Diffusion, and DALL-E. Through standardized questionnaires and group interviews, we found that image generation could be a meaningful part of the design process when design constraints are carefully considered. Generative tools support serendipitous discovery of ideas and an imaginative mindset, enriching the design process. We identified several challenges of image generators and provided considerations for software development and educators to support creativity and emphasize designers' imaginative mindset. By understanding the limitations and potential of text-to-image generators, architects and designers can leverage this technology in their design process and education, facilitating innovation and effective communication of concepts.
Research Motivation and Goals
- Assess whether out-of-the-box text-to-image generators can support creativity during the fuzzy front end of architectural design.
- Examine how generators influence ideation, idea discovery, and imaginative thinking in concept development.
- Identify challenges and provide practical considerations for software developers and educators to foster creativity.
Proposed Method
- Three-session laboratory study with 17 architecture students designing a culture center concept.
- Participants used three generators (Midjourney, Stable Diffusion, DALL-E) with basic prompts and no advanced features.
- Data collected via Creativity Support Index (CSI) surveys and semi-structured group interviews.
- Prompts and sequences were analyzed to understand prompt strategy and language.
- Qualitative analysis of participant discussions to extract insights on creativity support and tool limitations.
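The Creativity Support Index mentioned above is typically scored by weighting each factor's agreement score by how often the participant picked that factor in paired comparisons. The sketch below follows the standard published CSI scoring scheme (factor scores from 0 to 20, factor counts from 0 to 5 summing to 15, total divided by 3 to give a 0 to 100 scale). It is an illustration, not the authors' analysis code; the function name `csi_score` and the example participant are hypothetical.

```python
from typing import Dict

# The six CSI factors, named as in the results table of this summary.
FACTORS = (
    "Collaboration",
    "Enjoyment",
    "Exploration",
    "Expressiveness",
    "Immersion",
    "ResultsWorthEffort",
)


def csi_score(factor_scores: Dict[str, float], factor_counts: Dict[str, int]) -> float:
    """Return a 0-100 CSI score for one participant (standard CSI scoring, assumed here).

    factor_scores: per-factor agreement score, 0-20 (sum of two 0-10 items).
    factor_counts: times each factor was chosen in the 15 paired comparisons, 0-5.
    """
    if sum(factor_counts[f] for f in FACTORS) != 15:
        raise ValueError("factor counts must sum to 15 paired comparisons")
    weighted_sum = sum(factor_scores[f] * factor_counts[f] for f in FACTORS)
    return weighted_sum / 3.0


# Hypothetical participant, invented purely for illustration.
example_scores = {
    "Collaboration": 6, "Enjoyment": 17, "Exploration": 14,
    "Expressiveness": 15, "Immersion": 12, "ResultsWorthEffort": 16,
}
example_counts = {
    "Collaboration": 0, "Enjoyment": 3, "Exploration": 5,
    "Expressiveness": 3, "Immersion": 2, "ResultsWorthEffort": 2,
}
print(csi_score(example_scores, example_counts))
```

Because a factor with a count of zero contributes nothing, a low rating on a factor that participants consider unimportant for the task does not pull the overall score down.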
Experimental Results
Research Questions
- RQ1: How can text-to-image generators support creativity and ideation in early-stage architectural design?
- RQ2: How effective are out-of-the-box text-to-image generators for architectural design, and what future considerations should developers adopt?
- RQ3: What are the typical challenges of using text-to-image generators and prompting for novice users?
Main Findings
| Subfactor | Mean Factor Counts (StDev) | Mean Factor Score (StDev) | Mean Weighted Factor Score (StDev) |
|---|---|---|---|
| Collaboration | 0.47 (0.87) | 6.88 (5.24) | 3.24 (4.58) |
| Enjoyment | 2.53 (1.18) | 17.1 (1.96) | 43.30 (2.32) |
| Exploration | 4.41 (0.62) | 14.1 (2.87) | 62.28 (1.77) |
| Expressiveness | 3.00 (1.27) | 14.9 (3.21) | 44.82 (4.09) |
| Immersion | 2.12 (1.17) | 11.5 (3.89) | 24.29 (4.54) |
| ResultsWorthEffort | 2.47 (1.55) | 15.7 (2.42) | 38.80 (3.74) |
- Image generation can be a meaningful part of early design when design constraints and imaginative ideation are considered.
- Generative tools support serendipitous idea discovery and an imaginative mindset, enriching the design process.
- CSI results show no significant differences in creativity support across the three tools, but subfactors vary in importance (Exploration and Enjoyment are prominent).
- Prompts tended to be descriptive and aligned with the design brief (e.g., floorplan, facade). Prompt sequences averaged about 8 prompts per idea, and the longest ran to 24 prompts.
- Floorplans were challenging to generate and often produced non-traditional, color-rich or 3D representations rather than standard drawings; facade materials were harder to generate than interior views.
- Participants valued features such as the potential for editing and refinement, constraint incorporation, and integration with traditional CAD-like workflows.
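As a quick consistency check on the CSI table in this section, the snippet below sums the reported mean factor counts and mean weighted factor scores. It assumes the standard CSI convention (15 paired comparisons per participant, overall score equal to the weighted sum divided by 3) and that the table pools all participants. The resulting overall value is implied arithmetic only, not a figure reported in this summary.

```python
# Mean values copied from the CSI table above.
mean_factor_counts = {
    "Collaboration": 0.47, "Enjoyment": 2.53, "Exploration": 4.41,
    "Expressiveness": 3.00, "Immersion": 2.12, "ResultsWorthEffort": 2.47,
}
mean_weighted_scores = {
    "Collaboration": 3.24, "Enjoyment": 43.30, "Exploration": 62.28,
    "Expressiveness": 44.82, "Immersion": 24.29, "ResultsWorthEffort": 38.80,
}

# Each participant makes 15 paired comparisons, so the mean counts should sum to 15.
total_counts = sum(mean_factor_counts.values())

# The mean of a sum equals the sum of the means, so dividing the summed mean
# weighted scores by 3 gives the implied mean overall CSI (an assumption-based estimate).
implied_mean_csi = sum(mean_weighted_scores.values()) / 3

print(round(total_counts, 2), round(implied_mean_csi, 2))
```

The counts summing to 15 suggests the table is internally consistent with the standard paired-comparison design. Exploration carries the largest weighted share, which matches the finding that exploration was the most prominent subfactor.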
This summary was generated by AI and reviewed by a human editor.