QUICK REVIEW

[Paper Review] A Large-Scale Survey on the Usability of AI Programming Assistants: Successes and Challenges

Jenny T. Liang, Chenyang Yang | arXiv (Cornell University) | Mar 30, 2023

Software Engineering Research | Cited by 13

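One-Sentence Summary

A large-scale survey of 410 developers that examines why and how they use AI programming assistants, the main usability challenges they face, and strategies for improvement.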

ABSTRACT

The software engineering community recently has witnessed widespread deployment of AI programming assistants, such as GitHub Copilot. However, in practice, developers do not accept AI programming assistants' initial suggestions at a high frequency. This leaves a number of open questions related to the usability of these tools. To understand developers' practices while using these tools and the important usability challenges they face, we administered a survey to a large population of developers and received responses from a diverse set of 410 developers. Through a mix of qualitative and quantitative analyses, we found that developers are most motivated to use AI programming assistants because they help developers reduce key-strokes, finish programming tasks quickly, and recall syntax, but resonate less with using them to help brainstorm potential solutions. We also found the most important reasons why developers do not use these tools are because these tools do not output code that addresses certain functional or non-functional requirements and because developers have trouble controlling the tool to generate the desired output. Our findings have implications for both creators and users of AI programming assistants, such as designing minimal cognitive effort interactions with these tools to reduce distractions for users while they are programming.

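Research Motivation and Objectives

Motivate the study by examining real-world practices and the usability gaps of AI programming assistants such as Copilot.
Quantify adoption, usage patterns, and perceived benefits across a diverse population of developers.
Identify the key usability challenges that hinder adoption and effective use.
Provide design implications that reduce cognitive load and give users better control over tool output.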

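Proposed Method

Recruited participants from GitHub repositories related to AI programming assistants and invited them by email, yielding 410 respondents.
Administered a 15-minute survey in Qualtrics containing closed-ended questions and open-ended responses.
Combined quantitative frequency analysis with qualitative open coding of the responses.
Reported item frequencies and importance ratings following best practices for survey analysis.
Applied open coding to the open-ended responses to extract recurring usability themes.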
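
The frequency analysis described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `item_frequencies`, the answer options, and all values below are hypothetical and do not come from the paper. It only shows how the share of each answer to a closed-ended item, and a median of self-reported percentages, might be computed.

```python
from collections import Counter
from statistics import median


def item_frequencies(responses):
    """Return each answer option's share of the responses as a fraction of the total."""
    counts = Counter(responses)
    total = len(responses)
    return {option: count / total for option, count in counts.items()}


# Hypothetical answers to one closed-ended item (not data from the paper).
answers = ["always", "often", "often", "sometimes", "never", "often"]
freqs = item_frequencies(answers)

# Hypothetical self-reported percentages of code written with an assistant.
shares = [10, 20, 40, 60]
median_share = median(shares)

for option, share in sorted(freqs.items()):
    print(f"{option}: {share:.0%}")
print(f"median share: {median_share}")
```

The median is used here rather than the mean because self-reported percentages tend to be skewed, and a few extreme answers would otherwise dominate the summary.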

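Results

Research Questions

RQ1: What drives developers to use AI programming assistants, and what keeps them from using them?
RQ2: What are the most prominent usability issues encountered when using AI programming assistants?
RQ3: How do developers understand, evaluate, and modify the generated code, and when do they give up on it?
RQ4: What strategies do developers use to get useful output from these tools, and what feedback could improve them?

Key Findings

GitHub Copilot users reported that a median of 30.5% of their code was written using the tool.
The most important motivations were reducing keystrokes, finishing tasks faster, and recalling syntax.
The main reasons for not using the tools were output that does not meet requirements and difficulty controlling the tool.
The most prominent usability issues were not knowing which inputs influence the output, giving up on the generated code, and difficulty controlling the model.
Users successfully generated repetitive code and code with simple logic; complex algorithms often did not receive effective assistance.
Strategies used by participants included providing clear, explicit explanations, adding context, following conventions, and breaking prompts down to get better output.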
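
As an illustration of these strategies, the sketch below shows how a task might be framed for an assistant through comments: context is stated up front, names follow ordinary Python conventions, and the task is split into small steps. The task, the column names, and the function names `parse_rows` and `mean_score` are hypothetical and not taken from the paper; the function bodies stand in for what an assistant might plausibly complete.

```python
import csv
from io import StringIO

# Context for the assistant: the input is CSV text with the columns "name" and "score".

# Step 1: parse the CSV text into a list of dictionaries, one per row.
def parse_rows(csv_text: str) -> list[dict[str, str]]:
    return list(csv.DictReader(StringIO(csv_text)))


# Step 2: compute the mean of the "score" column; return 0.0 when there are no rows.
def mean_score(rows: list[dict[str, str]]) -> float:
    if not rows:
        return 0.0
    return sum(float(row["score"]) for row in rows) / len(rows)


rows = parse_rows("name,score\nada,1\ngrace,3\n")
print(mean_score(rows))
```

Each comment describes one narrow step, so a suggestion that misses the intent can be rejected and re-prompted without discarding the rest of the file.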

This review was generated by AI and reviewed by a human editor.