QUICK REVIEW

[Paper Review] A Large-Scale Survey on the Usability of AI Programming Assistants: Successes and Challenges

Jenny T. Liang, Chenyang Yang | arXiv (Cornell University) | 2023. 03. 30.

Software Engineering Research | Citations: 13

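One-Line Summary

A large-scale survey of 410 developers that investigates why and how they use AI programming assistants, the main usability problems they run into, and the strategies they use to get better results.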

ABSTRACT

The software engineering community recently has witnessed widespread deployment of AI programming assistants, such as GitHub Copilot. However, in practice, developers do not accept AI programming assistants' initial suggestions at a high frequency. This leaves a number of open questions related to the usability of these tools. To understand developers' practices while using these tools and the important usability challenges they face, we administered a survey to a large population of developers and received responses from a diverse set of 410 developers. Through a mix of qualitative and quantitative analyses, we found that developers are most motivated to use AI programming assistants because they help developers reduce key-strokes, finish programming tasks quickly, and recall syntax, but resonate less with using them to help brainstorm potential solutions. We also found the most important reasons why developers do not use these tools are because these tools do not output code that addresses certain functional or non-functional requirements and because developers have trouble controlling the tool to generate the desired output. Our findings have implications for both creators and users of AI programming assistants, such as designing minimal cognitive effort interactions with these tools to reduce distractions for users while they are programming.

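Research Motivation and Goals

Understand the real-world usage practices and usability gaps of AI programming assistants such as Copilot, in order to motivate further research.
Quantify adoption, usage patterns, and perceived benefits across a diverse set of developers.
Identify the main usability problems that hinder adoption and productive use.
Present design implications for reducing cognitive load and improving control over the tools' output.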

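Proposed Method

Recruit participants from GitHub repositories related to AI programming assistants and invite them by email; 410 developers responded.
Administer a 15-minute Qualtrics survey that combines closed-ended and open-ended questions.
Combine quantitative frequency analysis of the responses with qualitative open coding.
Follow survey-analysis best practices to report item frequencies and importance ratings.
Apply open coding to the open-ended responses to extract recurring usability themes.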
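
The analysis steps above can be sketched in a few lines of Python. This is only an illustration of frequency analysis and theme tallying in general, not the authors' actual pipeline: the rating scale, the item names, the codes, and all of the data below are made up for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical 4-point importance scale (an assumption, not the paper's wording).
LIKERT_SCALE = ["Not important", "Slightly important", "Important", "Very important"]

# Made-up closed-ended responses, keyed by survey item.
responses = {
    "fewer keystrokes": ["Very important", "Important", "Very important", "Slightly important"],
    "brainstorm solutions": ["Not important", "Slightly important", "Important", "Not important"],
}

# Made-up open-ended answers after open coding: each answer carries a set of codes.
coded_answers = [
    {"hard to control", "misses requirements"},
    {"hard to control"},
    {"unclear which input matters"},
]


def rating_frequencies(ratings):
    """Return the share of responses at each scale level."""
    counts = Counter(ratings)
    return {level: counts.get(level, 0) / len(ratings) for level in LIKERT_SCALE}


def share_at_least(ratings, threshold="Important"):
    """Return the share of responses rated at or above the threshold level."""
    cutoff = LIKERT_SCALE.index(threshold)
    hits = sum(1 for r in ratings if LIKERT_SCALE.index(r) >= cutoff)
    return hits / len(ratings)


def theme_counts(coded):
    """Count how many answers mention each code."""
    return Counter(code for codes in coded for code in codes)


if __name__ == "__main__":
    for item, ratings in responses.items():
        print(item, rating_frequencies(ratings), share_at_least(ratings))
    print(theme_counts(coded_answers).most_common())
    # A numeric item (e.g. a self-reported percentage) can be summarised by its median.
    print(median([5, 20, 40, 75]))
```

Reporting the share of ratings at or above a threshold makes items easy to rank against each other, and counting codes per answer (rather than per mention) keeps one long answer from dominating a theme.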

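Experimental Results

Research Questions

RQ1: What motivates developers to use AI programming assistants, and what keeps them from using them?
RQ2: What are the most prominent usability problems when using AI programming assistants?
RQ3: How do developers understand, evaluate, and modify the generated code, and when do they give up on it?
RQ4: What strategies do developers use to get helpful output from these tools, and what feedback could improve it?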

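Key Findings

GitHub Copilot users report that a median of 30.5% of their code is written with the tool's help.
The most important motivations are reducing keystrokes, finishing tasks faster, and recalling syntax.
The main reasons for not using the tools are that the output does not meet requirements and that the tool is hard to control.
The biggest usability problems are not knowing which part of the input influences the output, giving up on the generated code, and difficulty controlling the model.
Users successfully generate repetitive code and code with simple logic, but get little effective help with complex algorithms.
To get better output, participants use strategies such as giving clear and specific descriptions, adding context, following conventions, and breaking prompts down into smaller parts.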
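
As a hypothetical illustration of the prompting strategies listed above (not an example taken from the paper), the snippet below contrasts a vague comment prompt with one that is specific, adds context, follows naming and typing conventions, and is broken into steps. The function name and behaviour are assumptions chosen for the example, and the body is simply one plausible completion of such a prompt.

```python
from datetime import date

# Vague prompt a developer might start with:
#   "parse the dates"
#
# More specific, decomposed prompt: a typed signature, a docstring that states
# the input format and edge cases, and step-by-step comments.


def parse_iso_dates(lines: list[str]) -> list[date]:
    """Parse dates in YYYY-MM-DD form, one per line.

    Context: blank lines and lines starting with '#' are skipped.
    Convention: unparsable input raises ValueError rather than being dropped.
    """
    parsed = []
    for raw in lines:
        # Step 1: strip surrounding whitespace.
        text = raw.strip()
        # Step 2: skip blank lines and comment lines.
        if not text or text.startswith("#"):
            continue
        # Step 3: parse the date; date.fromisoformat raises ValueError on bad input.
        parsed.append(date.fromisoformat(text))
    return parsed


if __name__ == "__main__":
    print(parse_iso_dates(["2023-03-30", "", "# note", " 2023-04-01 "]))
```

Spelling out the format, the edge cases, and the error behaviour leaves less for the assistant to guess and gives the developer a concrete specification to check the suggestion against.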


This review was generated by AI and reviewed by a human editor.