QUICK REVIEW

[Paper Review] What is it like to program with artificial intelligence?

Advait Sarkar, Andrew Gordon | arXiv (Cornell University) | 2022. 08. 12.

Spreadsheets and End-User Computing | Citations: 46

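One-Line Summary

This paper examines how LLM-assisted programming differs from traditional programmer assistance, discusses its reliability and usability, and reports on end-user spreadsheet programming with LLMs.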

ABSTRACT

Large language models, such as OpenAI's codex and Deepmind's AlphaCode, can generate code to solve a variety of problems expressed in natural language. This technology has already been commercialised in at least one widely-used programming editor extension: GitHub Copilot. In this paper, we explore how programming with large language models (LLM-assisted programming) is similar to, and differs from, prior conceptualisations of programmer assistance. We draw upon publicly available experience reports of LLM-assisted programming, as well as prior usability and design studies. We find that while LLM-assisted programming shares some properties of compilation, pair programming, and programming via search and reuse, there are fundamental differences both in the technical possibilities as well as the practical experience. Thus, LLM-assisted programming ought to be viewed as a new way of programming with its own distinct properties and challenges. Finally, we draw upon observations from a user study in which non-expert end user programmers use LLM-assisted tools for solving data tasks in spreadsheets. We discuss the issues that might arise, and open research challenges, in applying large language models to end-user programming, particularly with users who have little or no programming expertise.

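Research Motivation and Objectives

Assess where LLM-assisted programming aligns with, and where it departs from, existing paradigms of programmer assistance.
Summarize the capabilities, limitations, and reliability issues of code-generating LLMs in real-world tools such as Copilot.
Review programmer usability studies, design studies, and experience reports to identify the key usability issues.
Discuss the implications of LLM-assisted tools for end-user programming and spreadsheet work.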

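Proposed Method

Synthesize publicly available experience reports of LLM-assisted programming.
Draw on usability and design studies that compare AI-assisted tools with traditional methods.
Discuss the safety, reliability, and security implications of code generation models.
Compare LLM-assisted programming with metaphors such as search, compilation, and pair programming.
Summarize findings from a study of end-user programming in spreadsheets.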

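Experimental Results

Research Questions

RQ1: What are the key similarities and differences between LLM-assisted programming and traditional programmer assistance (e.g., compilation, pair programming, search and reuse)?
RQ2: What usability, reliability, and safety issues arise when LLMs are used for code generation in practice?
RQ3: How do end-user programmers in non-traditional domains (e.g., spreadsheets) experience LLM-assisted programming?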

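Key Findings

LLM-assisted programming shares some properties with compilation, pair programming, and programming via search, but differs fundamentally in both its technical possibilities and the user experience.
Code generation models can produce entire function bodies and test cases, but the code they produce may be copied from training data, incorrect, or out of scope.
Usability studies show that prompt construction and debugging are the main challenges, and that while task time may decrease, the effects on correctness are mixed.
The end-user programming study reveals issues with intent specification, code correctness, and comprehension, suggesting a need for better onboarding and tooling for non-experts.
Commercial tools such as GitHub Copilot show benefits as a starting point and for API discovery, but their effectiveness depends on prompt quality and on how users handle the generated code.
Reliability and safety concerns include the propagation of vulnerabilities, over-reliance, and the need to code-review and test AI-generated code.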

This review was generated by AI and reviewed by a human editor.