QUICK REVIEW

[Paper Review] How Novices Use LLM-Based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment

Majeed Kazemitabaar, Xinying Hou|arXiv (Cornell University)|2023. 09. 25.

Software Engineering Research | Citations: 9

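One-Line Summary

This study analyzes how 33 novice Python learners used an OpenAI Codex-based AI code generator while completing 45 CS1 tasks in a self-paced environment, and identifies usage patterns, prompt styles, properties of the AI-generated code, and four coding approaches.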

ABSTRACT

As Large Language Models (LLMs) gain in popularity, it is important to understand how novice programmers use them. We present a thematic analysis of 33 learners, aged 10-17, independently learning Python through 45 code-authoring tasks using Codex, an LLM-based code generator. We explore several questions related to how learners used these code generators and provide an analysis of the properties of the written prompts and the generated code. Specifically, we explore (A) the context in which learners use Codex, (B) what learners are asking from Codex, (C) properties of their prompts in terms of relation to task description, language, and clarity, and prompt crafting patterns, (D) the correctness, complexity, and accuracy of the AI-generated code, and (E) how learners utilize AI-generated code in terms of placement, verification, and manual modifications. Furthermore, our analysis reveals four distinct coding approaches when writing code with an AI code generator: AI Single Prompt, where learners prompted Codex once to generate the entire solution to a task; AI Step-by-Step, where learners divided the problem into parts and used Codex to generate each part; Hybrid, where learners wrote some of the code themselves and used Codex to generate others; and Manual coding, where learners wrote the code themselves. The AI Single Prompt approach resulted in the highest correctness scores on code-authoring tasks, but the lowest correctness scores on subsequent code-modification tasks during training. Our results provide initial insight into how novice learners use AI code generators and the challenges and opportunities associated with integrating them into self-paced learning environments. We conclude with various signs of over-reliance and self-regulation, as well as opportunities for curriculum and tool development.

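Research Motivation and Goals

Understand when and why novice learners use an AI code generator during CS1 tasks in a self-paced environment.
Characterize the prompts novices write to interact with Codex and determine how those prompts relate to the task descriptions.
Analyze the properties of the AI-generated code (correctness, complexity, alignment with the curriculum) and examine how learners integrate it.
Identify the common coding approaches used alongside AI generation and their effect on learning outcomes.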

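Proposed Method

The authors perform a thematic analysis of log data from 33 novice learners (aged 10-17) who used Codex across 45 Python coding tasks in Coding Steps.
The data sources are timestamped logs of code edits, console runs, AI generation prompts and outputs, and task submissions.
A custom log-analysis interface helps visualize and replay each student's actions in vertical chronological order.
The researchers apply deductive and inductive thematic analysis to categorize Codex usage by context, prompt properties, AI-generated code properties, and usage patterns.
Inter-rater reliability for applying the codebook reached 0.87 (alpha) in the initial coding round.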
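
The log data described above can be pictured with a minimal sketch. The four event kinds mirror the logged sources named in this review (code edits, console runs, AI prompts with their outputs, and task submissions). Everything else is an assumption made for illustration: the names `EventKind`, `LogEvent`, `timeline`, and `render`, the field layout, and the timestamp unit are hypothetical and are not taken from Coding Steps or the authors' analysis tool.

```python
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    """The four logged sources named in the review."""
    CODE_EDIT = "code_edit"
    CONSOLE_RUN = "console_run"
    AI_PROMPT = "ai_prompt"      # prompt text plus the generated output
    SUBMISSION = "submission"


@dataclass(frozen=True)
class LogEvent:
    """One timestamped log entry (hypothetical schema)."""
    learner_id: str
    task_id: int
    timestamp: float             # seconds since session start (assumed unit)
    kind: EventKind
    payload: str = ""            # e.g. prompt text or edited code


def timeline(events, learner_id, task_id):
    """Return one learner's events for one task in chronological order."""
    selected = [
        e for e in events
        if e.learner_id == learner_id and e.task_id == task_id
    ]
    return sorted(selected, key=lambda e: e.timestamp)


def render(events):
    """Format events as a vertical, time-ordered list of text lines."""
    return [
        f"{e.timestamp:7.1f}s  {e.kind.value:<12} {e.payload}"
        for e in events
    ]
```

Calling `render(timeline(events, "P12", 3))` would give one line per action for that learner and task, ordered by time, which is roughly the kind of view a replay interface needs as input.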

Figure 1. An example of using AI-generated code as an example to fix syntax error with writing loops.

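Experimental Results

Research Questions

RQ1: How do novices use and interact with an LLM-based code generator when learning CS1 coding tasks in a self-paced environment? (In terms of when Codex is used, what is asked of Codex, prompt properties, AI-generated code properties, and how the code is used and verified.)
RQ2: What coding approaches do novices use with an AI code generator, and how do these approaches affect learning outcomes?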

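Key Findings

Four coding approaches emerged: AI Single Prompt, AI Step-by-Step, Hybrid, and Manual coding.
AI Single Prompt yielded the highest correctness on code-authoring tasks but the lowest correctness on the subsequent code-modification tasks.
81% of the AI-generated code had no identifiable problems; the remaining 19% had issues such as not following the task requirements or regenerating existing code.
Learners tended to prompt Codex to generate a complete solution, generate a subgoal, or fix existing code, and their prompts were often copied or rephrased from the task description.
Prompt patterns included decomposing the task sentence by sentence and iteratively rewording a prompt to guide generation.
Evidence of both over-reliance and self-regulation suggests a need for curriculum and tool design that promotes effective AI-assisted learning.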
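
To make the four approaches concrete, here is a rough rule-of-thumb sketch based on their definitions in the abstract. It is not the authors' method: the paper assigns approaches through qualitative thematic analysis, while this hypothetical `classify_approach` function uses only two assumed counts per task, namely how many prompts were sent to the generator and how many lines the learner wrote by hand. In particular, it treats every multi-prompt task with no hand-written code as step-by-step, although repeated prompts could also be retries of a single request.

```python
def classify_approach(num_ai_prompts: int, manual_lines: int) -> str:
    """Map two per-task counts to one of the four approach labels.

    Simplified heuristic for illustration only; both inputs are
    assumed to be extracted from a learner's log for a single task.
    """
    if num_ai_prompts < 0 or manual_lines < 0:
        raise ValueError("counts must be non-negative")
    if num_ai_prompts == 0:
        # No generator use: the learner wrote the code themselves.
        return "Manual"
    if manual_lines > 0:
        # Some code written by hand, some generated.
        return "Hybrid"
    if num_ai_prompts == 1:
        # A single prompt produced the entire solution.
        return "AI Single Prompt"
    # Several prompts and no hand-written code: treated as decomposition.
    return "AI Step-by-Step"


if __name__ == "__main__":
    for prompts, lines in [(0, 12), (1, 0), (3, 0), (2, 5)]:
        print(prompts, lines, classify_approach(prompts, lines))
```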

Figure 2. An example of keeping the original code instead of replacing it with AI-generated code ($P_{12}$).


This review was generated by AI and reviewed by a human editor.