QUICK REVIEW

[Paper Review] How Novices Use LLM-Based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment

Majeed Kazemitabaar, Xinying Hou|arXiv (Cornell University)|Sep 25, 2023

Software Engineering Research | Citations: 9

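One-Sentence Summary

This study analyzes how 33 novice Python learners used an OpenAI Codex-based AI code generator while completing 45 CS1 tasks in a self-paced environment, and identifies usage patterns, prompt styles, properties of the AI-generated code, and four coding approaches.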

ABSTRACT

As Large Language Models (LLMs) gain in popularity, it is important to understand how novice programmers use them. We present a thematic analysis of 33 learners, aged 10-17, independently learning Python through 45 code-authoring tasks using Codex, an LLM-based code generator. We explore several questions related to how learners used these code generators and provide an analysis of the properties of the written prompts and the generated code. Specifically, we explore (A) the context in which learners use Codex, (B) what learners are asking from Codex, (C) properties of their prompts in terms of relation to task description, language, and clarity, and prompt crafting patterns, (D) the correctness, complexity, and accuracy of the AI-generated code, and (E) how learners utilize AI-generated code in terms of placement, verification, and manual modifications. Furthermore, our analysis reveals four distinct coding approaches when writing code with an AI code generator: AI Single Prompt, where learners prompted Codex once to generate the entire solution to a task; AI Step-by-Step, where learners divided the problem into parts and used Codex to generate each part; Hybrid, where learners wrote some of the code themselves and used Codex to generate others; and Manual coding, where learners wrote the code themselves. The AI Single Prompt approach resulted in the highest correctness scores on code-authoring tasks, but the lowest correctness scores on subsequent code-modification tasks during training. Our results provide initial insight into how novice learners use AI code generators and the challenges and opportunities associated with integrating them into self-paced learning environments. We conclude with various signs of over-reliance and self-regulation, as well as opportunities for curriculum and tool development.

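Motivation and Objectives

Understand when and why novice learners use an AI code generator on CS1 tasks in a self-paced environment.
Characterize the prompts novices write to interact with Codex and clarify how those prompts relate to the task descriptions.
Analyze the properties of AI-generated code (correctness, complexity, and fit with the curriculum) and how learners integrate it into their work.
Identify common coding approaches used alongside AI generation and evaluate their effect on learning outcomes.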

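Method

The authors conducted a thematic analysis based on log data from 33 novice learners (aged 10-17) working on 45 Python coding tasks with Codex.
Data sources include timestamped logs (code edits, console runs, AI prompts and generated outputs, and task submissions).
A custom log-analysis interface was used to visualize and replay each student's actions on a vertical timeline.
The researchers applied deductive and inductive thematic analysis to code Codex usage by context, prompt attributes, properties of the AI-generated code, and usage patterns.
Inter-rater reliability for applying the codebook reached 0.87 (α) in the initial coding round.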
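The timestamped logs described above could be represented roughly as follows. This is a minimal sketch, not the authors' tooling: the `LogEvent` fields, the event kind names, and the `build_timelines` helper are all hypothetical, chosen only to mirror the four logged activities listed in this section.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event kinds mirroring the logged activities named in the review:
# code edits, console runs, AI prompts (with outputs), and task submissions.
EVENT_KINDS = {"code_edit", "console_run", "ai_prompt", "submission"}


@dataclass(frozen=True)
class LogEvent:
    learner_id: str      # e.g. "P12" (assumed identifier format)
    task_id: int
    timestamp: float     # seconds since session start (assumed unit)
    kind: str            # one of EVENT_KINDS
    payload: str = ""    # prompt text, code snapshot, console output, etc.


def build_timelines(events):
    """Group events by (learner, task) and sort each group chronologically."""
    timelines = defaultdict(list)
    for ev in events:
        if ev.kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {ev.kind}")
        timelines[(ev.learner_id, ev.task_id)].append(ev)
    for key in timelines:
        timelines[key].sort(key=lambda ev: ev.timestamp)
    return dict(timelines)
```

Each resulting list is one learner's chronological record for one task, which is the kind of sequence a replay interface would step through.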

Figure 1. An example of using AI-generated code as a reference to fix a syntax error when writing loops.

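Results

Research Questions

RQ1: How do novices use and interact with LLM-based code generators when learning CS1 coding tasks in a self-paced environment? (In terms of when they use Codex, what they ask of it, prompt attributes, properties of the AI-generated code, and how they use and verify that code.)
RQ2: What coding approaches do novices adopt when using an AI code generator, and how do these approaches affect learning outcomes?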

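Key Findings

Four coding approaches emerged: AI Single Prompt, AI Step-by-Step, Hybrid, and Manual coding.
AI Single Prompt yielded the highest correctness on code-authoring tasks but the lowest on the subsequent code-modification tasks.
81% of AI-generated code had no identifiable issues; the remaining 19% had issues such as not following the task requirements or regenerating existing code.
Learners often prompted Codex to generate an entire solution, a subgoal, or a fix to existing code, and frequently copied or paraphrased the task description into their prompts.
Prompt patterns included breaking the task down sentence by sentence and iteratively rephrasing prompts to steer generation.
Evidence of both over-reliance and self-regulation suggests a need for curriculum and tool design that fosters effective AI-assisted learning.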
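The four approaches can be illustrated with a deliberately simplified heuristic. This is not how the authors assigned labels (they used qualitative thematic coding of the logs); the function name `classify_approach` and its two inputs are assumptions made for illustration. In particular, treating every multi-prompt, AI-only session as AI Step-by-Step ignores the case where a learner simply re-prompts for the whole solution.

```python
def classify_approach(num_ai_prompts: int, wrote_code_manually: bool) -> str:
    """Map a task attempt to one of the four approach names from the paper.

    Hypothetical heuristic:
      - no AI prompts at all                   -> "Manual"
      - AI prompts plus learner-written code   -> "Hybrid"
      - exactly one AI prompt, no manual code  -> "AI Single Prompt"
      - several AI prompts, no manual code     -> "AI Step-by-Step"
    """
    if num_ai_prompts < 0:
        raise ValueError("num_ai_prompts must be non-negative")
    if num_ai_prompts == 0:
        return "Manual"
    if wrote_code_manually:
        return "Hybrid"
    if num_ai_prompts == 1:
        return "AI Single Prompt"
    return "AI Step-by-Step"
```

A real classifier would need to inspect prompt content and code placement, since the distinction between the approaches depends on what each prompt asked for, not only on how many prompts there were.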

Figure 2. An example of keeping the original code instead of replacing it with AI-generated code ($P_{12}$).


This review was generated by AI and checked by a human editor.