QUICK REVIEW

[Paper Review] Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs

Zhihong Sun, Chen Lyu|arXiv (Cornell University)|Mar 20, 2024

Natural Language Processing Techniques | Cited by 5

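One-line summary

This paper introduces CodePLAN, a multi-task distillation framework that trains small code generation models to generate and use high-quality solution plans distilled from LLMs. It improves pass@1 on APPS by more than 130% over standard fine-tuning.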

ABSTRACT

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate programming challenges, thereby improving its performance in code generation. Nevertheless, smaller models have been struggling to keep up with LLMs in deducing these plans, adversely affecting their code generation capabilities. Given the considerable size and associated deployment costs, along with concerns about data security, many teams opt for deploying smaller models for code generation. Consequently, there arises a compelling need for transferring LLMs' code generation reasoning abilities to the smaller models. In this paper, we propose the CodePLAN framework, which aims to transfer LLMs' reasoning capabilities to smaller models through distillation. We adopt a multi-task learning approach, jointly undertaking code generation and solution plan generation tasks, to enhance the code generation capabilities of the smaller model. To ensure the superior quality of the solution plans, we advocate for the utilization of backward reasoning and plan sampling strategies. Our experiments show that in comparison to the conventional fine-tuning approach, our approach improves the smaller model's code generation performance (measured in pass@1 metric) by over 130% on the challenging APPS benchmark.

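Motivation and objectives

Enable smaller models to reach LLM-like code generation performance without the deployment costs and data security concerns that come with LLMs.
Develop a multi-task framework that distills LLM reasoning and trains code generation and plan generation jointly.
Improve plan quality through backward reasoning and plan sampling, so that plans can guide both training and inference.
Investigate how plan quality and sampling affect downstream code generation performance.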

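Proposed method

Proposes CodePLAN, a multi-task framework in which the base model learns both code generation and solution plan generation.
Uses an LLM as the teacher that supplies high-quality solution plans, and trains with a joint loss that optimizes both code generation and plan generation (L = (1-λ)L_code + λL_plan, λ=0.5).
Introduces backward reasoning, which extracts programmer-like solution plans from ground-truth solutions and yields better plans than forward planning from the problem description alone.
At inference time, applies plan sampling: generates multiple candidate plans, evaluates them with unit tests, and selects the best plan to guide code generation.
Adds a plan head to the base model so that it can output plans, which enables alternating multi-task fine-tuning between the code task and the plan task.
Evaluates on the APPS and MBPP datasets with the pass@k metric, comparing against standard fine-tuning, CoT, and RL-based baselines.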
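The joint objective listed above, L = (1-λ)L_code + λL_plan, can be illustrated with a minimal Python sketch. The function name `joint_loss` and the use of plain floats for the two task losses are assumptions made for illustration; in an actual training loop these would be per-batch loss values from the code and plan outputs, and the paper's implementation may differ.

```python
def joint_loss(code_loss: float, plan_loss: float, lam: float = 0.5) -> float:
    """Weighted sum of the two task losses: (1 - lam) * L_code + lam * L_plan.

    `lam` corresponds to λ in the review (0.5 there). The function itself is a
    hypothetical helper, not taken from the paper's code.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * code_loss + lam * plan_loss


# With lam = 0.5, both tasks contribute equally to the total.
total = joint_loss(code_loss=2.0, plan_loss=4.0)
```

Setting `lam` to 0 reduces the objective to plain code-generation fine-tuning, which makes the weighting a convenient knob for checking how much the plan task contributes.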

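Experimental results

Research questions

RQ1: Does distilling LLM solution plans into a small model improve code generation performance?
RQ2: Does backward reasoning from solutions yield a higher-quality plan signal than forward planning from problem descriptions?
RQ3: How does plan sampling at inference time affect the quality and reliability of the generated code?
RQ4: How much does CodePLAN improve over standard fine-tuning across different datasets and problem difficulty levels?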

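Key findings

CodePLAN improves pass@1 on APPS by more than 130% compared with standard fine-tuning.
Plan-centric distillation yields consistent gains over standard fine-tuning and CoT-based approaches on both APPS and MBPP.
Backward reasoning produces higher-quality plans than plans derived directly from the problem description, which improves downstream code quality.
Plan sampling (evaluating multiple plans with unit tests) substantially increases code correctness, and the gain grows as more plans are sampled beyond N=1.
CodePLAN is complementary to ranking-based post-processing such as CodeRanker, and combining them yields a large overall gain in code correctness.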
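The plan sampling step described in this review (sample several plans, score them with unit tests, keep the best) could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_plan`, `generate_code`, and `count_passed` are hypothetical names, and scoring a plan by how many unit tests its generated code passes is one plausible reading of "evaluates them with unit tests".

```python
from typing import Callable, List, Optional, Tuple


def select_plan(
    plans: List[str],
    generate_code: Callable[[str], str],
    count_passed: Callable[[str], int],
) -> Tuple[Optional[str], Optional[str], int]:
    """Return (plan, code, passed) for the candidate whose code passes most tests.

    `generate_code` maps a plan to candidate code and `count_passed` runs the
    unit tests on that code; both are hypothetical stand-ins for the model and
    the test harness. Ties keep the earliest plan.
    """
    best: Tuple[Optional[str], Optional[str], int] = (None, None, -1)
    for plan in plans:
        code = generate_code(plan)
        passed = count_passed(code)
        if passed > best[2]:
            best = (plan, code, passed)
    return best


# Toy stand-ins: "code" is the upper-cased plan, and the score is its length,
# except for one candidate that fails every test.
toy_plans = ["a", "bb", "ccc"]
best_plan, best_code, best_passed = select_plan(
    toy_plans,
    generate_code=lambda plan: plan.upper(),
    count_passed=lambda code: 0 if code == "CCC" else len(code),
)
```

Sampling N plans costs N generation passes plus N test runs, which is the trade-off behind the finding that correctness improves as N grows beyond 1.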


This review was generated by AI and checked by a human editor.