QUICK REVIEW

[Paper Review] How critically can an AI think? A framework for evaluating the quality of thinking of generative artificial intelligence

Luke Zaphir, Jason M. Lodge|arXiv (Cornell University)|Jun 20, 2024

Explainable Artificial Intelligence (XAI) | Citations: 5

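One-Sentence Summary

This paper proposes the MAGE framework for evaluating how vulnerable assessments are to generative AI, and offers guidance on discipline-specific assessment design for evaluating and improving critical thinking tasks.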

ABSTRACT

Generative AI such as those with large language models have created opportunities for innovative assessment design practices. Due to recent technological developments, there is a need to know the limits and capabilities of generative AI in terms of simulating cognitive skills. Assessing student critical thinking skills has been a feature of assessment for time immemorial, but the demands of digital assessment create unique challenges for equity, academic integrity and assessment authorship. Educators need a framework for determining their assessments vulnerability to generative AI to inform assessment design practices. This paper presents a framework that explores the capabilities of the LLM ChatGPT4 application, which is the current industry benchmark. This paper presents the Mapping of questions, AI vulnerability testing, Grading, Evaluation (MAGE) framework to methodically critique their assessments within their own disciplinary contexts. This critique will provide specific and targeted indications of their questions vulnerabilities in terms of the critical thinking skills. This can go on to form the basis of assessment design for their tasks.

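Research Motivation and Objectives

Establish the need to evaluate the capabilities and limits of generative AI in simulating cognitive skills.
Provide a framework for methodically critiquing assessments within their disciplinary contexts.
Provide actionable guidance for designing assessments that are robust to AI-generated answers.
Address concerns about equity, academic integrity, and authorship in digital assessment.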

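Proposed Method

Proposes the MAGE framework: Mapping of questions, AI vulnerability testing, Grading, Evaluation.
Uses ChatGPT-4, described as the current industry benchmark, to examine AI capabilities on critical thinking tasks.
Outlines a procedure for mapping questions to potential AI vulnerabilities, then grading and evaluating the responses.
Provides guidance tailored to disciplinary contexts for interpreting AI vulnerability findings.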
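
The four MAGE steps listed above can be sketched as a small pipeline. This is a minimal illustration only, not the authors' implementation: the `Question` class, the `run_mage` function, the `ask_ai` and `grade` callables, and the 0.7 vulnerability threshold are all hypothetical names and values introduced here. In practice `ask_ai` would wrap a model such as ChatGPT-4 and `grade` would be a human marker applying a discipline-specific rubric.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Question:
    text: str
    skill: str  # Mapping step: the critical thinking skill this question targets


def run_mage(
    questions: List[Question],
    ask_ai: Callable[[str], str],            # hypothetical: returns the AI's answer
    grade: Callable[[Question, str], float],  # hypothetical: rubric score in [0, 1]
    threshold: float = 0.7,                   # assumed cut-off, not from the paper
) -> Dict[str, dict]:
    """Return, per skill, the mean AI score and whether it meets the threshold."""
    scores_by_skill = defaultdict(list)
    for q in questions:
        answer = ask_ai(q.text)                           # AI vulnerability testing
        scores_by_skill[q.skill].append(grade(q, answer))  # Grading

    # Evaluation: a skill is flagged as vulnerable when the AI scores well on it.
    report = {}
    for skill, scores in scores_by_skill.items():
        mean = sum(scores) / len(scores)
        report[skill] = {"mean_score": mean, "vulnerable": mean >= threshold}
    return report
```

Called with a list of mapped questions, a stubbed `ask_ai`, and a rubric function, `run_mage` returns one entry per skill, which an educator could use to decide which questions to redesign.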

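Experimental Results

Research Questions

RQ1: How can assessments be methodically critiqued for their vulnerability to generative AI such as ChatGPT-4?
RQ2: Which indicators show the quality of an AI's critical thinking within disciplinary tasks?
RQ3: How can the MAGE framework inform assessment design that reduces AI vulnerability?
RQ4: What considerations do equity, integrity, and authorship raise in digital assessment contexts?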

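Key Findings

The paper offers the MAGE framework as a method for evaluating AI vulnerability.
The framework gives specific, targeted indications of a question's vulnerabilities in terms of critical thinking skills.
The approach supports improved assessment design within disciplinary contexts.
The framework highlights concerns about equity, academic integrity, and authorship in digital assessment.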

This review was generated by AI and checked by a human editor.