QUICK REVIEW

[Paper Review] Is ChatGPT Transforming Academics' Writing Style?

Mingmeng Geng, Roberto Trotta|arXiv (Cornell University)|Apr 12, 2024

Artificial Intelligence in Healthcare and Education|Cited by 5

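One-line summary

This paper analyzes one million arXiv abstracts to detect ChatGPT-influenced writing style, using changes in word frequency to estimate ChatGPT's impact by category and time period. Computer science shows the strongest impact (approximately 35% under a simple-prompt baseline).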

ABSTRACT

Based on one million arXiv papers submitted from May 2018 to January 2024, we assess the textual density of ChatGPT's writing style in their abstracts through a statistical analysis of word frequency changes. Our model is calibrated and validated on a mixture of real abstracts and ChatGPT-modified abstracts (simulated data) after a careful noise analysis. The words used for estimation are not fixed but adaptive, including those with decreasing frequency. We find that large language models (LLMs), represented by ChatGPT, are having an increasing impact on arXiv abstracts, especially in the field of computer science, where the fraction of LLM-style abstracts is estimated to be approximately 35%, if we take the responses of GPT-3.5 to one simple prompt, "revise the following sentences", as a baseline. We conclude with an analysis of both positive and negative aspects of the penetration of LLMs into academics' writing style.

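Motivation and objectives

Ask whether ChatGPT is influencing academic writing style in arXiv abstracts, and quantify that influence.
Develop a statistical framework that detects ChatGPT-like word-frequency fingerprints over time.
Calibrate and validate the method on real abstracts and ChatGPT-modified (simulated) abstracts.
Estimate the density of ChatGPT-influenced text across fields and over time.
Discuss the implications, benefits, and risks of ChatGPT's penetration into academic writing.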

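Proposed method

Define a metric R_i that measures the change in word frequency (Eq. 1).
Estimate word change rates from a ChatGPT-driven simulation in which real abstracts are polished with a simple prompt (Eq. 2).
Model ChatGPT's impact with a term η_j(t) representing the fraction of affected abstracts (Eq. 5).
Incorporate noise through δ_ij and construct a bias-aware loss L_j,t(η_j) for estimating η_j (Eqs. 18–23).
Calibrate the word set I_j under varying prompts and mixing ratios, and verify robustness (Eqs. 35–37).
Calibrate f*_ij(t) on the pre-ChatGPT period and validate with GPT-3.5-driven simulations (Section 4).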
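
The estimation idea in the steps above can be sketched in a few lines. This is a minimal illustration and not the paper's estimator: it assumes that a fraction eta of abstracts has been processed, that processing multiplies the expected frequency of word i by (1 + r_i), and that eta is recovered by ordinary least squares. The names `estimate_eta`, `f_star`, `r`, and `f_obs` are hypothetical, the data are synthetic, and the noise term δ_ij and the bias handling of Eqs. 18–23 are left out.

```python
import numpy as np

def estimate_eta(f_star, r, f_obs):
    """Least-squares estimate of the processed fraction eta.

    Assumed model (a simplification, not the paper's loss):
        f_obs[i] = f_star[i] * (1 + eta * r[i])
    """
    f_star = np.asarray(f_star, dtype=float)
    r = np.asarray(r, dtype=float)
    f_obs = np.asarray(f_obs, dtype=float)
    x = f_star * r        # predicted frequency shift per unit of eta
    y = f_obs - f_star    # observed frequency shift
    eta = float(x @ y / (x @ x))
    return min(max(eta, 0.0), 1.0)  # a fraction must lie in [0, 1]

# Synthetic check: build observations with a known eta and recover it.
rng = np.random.default_rng(0)
f_star = rng.uniform(0.01, 2.0, size=200)  # made-up baseline frequencies
r = rng.uniform(-0.3, 1.0, size=200)       # made-up per-word change rates
true_eta = 0.35
f_obs = f_star * (1 + true_eta * r)
eta_hat = estimate_eta(f_star, r, f_obs)
print(f"recovered eta = {eta_hat:.3f}")
```

Words with negative r_i (frequency falls after processing) contribute to the fit just as words with positive r_i do, which is consistent with the abstract's remark that words with decreasing frequency are also used for estimation.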

Figure 1: The 12 words with the highest change rate $R_{i}$ and satisfying $\max_{t}(f_{i}(t))>500$. The vertical red dashed line demarcates the first time period after ChatGPT’s release.
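
The selection described in the Figure 1 caption can be illustrated as follows. This is a sketch under assumptions: this review does not reproduce Eq. 1, so the change rate below is a stand-in, defined as the relative change between the mean count after and before a cutoff period. The names `top_changed_words`, `counts`, and `cutoff` are hypothetical, and the counts are made up.

```python
def top_changed_words(counts, cutoff, min_peak=500, k=12):
    """Rank words by a stand-in change rate, keeping only frequent words.

    counts: dict mapping word -> list of per-period counts f_i(t)
    cutoff: index of the first period after the release date
            (assumed to satisfy 0 < cutoff < number of periods)
    """
    ranked = []
    for word, series in counts.items():
        if max(series) <= min_peak:   # keep only max_t f_i(t) > min_peak
            continue
        before = sum(series[:cutoff]) / cutoff
        after = sum(series[cutoff:]) / (len(series) - cutoff)
        if before == 0:
            continue                  # stand-in rate is undefined here
        ranked.append((word, (after - before) / before))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Made-up counts for three words over five periods.
counts = {
    "significant": [600, 620, 610, 900, 1200],
    "is": [9000, 9100, 8900, 8000, 7600],
    "rare": [3, 4, 2, 9, 12],
}
result = top_changed_words(counts, cutoff=3)
for word, rate in result:
    print(f"{word}: {rate:+.1%}")
```

The frequency threshold matters because rare words can show very large relative changes from a handful of occurrences; in this toy data, "rare" is dropped for that reason.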

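Experimental results

Research questions

RQ1: Can statistical signatures in word frequency reveal ChatGPT's influence on arXiv abstracts?
RQ2: How does ChatGPT affect word usage across fields and over time?
RQ3: What is the estimated density of ChatGPT-style writing, particularly in CS?
RQ4: How robust are the estimates to different prompts and calibration choices?
RQ5: What limitations and potential biases arise when measuring ChatGPT's impact through word frequency?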

Key findings

Word	Category	Before	After	Change
is	cs	2.01	1.73	-14%
is	math	1.78	1.61	-9%
is	astro	2.13	1.90	-11%
is	cond-mat	2.00	1.68	-16%
are	cs	1.00	0.83	-17%
are	math	0.74	0.71	-5%
are	astro	1.39	1.25	-10%
are	cond-mat	0.92	0.80	-13%
significant	cs	0.09	0.18	99%
significant	math	0.01	0.03	308%
significant	astro	0.17	0.26	53%
significant	cond-mat	0.07	0.18	171%
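
The last column of the table above appears to be the relative change between the two frequency columns. The snippet below is a quick way to sanity-check rows, assuming the change is computed as (after - before) / before; the helper name `percent_change` is hypothetical. Because the two frequency columns are shown rounded to two decimals, recomputed values can differ from the printed ones when frequencies are small (for example, 0.01 to 0.03).

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# A few rows copied from the table: (word, category, before, after).
rows = [
    ("is", "cs", 2.01, 1.73),
    ("is", "cond-mat", 2.00, 1.68),
    ("are", "cs", 1.00, 0.83),
    ("significant", "astro", 0.17, 0.26),
]
changes = {(w, c): round(percent_change(b, a)) for w, c, b, a in rows}
for (word, category), change in changes.items():
    print(f"{word:12s} {category:9s} {change:+d}%")
```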

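ChatGPT-style text is detectable in arXiv abstracts after the release, with the strongest uptake in computer science.
In CS, the fraction of ChatGPT-style abstracts is estimated at approximately 35%, using the responses to one simple prompt (“revise the following sentences”) as the baseline.
Word frequency changes reflect both topical trends (COVID-19, LLMs, AI) and non-topical shifts (e.g., the function words “are” and “is”).
Words such as “significant” increase substantially under simulated ChatGPT processing across several categories (CS, mathematics, astrophysics, condensed matter physics).
A transparent, calibration-based frequency analysis can quantify ChatGPT's impact without relying on black-box detectors.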

Figure 2: Examples of words with rapidly growing frequency in arXiv abstracts.

This review was generated by AI and checked by a human editor.