QUICK REVIEW

[Paper Review] A Survey on Large Language Model Impact on Software Evolvability and Maintainability: the Good, the Bad, the Ugly, and the Remedy

Bruno Claudino Matias, Savio Freire|arXiv (Cornell University)|Jan 26, 2026

Software Engineering Research|Citations: 0

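One-Line Summary

Summary: This paper systematically examines how LLMs affect software maintainability and evolvability, and uses a structured literature review to organize their benefits, risks, weaknesses, and mitigation strategies.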

ABSTRACT

Context. Large Language Models (LLMs) are increasingly embedded in software engineering workflows for tasks including code generation, summarization, repair, and testing. Empirical studies report productivity gains, improved comprehension, and reduced cognitive load. However, evidence remains fragmented, and concerns persist about hallucinations, unstable outputs, methodological limitations, and emerging forms of technical debt. How these mixed effects shape long-term software maintainability and evolvability remains unclear. Objectives. This study systematically examines how LLMs influence the maintainability and evolvability of software systems. We identify which quality attributes are addressed in existing research, the positive impacts LLMs provide, the risks and weaknesses they introduce, and the mitigation strategies proposed in the literature. Method. We conducted a systematic literature review. Searches across ACM DL, IEEE Xplore, and Scopus (2020 to 2024) yielded 87 primary studies. Qualitative evidence was extracted through a calibrated multi-researcher process. Attributes were analyzed descriptively, while impacts, risks, weaknesses, and mitigation strategies were synthesized using a hybrid thematic approach supported by an LLM-assisted analysis tool with human-in-the-loop validation. Results. LLMs provide benefits such as improved analyzability, testability, code comprehension, debugging support, and automated repair. However, they also introduce risks, including hallucinated or incorrect outputs, brittleness to context, limited domain reasoning, unstable performance, and flaws in current evaluations, which threaten long-term evolvability. Conclusion. LLMs can strengthen maintainability and evolvability, but they also pose nontrivial risks to long-term sustainability. Responsible adoption requires safeguards, rigorous evaluation, and structured human oversight.

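Research Motivation and Objectives

Identify which maintainability and evolvability attributes are affected by the use of LLMs in software engineering tasks.
Summarize the positive impacts of LLMs on software sustainability (the Good).
Identify the weaknesses and risks that LLMs pose to maintainability and evolvability (the Bad and the Ugly).
Propose mitigation strategies that address the identified weaknesses.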

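Proposed Method

Conducted a systematic literature review covering studies from 2020 through the end of 2024, searching across ACM DL, IEEE Xplore, and Scopus.
Screened 711 studies and selected 87 primary studies for in-depth analysis.
Applied a calibrated multi-researcher data extraction process with independent coding and adjudication.
Used a hybrid qualitative synthesis that combines human-in-the-loop validation with an LLM-assisted analysis tool (ThemeCrafter).
Grounded the qualitative coding in ISO/IEC 9126 and the evolvability attributes of Breivold et al., retaining quotations for traceability.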
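
The independent-coding-and-adjudication step above can be illustrated with a minimal sketch. This is not the authors' tooling: the `Excerpt` type, the `reconcile` and `screening_yield` helpers, and the example code labels are all hypothetical, and the sketch assumes both researchers coded the same set of excerpts. It only shows the general idea of keeping agreed codes, routing disagreements to an adjudicator, and preserving the quotation for traceability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Excerpt:
    """A quotation retained from a primary study (hypothetical structure)."""
    study_id: str
    quote: str


def reconcile(codes_a, codes_b):
    """Split excerpts into agreed codes and disagreements needing adjudication.

    Assumes both coders labeled the same excerpts; an excerpt missing from
    codes_b is treated as a disagreement.
    """
    agreed = {}
    disputed = []
    for excerpt, code_a in codes_a.items():
        code_b = codes_b.get(excerpt)
        if code_a == code_b:
            agreed[excerpt] = code_a
        else:
            disputed.append((excerpt, code_a, code_b))
    return agreed, disputed


def screening_yield(selected, screened):
    """Fraction of screened studies that were kept as primary studies."""
    return selected / screened


# Illustrative data only; the labels are made up for this sketch.
e1 = Excerpt("S01", "The generated tests improved coverage.")
e2 = Excerpt("S02", "Outputs varied between identical runs.")
coder_a = {e1: "testability", e2: "stability"}
coder_b = {e1: "testability", e2: "analyzability"}

agreed, disputed = reconcile(coder_a, coder_b)
selection_rate = screening_yield(87, 711)  # figures reported in the review
```

With the figures reported here, 87 of 711 screened studies corresponds to a selection rate of roughly 12%.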

Results

Research Questions

RQ1: Which maintainability and evolvability attributes does the use of LLMs affect in software engineering tasks?
RQ2: What positive impacts do LLMs have on software evolvability and maintainability?
RQ3: What weaknesses do LLMs exhibit with respect to evolvability and maintainability?
RQ4: What weaknesses exist in LLMs with respect to software evolvability and maintainability?
RQ5: How can the weaknesses of LLMs be mitigated to better support evolvability and maintainability?

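Key Findings

LLMs can improve analyzability, testability, code comprehension, debugging support, and the effectiveness of automated repair.
LLMs introduce risks such as hallucinations, brittleness to context, limited domain-specific reasoning, unstable performance, and flawed evaluations.
Current evidence points to structural weaknesses that threaten long-term evolvability and sustainability.
Mitigation strategies include hybrid pipelines, human-in-the-loop validation, prompt engineering, and guardrails.
The study provides a mapping between LLM-assisted SE activities and the related maintainability and evolvability attributes.
It presents an overview of proposed mitigation strategies that address the identified risks and weaknesses.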
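
As a rough illustration of how a guardrail and a human-in-the-loop check might be combined, the sketch below gates an LLM-generated Python snippet behind an automated syntax check and then a reviewer decision. It is an assumption-laden example, not a pipeline described in the paper: `passes_guardrails`, `review_pipeline`, and the status strings are hypothetical names, and a real guardrail would check far more than syntax.

```python
import ast
from typing import Callable


def passes_guardrails(source: str) -> bool:
    """Automated guardrail (hypothetical): accept only code that parses as Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def review_pipeline(candidate: str, human_approves: Callable[[str], bool]) -> str:
    """Run the automated check first, then ask a human reviewer to approve.

    human_approves is a stand-in for a real review step, such as a code
    review decision; here it is any function returning True or False.
    """
    if not passes_guardrails(candidate):
        return "rejected_by_guardrail"
    if not human_approves(candidate):
        return "rejected_by_reviewer"
    return "accepted"
```

Running the cheap automated check before the human step means reviewers only see candidates that clear the basic bar, which is one plausible reading of "structured human oversight" in the abstract.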

This review was written by AI and checked by a human editor.