QUICK REVIEW

[Paper Review] Claim Automation using Large Language Model

Zhengda Mo, Zhiyu Quan|arXiv (Cornell University)|Feb 18, 2026

Artificial Intelligence in Healthcare and Education | Citations: 0

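One-Line Summary

Summary: This paper fine-tunes a locally deployed, governance-aware LLM pipeline with LoRA to generate structured corrective-action recommendations from unstructured warranty claim narratives. The domain-adapted model outperforms commercial general-purpose and prompt-based LLMs.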

ABSTRACT

While Large Language Models (LLMs) have achieved strong performance on general-purpose language tasks, their deployment in regulated and data-sensitive domains, including insurance, remains limited. Leveraging millions of historical warranty claims, we propose a locally deployed governance-aware language modeling component that generates structured corrective-action recommendations from unstructured claim narratives. We fine-tune pretrained LLMs using Low-Rank Adaptation (LoRA), scoping the model to an initial decision module within the claim processing pipeline to speed up claim adjusters' decisions. We assess this module using a multi-dimensional evaluation framework that combines automated semantic similarity metrics with human evaluation, enabling a rigorous examination of both practical utility and predictive accuracy. Our results show that domain-specific fine-tuning substantially outperforms commercial general-purpose and prompt-based LLMs, with approximately 80% of the evaluated cases achieving near-identical matches to ground-truth corrective actions. Overall, this study provides both theoretical and empirical evidence to prove that domain-adaptive fine-tuning can align model output distributions more closely with real-world operational data, demonstrating its promise as a reliable and governable building block for insurance applications.

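Motivation and Objectives

Convert unstructured claim narratives into actionable, structured corrective-action outputs within actuarial workflows.
Develop a governance-aware, locally deployed LLM framework that addresses data sensitivity and regulatory constraints.
Show that domain-specific fine-tuning reshapes the output distribution to match real-world claim-handling practice.
Provide a multi-dimensional evaluation framework that combines automated semantic metrics with human evaluation.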

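Proposed Method

Uses a decoder-only Transformer (DeepSeek-R1-Distill-Llama-8B) deployed on-premises to ensure data governance.
Fine-tunes with LoRA by inserting low-rank adapters into selected projections of the Transformer blocks.
Trains with a masked autoregressive objective that optimizes only the corrective-action segment (complaint and cause as input, correction as output).
Applies Rotary Position Embedding (RoPE) for positional information inside the attention mechanism.
Adopts multi-stage RMSNorm normalization and SwiGLU activation within a Pre-Norm Transformer framework.
Evaluates with a multi-dimensional framework that combines semantic similarity metrics and structured-output validation with a human-in-the-loop feedback process.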
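
The masked autoregressive objective in the method list above can be illustrated with a small sketch. It is not the authors' training code: the function names, the token IDs, and the `IGNORE_INDEX` value of -100 (borrowed from a common deep-learning convention) are assumptions for illustration, and the usual one-position shift between inputs and next-token targets is omitted for brevity. The idea is that prompt tokens (complaint and cause) stay in the input but contribute nothing to the loss, so only the corrective-action tokens are optimized.

```python
IGNORE_INDEX = -100  # assumed sentinel marking positions excluded from the loss


def build_labels(prompt_ids, target_ids):
    """Concatenate prompt and target tokens and mask the prompt out of the loss.

    prompt_ids: token IDs of the complaint/cause text (hypothetical values).
    target_ids: token IDs of the corrective-action text (hypothetical values).
    """
    input_ids = list(prompt_ids) + list(target_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(target_ids)
    return input_ids, labels


def masked_nll(token_log_probs, labels):
    """Mean negative log-likelihood over unmasked positions only.

    token_log_probs[i] is the log-probability the model assigned to the
    correct token at position i; here the caller supplies made-up values.
    """
    if len(token_log_probs) != len(labels):
        raise ValueError("token_log_probs and labels must have equal length")
    kept = [lp for lp, lab in zip(token_log_probs, labels) if lab != IGNORE_INDEX]
    if not kept:
        raise ValueError("no unmasked positions to score")
    return -sum(kept) / len(kept)
```

In this sketch the log-probabilities at prompt positions have no effect on the loss, which is the property the masked objective relies on.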

Figure 1 : Overview of the token-level generation architecture used for claim automation.

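Experimental Results

Research Questions

RQ1: Can a domain-adapted, locally deployed LLM generate structured corrective-action outputs that are consistent with real-world claim-handling practice?
RQ2: Does LoRA fine-tuning on domain-specific warranty data outperform general-purpose LLMs in output format, semantics, and stability?
RQ3: How does modularizing the intermediate task (corrective-action output) affect governance, transparency, and auditability in the claim workflow?
RQ4: Which evaluation framework best captures the practical utility and predictive accuracy of language-based claim automation?
RQ5: To what extent does domain alignment affect the output distribution relative to the observed claim process?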

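Key Findings

Domain-specific fine-tuning substantially outperforms commercial general-purpose and prompt-based LLMs.
Approximately 80% of the evaluated cases achieve near-identical matches to the ground-truth corrective actions.
Local deployment under governance constraints reduces data-privacy and regulatory risk while improving reproducibility and auditability.
Applying LoRA to the DeepSeek-R1 model reshapes its output distribution to align with real-world claim-handling practice.
The multi-dimensional evaluation approach effectively assesses the structural validity, semantic consistency, and distributional consistency of the outputs.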
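
To make the "share of near-identical matches" finding concrete, here is a minimal sketch of how such a share could be computed. It is an illustration only: the review does not specify which similarity metric or threshold the authors used, so the character-level ratio from Python's `difflib` stands in for a real semantic similarity metric, and the function names and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(prediction, reference):
    """Character-level similarity ratio in [0, 1].

    A simple stand-in for a semantic similarity metric; case and
    surrounding whitespace are normalized before comparison.
    """
    a = prediction.lower().strip()
    b = reference.lower().strip()
    return SequenceMatcher(None, a, b).ratio()


def near_identical_share(predictions, references, threshold=0.9):
    """Fraction of cases whose similarity meets the (assumed) threshold."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        raise ValueError("at least one case is required")
    hits = sum(
        similarity(p, r) >= threshold for p, r in zip(predictions, references)
    )
    return hits / len(predictions)
```

A surface-level ratio like this would miss paraphrases that mean the same thing, which is one reason an evaluation of this kind pairs automated metrics with human review.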

Figure 2 : LoRA adaptation applies to a single projection matrix. The original weight matrix $W_{\mathrm{frozen}}\in\mathbb{R}^{d_{\mathrm{out}}\times d_{\mathrm{in}}}$ remains unchanged, while trainable matrices $A\in\mathbb{R}^{r\times d_{\mathrm{in}}}$ and $B\in\mathbb{R}^{d_{\mathrm{out}}\times
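
The LoRA update on a single projection matrix can be sketched in a few lines of dependency-free Python. This follows the standard LoRA formulation, in which the effective weight is the frozen matrix plus the scaled product of B (shape d_out x r) and A (shape r x d_in). The `alpha / r` scaling and the zero initialization of B come from the original LoRA formulation and are assumed here, since this review does not state the paper's hyperparameters. All function names are hypothetical.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    if len(X[0]) != len(Y):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def zero_init_B(d_out, r):
    """B starts at zero, so the adapted weight initially equals W_frozen."""
    return [[0.0] * r for _ in range(d_out)]


def lora_effective_weight(W_frozen, A, B, alpha):
    """Return W_frozen + (alpha / r) * (B @ A) without modifying W_frozen.

    W_frozen: d_out x d_in, never updated during fine-tuning.
    A: r x d_in and B: d_out x r are the only trainable matrices.
    """
    r = len(A)  # rank = number of rows of A
    if any(len(row) != r for row in B):
        raise ValueError("B must have r columns")
    delta = matmul(B, A)  # low-rank update, shape d_out x d_in
    scale = alpha / r
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W_frozen, delta)
    ]
```

Because only A and B are trained, the number of trainable parameters per adapted projection is r * (d_in + d_out) rather than d_in * d_out, which is what makes fine-tuning an 8B-parameter model on local hardware practical.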


This review was generated by AI and checked by a human editor.