QUICK REVIEW

[Paper Review] SoK: Memorization in General-Purpose Large Language Models

Valentin N. Hartmann, Anshuman Suri|arXiv (Cornell University)|Oct 24, 2023

Topic Modeling|Cited by 9

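One-Line Summary

This survey presents a taxonomy of the types of memorization in large language models, analyzes their implications for performance, privacy, security, copyright, and auditing, and discusses detection and mitigation strategies across verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals.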

ABSTRACT

Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development. Unlike most earlier machine learning models, they are no longer built for one specific application but are designed to excel in a wide range of tasks. A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to memorize large amounts of information contained in the training data. This memorization goes beyond mere language, and encompasses information only present in a few documents. This is often desirable since it is necessary for performing tasks such as question answering, and therefore an important part of learning, but also brings a whole array of issues, from privacy and security to copyright and beyond. LLMs can memorize short secrets in the training data, but can also memorize concepts like facts or writing styles that can be expressed in text in many different ways. We propose a taxonomy for memorization in LLMs that covers verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals. We describe the implications of each type of memorization - both positive and negative - for model performance, privacy, security and confidentiality, copyright, and auditing, and ways to detect and prevent memorization. We further highlight the challenges that arise from the predominant way of defining memorization with respect to model behavior instead of model weights, due to LLM-specific phenomena such as reasoning capabilities or differences between decoding algorithms. Throughout the paper, we describe potential risks and opportunities arising from memorization in LLMs that we hope will motivate new research directions.

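Motivation and Objectives

Provide a comprehensive taxonomy of memorization in large language models that spans multiple types of information.
Discuss how memorization affects model performance, privacy, security, copyright, and auditing.
Identify the challenges of defining and measuring memorization, and outline approaches to detection and mitigation.
Highlight open problems and research directions for advancing the understanding and governance of memorization in LLMs.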

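Proposed Method

Proposes a taxonomy of memorization types in LLMs covering verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals.
Reviews and synthesizes literature from LLM research, machine learning, privacy, security, and law to map how memorization relates to performance, privacy, security, copyright, and auditing.
Discusses the definition, detection methods, and countermeasures for each memorization type.
Contrasts memorization with hallucination and reasoning, clarifying whether a given output stems from memorized content or from generalization.
Highlights measurement challenges such as inference attacks and distribution inference, and shows how they affect memorization research.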
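
One common way to operationalize detection of verbatim memorization is a prefix-to-suffix check: prompt the model with the start of a document and see whether greedy decoding reproduces the rest exactly. The sketch below illustrates that idea only; it is not taken from the paper. The `make_toy_model` lookup table is a hypothetical stand-in for an LLM, and all function names are assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: maps the tokens so far to the single most likely next token.
NextToken = Callable[[Sequence[str]], str]


def greedy_continue(next_token: NextToken, prefix: Sequence[str], n: int) -> List[str]:
    """Extend `prefix` by `n` tokens, always taking the model's top choice."""
    tokens = list(prefix)
    for _ in range(n):
        tokens.append(next_token(tokens))
    return tokens[len(prefix):]


def is_verbatim_extractable(
    next_token: NextToken, document: Sequence[str], prefix_len: int
) -> bool:
    """True if greedy decoding from the first `prefix_len` tokens reproduces the rest."""
    prefix, suffix = document[:prefix_len], document[prefix_len:]
    return greedy_continue(next_token, prefix, len(suffix)) == list(suffix)


def make_toy_model(training_docs: Sequence[Sequence[str]]) -> NextToken:
    """Toy stand-in for an LLM (an assumption of this sketch): it simply
    remembers which token followed each two-token context during 'training'."""
    table: Dict[Tuple[str, ...], str] = {}
    for doc in training_docs:
        for i in range(2, len(doc)):
            table[(doc[i - 2], doc[i - 1])] = doc[i]

    def next_token(tokens: Sequence[str]) -> str:
        # Unknown contexts yield a placeholder token.
        return table.get(tuple(tokens[-2:]), "<unk>")

    return next_token


training_doc = "the secret key is 1234".split()
unseen_doc = "the secret code is 9999".split()
model = make_toy_model([training_doc])

print(is_verbatim_extractable(model, training_doc, prefix_len=2))
print(is_verbatim_extractable(model, unseen_doc, prefix_len=2))
print(is_verbatim_extractable(model, training_doc, prefix_len=1))
```

With this toy model, the training document is reproduced from a two-token prefix but not from a one-token prefix, and the unseen document is never reproduced. The result depends on both the prompt length and the decoding rule, which is one reason behavior-based definitions of memorization are hard to pin down.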

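Experimental Results

Research Questions

RQ1: What different types of information do LLMs memorize, and how can each be defined and detected?
RQ2: What impact does each type of memorization have on model performance, privacy, security, copyright, and auditing?
RQ3: Given the challenges posed by prompting, decoding, and model behavior, how can memorization be measured, mitigated, and governed in practice?
RQ4: What open research directions emerge when memorization beyond verbatim text is considered?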

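Key Findings

Verbatim text memorization is common and ranges from entire documents to short sequences and paraphrases; the challenges of detecting and mitigating it are tied to decoding and prompting.
Memorized facts include world knowledge, domain knowledge, and personally identifiable information (PII); they can be studied using tuples, KaRR-style metrics, and counterfactual memorization, and they affect both knowledge accuracy and privacy.
Memorization of ideas, algorithms, and writing styles enables generalization and transfer, but also risks reproducing harmful content, detaching content from its context, or raising copyright concerns.
Memorization related to distributional properties of the training data and to alignment goals affects training effectiveness, bias, safety, and the potential leakage of human preferences and labels.
The paper stresses the difficulty of distinguishing memorization from reasoning and hallucination, and advocates auditing methods that can reveal dataset contamination and weaknesses in model safety.
It discusses a range of detection and prevention strategies, including deduplication, pruning, semantic-level training objectives, canaries in benchmarks, and post-processing safeguards.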
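
Of the prevention strategies listed, deduplication is the easiest to illustrate in a few lines. The sketch below drops near-duplicate documents by comparing word n-gram sets with Jaccard similarity. It is a minimal illustration under assumed settings (3-word shingles, a 0.8 threshold, whitespace tokenization), not the procedure used in the paper or in any specific training pipeline; the function names are invented for this example.

```python
from typing import List, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 3) -> Set[Shingle]:
    """Return the set of lowercase word n-grams in `text`."""
    words = text.lower().split()
    if len(words) < n:
        # Texts shorter than n words are represented by a single short shingle.
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs: List[str], threshold: float = 0.8, n: int = 3) -> List[str]:
    """Keep a document only if it is not too similar to any document kept so far."""
    kept: List[str] = []
    kept_shingles: List[Set[Shingle]] = []
    for doc in docs:
        current = shingles(doc, n)
        if all(jaccard(current, seen) < threshold for seen in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


corpus = [
    "The quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog today",
    "A completely different sentence about language models",
]
for doc in deduplicate(corpus):
    print(doc)
```

This pairwise comparison is quadratic in the number of documents, so it only suits small corpora. Large-scale pipelines typically turn to approximate or index-based techniques such as MinHash or suffix arrays.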


This review was generated by AI and checked by a human editor.