QUICK REVIEW

[Paper Review] The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks

Nicholas Carlini, Chang Liu|arXiv (Cornell University)|2018. 02. 22.

Privacy-Preserving Technologies in Data|Citations: 504

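One-Line Summary

This paper introduces an exposure-based testing methodology for quantifying and limiting unintended memorization of rare or secret training data in neural sequence models, and demonstrates its practicality on a real-world system, Google's Smart Compose.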

ABSTRACT

This paper describes a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models, a common type of machine-learning model. Because such models are sometimes trained on sensitive data (e.g., the text of users' private messages), this methodology can benefit privacy by allowing deep-learning practitioners to select means of training that minimize such memorization. In experiments, we show that unintended memorization is a persistent, hard-to-avoid issue that can have serious consequences. Specifically, for models trained without consideration of memorization, we describe new, efficient procedures that can extract unique, secret sequences, such as credit card numbers. We show that our testing strategy is a practical and easy-to-use first line of defense, e.g., by describing its application to quantitatively limit data exposure in Google's Smart Compose, a commercial text-completion neural network trained on millions of users' email messages.

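Motivation and Objectives

Develop a quantitative, exposure-based metric for assessing memorization of rare or secret training data in generative sequence models.
Provide a practical testing methodology that inserts canaries into the training data and measures their exposure in the trained model.
Demonstrate the methodology on a real-world system (e.g., Google's Smart Compose) to guide privacy-preserving training choices.
Investigate how memorization arises across models and training regimes, and evaluate simple defenses against differential privacy.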
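
The canary-insertion step can be illustrated with a minimal sketch. The format string, the nine-digit randomness space, the tiny corpus, and the helper names `make_canary` and `insert_canary` are all illustrative assumptions, not code or data from the paper.

```python
import random

# Hypothetical canary format: a fixed prefix followed by nine random digits.
CANARY_FORMAT = "The random number is {}"
NUM_SLOTS = 9
ALPHABET = "0123456789"

def make_canary(rng):
    """Fill the format's slots with symbols drawn uniformly from the alphabet."""
    secret = "".join(rng.choice(ALPHABET) for _ in range(NUM_SLOTS))
    return CANARY_FORMAT.format(secret), secret

def insert_canary(corpus, canary, times, rng):
    """Return a copy of the corpus with the canary inserted `times` times
    at random positions."""
    augmented = list(corpus)
    for _ in range(times):
        augmented.insert(rng.randrange(len(augmented) + 1), canary)
    return augmented

rng = random.Random(0)
corpus = ["hello world", "see you at noon", "thanks for the update"]
canary, secret = make_canary(rng)
augmented = insert_canary(corpus, canary, times=4, rng=rng)
```

Varying `times` across otherwise identical training runs is what would let a practitioner plot exposure against insertion frequency.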

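Proposed Method

Define log-perplexity as the measure of sequence likelihood, and compare the log-perplexity of an inserted canary with that of random sequences.
Introduce an exposure metric derived from the canary's rank (or guessing entropy) among candidate sequences ordered by model perplexity.
Insert format-based canaries into the training data and train the model under otherwise identical settings to measure the memorization effect.
Approximate exposure efficiently through sampling or distribution modeling (e.g., a skew-normal distribution) to estimate the rank-based exposure.
Provide a practical testing pipeline: augment the data with canaries, train, and report exposure curves as a function of canary insertion frequency.
Apply the methodology to a large production model (e.g., Smart Compose) to examine the trade-off between utility and privacy.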
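
As a rough illustration of the rank-based metric, the sketch below takes log-perplexity to be the sum of $-\log_2$ conditional probabilities of a sequence, takes the rank of a canary to be the number of candidates whose log-perplexity is no higher than its own, and computes exposure as $\log_2 |R| - \log_2 \text{rank}$, where $R$ is the space of candidate secrets. A smoothed character-bigram counter stands in for the neural network, and the `pin` format with three digit slots is a hypothetical choice that keeps exhaustive enumeration cheap. None of these names come from the paper.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "0123456789"
PREFIX = "pin "      # hypothetical canary format: "pin " followed by 3 digits
NUM_SLOTS = 3

def train_bigram(texts):
    """Count character bigrams; '^' marks the start of a sequence."""
    pair_counts, context_counts = Counter(), Counter()
    for text in texts:
        prev = "^"
        for ch in text:
            pair_counts[(prev, ch)] += 1
            context_counts[prev] += 1
            prev = ch
    return pair_counts, context_counts

def log_perplexity(model, text, vocab_size=128):
    """Sum of -log2 Pr(x_i | x_{i-1}) under add-one smoothing."""
    pair_counts, context_counts = model
    total, prev = 0.0, "^"
    for ch in text:
        p = (pair_counts[(prev, ch)] + 1) / (context_counts[prev] + vocab_size)
        total -= math.log2(p)
        prev = ch
    return total

def exposure(model, secret):
    """log2 |R| - log2 rank, with rank computed by enumerating every candidate."""
    candidates = ["".join(t) for t in product(ALPHABET, repeat=NUM_SLOTS)]
    target = log_perplexity(model, PREFIX + secret)
    rank = sum(1 for c in candidates
               if log_perplexity(model, PREFIX + c) <= target)
    return math.log2(len(candidates)) - math.log2(rank)

background = ["see you at noon", "thanks for the update"]
model_with = train_bigram(background + [PREFIX + "274"] * 5)
model_without = train_bigram(background)
exposed = exposure(model_with, "274")        # canary was in the training data
unexposed = exposure(model_without, "274")   # canary was never seen
```

In this toy setting the inserted canary ranks first among the 1,000 candidates, while the model trained without it cannot tell the candidates apart, so the canary ties with all of them. Exact enumeration only works for small candidate spaces, which is why the approximations listed above matter in practice.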

Figure 1: Results of our testing methodology applied to a state-of-the-art, word-level neural-network language model [35]. Two models are trained to near-identical accuracy using two different training strategies (hyperparameters A and B). The models differ significantly in how they memorize a ra

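Experimental Results

Research Questions

RQ1: Do neural networks memorize secrets that appear only rarely in the training data, as measured by the exposure metric?
RQ2: How does memorization vary with training regime, model size, and data distribution, and can it be mitigated without excessive loss of utility?
RQ3: Are simple regularizers (early stopping, dropout) sufficient to prevent unintended memorization, or are privacy-preserving training methods required?
RQ4: In practice, how effective is differential privacy at eliminating memorization compared with other defenses?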

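Key Findings

Unintended memorization of rare or secret training data is common and persistent across models and training strategies.
The exposure-based testing strategy quantifies memorization and reveals differences between training approaches that reach the same accuracy.
Canaries inserted into the training data are either efficiently extracted or show high exposure under a black-box query model.
Early stopping and dropout are not sufficient to prevent unintended memorization; differentially private training can eliminate it, but at a cost in utility.
Applied to Google's Smart Compose, the exposure metric guided privacy decisions and demonstrated that limiting data exposure is practical.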
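
To give a feel for how extraction can avoid enumerating every candidate, here is a minimal best-first search over partial secrets, ordered by cumulative log-perplexity. It is a sketch in the spirit of a shortest-path search and is not claimed to be the paper's exact procedure. The function `next_symbol_probs` is a hypothetical stand-in for querying a trained model, and the memorized secret `2741` is made up for the example.

```python
import heapq
import math

ALPHABET = "0123456789"
MEMORIZED = "2741"   # hypothetical secret that the toy model has memorized

def next_symbol_probs(prefix):
    """Toy stand-in for a model's next-symbol distribution given the digits so far."""
    if len(prefix) < len(MEMORIZED) and MEMORIZED.startswith(prefix):
        favored = MEMORIZED[len(prefix)]
        return {d: (0.55 if d == favored else 0.05) for d in ALPHABET}
    return {d: 0.1 for d in ALPHABET}

def extract(length):
    """Return the lowest log-perplexity secret of the given length,
    its cost in bits, and the number of model queries made."""
    frontier = [(0.0, "")]          # entries are (cumulative cost, partial secret)
    queries = 0
    while frontier:
        cost, prefix = heapq.heappop(frontier)
        if len(prefix) == length:
            # Costs never decrease along a path, so the first complete
            # secret popped from the heap has the lowest total cost.
            return prefix, cost, queries
        queries += 1
        for symbol, p in next_symbol_probs(prefix).items():
            heapq.heappush(frontier, (cost - math.log2(p), prefix + symbol))
    raise ValueError("empty search space")

secret, cost, queries = extract(length=4)
```

With this toy distribution the search expands only the prefixes along the memorized path, so the number of queries grows with the secret's length and not with the size of the candidate space.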

Figure 3: Skew normal fit to the measured perplexity distribution. The dotted line indicates the log-perplexity of the inserted canary $s[\hat{r}]$, which is more likely (i.e., has lower perplexity) than any other candidate canary $s[r^{\prime}]$.
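
The distribution-modeling approximation in this figure can be sketched as follows, assuming SciPy is available. The idea is to fit a skew-normal density to the log-perplexities of randomly sampled candidates and to estimate exposure as $-\log_2$ of the fitted probability mass at or below the canary's log-perplexity. The helper name `approximate_exposure` and the synthetic samples are assumptions made for illustration; they are not measurements from the paper.

```python
import numpy as np
from scipy.stats import skewnorm

def approximate_exposure(canary_log_perplexity, sampled_log_perplexities):
    """Estimate exposure as -log2 of the fitted CDF at the canary's log-perplexity."""
    # Fit a skew-normal density to log-perplexities of sampled candidates.
    shape, loc, scale = skewnorm.fit(sampled_log_perplexities)
    # Work with the log CDF, since the canary usually sits far in the left tail.
    log_cdf = skewnorm.logcdf(canary_log_perplexity, shape, loc=loc, scale=scale)
    return float(-log_cdf / np.log(2.0))

# Synthetic stand-in for measured log-perplexities (not data from the paper).
samples = skewnorm.rvs(4.0, loc=100.0, scale=10.0, size=2000, random_state=0)
typical = approximate_exposure(np.median(samples), samples)
unusual = approximate_exposure(np.percentile(samples, 1), samples)
```

A candidate at the median of the sampled distribution should come out near 1 bit of exposure, while one deep in the left tail gets a much larger estimate. The estimate is only as good as the fit, so it is worth checking the fitted curve against a histogram of the samples before relying on it.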

This review was generated by AI and reviewed by a human editor.