QUICK REVIEW

[Paper Review] Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12

Tang Jun, Aleksandra Korolova|arXiv (Cornell University)|2017. 09. 08.

Privacy-Preserving Technologies in Data|References: 11|Citations: 186

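One-Line Summary

This paper analyzes Apple's deployment of local differential privacy on MacOS 10.12 and identifies the per-datum privacy parameters, but shows that, because of how the budget is managed and its two-level structure, the total privacy loss per day is far larger than the per-datum values suggest.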

ABSTRACT

In June 2016, Apple announced that it will deploy differential privacy for some user data collection in order to ensure privacy of user data, even from Apple. The details of Apple's approach remained sparse. Although several patents have since appeared hinting at the algorithms that may be used to achieve differential privacy, they did not include a precise explanation of the approach taken to privacy parameter choice. Such choice and the overall approach to privacy budget use and management are key questions for understanding the privacy protections provided by any deployment of differential privacy. In this work, through a combination of experiments, static and dynamic code analysis of macOS Sierra (Version 10.12) implementation, we shed light on the choices Apple made for privacy budget management. We discover and describe Apple's set-up for differentially private data processing, including the overall data pipeline, the parameters used for differentially private perturbation of each piece of data, and the frequency with which such data is sent to Apple's servers. We find that although Apple's deployment ensures that the (differential) privacy loss per each datum submitted to its servers is $1$ or $2$, the overall privacy loss permitted by the system is significantly higher, as high as $16$ per day for the four initially announced applications of Emojis, New words, Deeplinks and Lookup Hints. Furthermore, Apple renews the privacy budget available every day, which leads to a possible privacy loss of 16 times the number of days since user opt-in to differentially private data collection for those four applications. We advocate that in order to claim the full benefits of differentially private data collection, Apple must give full transparency of its implementation, enable user choice in areas related to privacy loss, and set meaningful defaults on the privacy loss permitted.
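
The abstract's arithmetic can be written out as a short worked bound. This is only a restatement of the figures quoted above: the symbol $d$ (number of days since opt-in) and the name $\varepsilon_{\text{total}}$ are introduced here for illustration, and the 30-day figure is an example, not a number from the paper.

```latex
% Permitted daily loss for the four initial applications: up to 16.
% The budget is renewed every day, so the permitted losses add up across days.
\varepsilon_{\text{total}}(d) \;\le\; 16 \cdot d,
\qquad \text{e.g. } d = 30 \;\Rightarrow\; \varepsilon_{\text{total}} \le 480 .
```

The bound grows linearly in $d$, which is why the per-datum values of $1$ or $2$ say little about long-term exposure.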

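Research Motivation and Goals

Understand how Apple implements local differential privacy on MacOS 10.12.
Identify the privacy budget management in use and the per-datum privacy parameters.
Assess how often data is reported and how privacy loss accumulates over time.
Evaluate the transparency and configurability of the privacy budget system, and its potential abuse vectors.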

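Proposed Method

Static and dynamic code analysis of the macOS Sierra (10.12) implementation.
Decompiling and tracing DifferentialPrivacy.framework and the dprivacyd daemon.
Inspecting database tables, configuration files, and report files to map the privacy budget and the data flow.
Experimentally manipulating configuration parameters to observe the effect on privacy parameters and budget behavior.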

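Experimental Results

Research Questions

RQ1: What per-datum privacy parameter is used for each data type at privatization?
RQ2: How often are records selected for reporting, and what are the maximum privacy loss per report and the maximum loss per day?
RQ3: Is the total privacy loss per device bounded over time, or unbounded?
RQ4: How resistant is the system to manipulation of parameters and timing, and what are the abuse vectors?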

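Key Findings

Per-datum privacy parameter values are defined per data type (e.g., Emoji, New words) and match the values in the configuration files.
The system controls budget growth through a budget balance (ZBALANCE) per BudgetKeyName and a SessionAmount that is credited every SessionSeconds.
The report generator selects at most min(SessionAmount, 40) records per KeyName, further constrained by the available budget balance, so the daily privacy loss for an active data type is PrivacyParameter × SessionAmount.
The privacy budget is replenished every SessionSeconds and unused budget carries over, so for the initial four applications the total privacy loss can grow without bound over time.
The permitted daily privacy loss for the initial four applications can reach 16, and because the budget is replenished, the device-wide loss can keep growing with the time elapsed since the opt-in date.
The implementation includes safeguards (hard-coded limits, configuration that is difficult to change), but abuse remains possible if the budget or parameters are changed through root access or by a future Apple update.
The configurations differ between macOS 10.12.1 and 10.12.3, notably an increased SessionAmount for NewWords, added budgets for health and local words, and the addition of SubmissionPriority.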
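
The budget mechanics in the findings above can be sketched as a small simulation. This is a minimal illustration and not Apple's code. The class and function names are hypothetical, and it assumes one report per session, one budget unit per submitted record, and made-up parameter values. Only the min(SessionAmount, 40) cap, the per-session replenishment with carry-over, and the PrivacyParameter × SessionAmount loss come from the review.

```python
from dataclasses import dataclass

REPORT_CAP = 40  # per-KeyName cap on records per report, as stated in the findings


@dataclass
class BudgetKey:
    """Hypothetical model of one BudgetKeyName row."""
    privacy_parameter: float  # per-datum privacy parameter
    session_amount: int       # budget units credited every SessionSeconds
    balance: int = 0          # plays the role of ZBALANCE

    def replenish(self) -> None:
        # Unused balance is kept, so budget carries over between sessions.
        self.balance += self.session_amount

    def select_for_report(self, pending_records: int) -> int:
        # Assumption: each submitted record costs one budget unit.
        n = min(pending_records, self.session_amount, REPORT_CAP, self.balance)
        self.balance -= n
        return n


def simulate(key: BudgetKey, pending_per_session: list[int]) -> float:
    """Return the accumulated privacy loss over the given sessions."""
    total_loss = 0.0
    for pending in pending_per_session:
        key.replenish()
        sent = key.select_for_report(pending)
        total_loss += sent * key.privacy_parameter
    return total_loss


if __name__ == "__main__":
    # Illustrative values only: parameter 2, SessionAmount 2, three busy sessions.
    key = BudgetKey(privacy_parameter=2.0, session_amount=2)
    print(simulate(key, [10, 10, 10]))
```

When a data type has enough pending records every session, the loss per session is PrivacyParameter × SessionAmount, and the running total grows linearly with the number of sessions because nothing caps the sum across sessions.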


This review was generated by AI and reviewed by a human editor.