QUICK REVIEW

[Paper Review] Label Leakage and Protection in Two-party Split Learning

Oscar Li, Jiankai Sun|arXiv (Cornell University)|2021. 02. 17.

Privacy-Preserving Technologies in Data | References: 34 | Citations: 64

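One-line summary

This paper shows that in two-party split learning the non-label party can recover private labels from the backward gradients, and it introduces Marvell, an optimized perturbation method that prevents this leakage while preserving utility.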

ABSTRACT

Two-party split learning is a popular technique for learning a model across feature-partitioned data. In this work, we explore whether it is possible for one party to steal the private label information from the other party during split training, and whether there are methods that can protect against such attacks. Specifically, we first formulate a realistic threat model and propose a privacy loss metric to quantify label leakage in split learning. We then show that there exist two simple yet effective methods within the threat model that can allow one party to accurately recover private ground-truth labels owned by the other party. To combat these attacks, we propose several random perturbation techniques, including $\texttt{Marvell}$, an approach that strategically finds the structure of the noise perturbation by minimizing the amount of label leakage (measured through our quantification metric) of a worst-case adversary. We empirically demonstrate the effectiveness of our protection techniques against the identified attacks, and show that $\texttt{Marvell}$ in particular has improved privacy-utility tradeoffs relative to baseline approaches.

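Motivation and objectives

Formalize a threat model for label leakage in two-party split learning under binary classification.
Propose a privacy quantification metric (leak AUC) to measure label leakage from the backward gradients.
Demonstrate realistic attacks that recover the true labels from the gradients.
Develop and evaluate leakage-minimizing protection techniques, including Marvell, while preserving model performance.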

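Proposed method

Define a threat model for per-example, gradient-based leakage in vertical split learning.
Introduce leak AUC as a metric that quantifies an adversary's ability to recover labels from the cut-layer gradients.
Demonstrate two practical attacks: a norm-based leakage strategy and a direction-based (cosine) leakage strategy.
Propose random perturbation defenses: max_norm (heuristic) and Marvell (principled optimization).
Marvell solves a constrained optimization that minimizes the worst-case AUC under a power constraint on the gradient noise.
Assume Gaussian perturbations with class-dependent covariance and reduce the covariance structure to a four-parameter problem.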
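
The two attacks and the leak AUC metric listed above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the toy model (a linear head `w` on top of the cut layer, logistic loss, roughly 10% positives, a model biased toward the negative class) and all function names are hypothetical, and the direction attack here assumes the attacker knows the gradient of one positive example. Leak AUC is computed as the AUC of the attacker's per-example score against the true labels.

```python
import math
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))


def toy_cut_layer_gradients(n=400, dim=8, seed=0):
    """Simulate per-example cut-layer gradients g = (p - y) * w (toy assumption)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # hypothetical linear head
    grads, labels = [], []
    for i in range(n):
        y = 1 if i % 10 == 0 else 0  # every 10th example is positive
        logit = rng.gauss(-2.0 + 1.5 * y, 1.0)  # model leans toward "negative"
        p = 1.0 / (1.0 + math.exp(-logit))
        grads.append([(p - y) * wi for wi in w])
        labels.append(y)
    return grads, labels


def norm_attack_scores(grads):
    """Norm-based attack: score each example by its gradient norm."""
    return [norm(g) for g in grads]


def direction_attack_scores(grads, known_positive_grad):
    """Direction-based attack: cosine similarity to a known positive's gradient."""
    return [cosine(g, known_positive_grad) for g in grads]


if __name__ == "__main__":
    grads, labels = toy_cut_layer_gradients()
    print("norm leak AUC:", auc(norm_attack_scores(grads), labels))
    # Example 0 is positive by construction in this toy data.
    print("direction leak AUC:", auc(direction_attack_scores(grads, grads[0]), labels))
```

In this toy setup all gradients are collinear with `w`, and the sign of `p - y` differs between the classes, so the cosine score separates the classes perfectly. Real cut-layer gradients are noisier, so the numbers here only show how the metric is computed.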

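Experimental results

Research questions

RQ1: Can the backward cut-layer gradients in two-party split learning reveal private labels?
RQ2: How can label leakage in this setting be quantified and compared?
RQ3: Which defenses reduce leakage effectively without harming model utility?
RQ4: Is there an optimized perturbation strategy that provides robust, principled privacy protection against a potentially malicious attacker?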

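Key results

Two simple, realistic attacks recover the private labels from the cut-layer gradients, yielding high leak AUC values.
Marvell substantially reduces leakage and achieves a more favorable privacy-utility tradeoff than the baselines.
Gradient leakage persists at earlier layers (before the cut layer) and is also mitigated by Marvell.
Marvell minimizes the worst-case AUC by optimizing class-dependent Gaussian noise under a total noise budget.
Empirical evaluation on Criteo, Avazu, and ISIC demonstrates protection against the norm and cosine attacks.
Compared with the baselines, Marvell provides stronger privacy while preserving model performance.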
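
To make the idea of class-dependent Gaussian noise under a noise budget concrete, the sketch below perturbs toy gradients before they would be sent to the non-label party and re-measures the norm-based leak AUC. It is not the Marvell optimization: the noise scales come from a simple moment-matching heuristic that equalizes the mean squared gradient norm of the two classes. The budget definition (class-weighted expected noise power), the toy gradient model, and every function name are assumptions made for illustration.

```python
import math
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def toy_gradients(n=400, dim=8, seed=0):
    """Toy cut-layer gradients g = (p - y) * w with a fixed w = 0.5 in every coordinate."""
    rng = random.Random(seed)
    grads, labels = [], []
    for i in range(n):
        y = 1 if i % 10 == 0 else 0  # every 10th example is positive
        logit = rng.gauss(-2.0 + 1.5 * y, 1.0)
        p = 1.0 / (1.0 + math.exp(-logit))
        grads.append([(p - y) * 0.5] * dim)
        labels.append(y)
    return grads, labels


def class_noise_scales(grads, labels, budget):
    """Heuristic (not Marvell): per-class isotropic noise scales that equalize the
    mean squared norm of both classes while spending exactly `budget` of
    class-weighted expected noise power."""
    dim = len(grads[0])
    pos_sq = [norm(g) ** 2 for g, y in zip(grads, labels) if y == 1]
    neg_sq = [norm(g) ** 2 for g, y in zip(grads, labels) if y == 0]
    p = len(pos_sq) / len(labels)
    gap = sum(pos_sq) / len(pos_sq) - sum(neg_sq) / len(neg_sq)
    power_pos = budget - (1.0 - p) * gap  # equals dim * sigma_pos ** 2
    power_neg = power_pos + gap           # equals dim * sigma_neg ** 2
    if min(power_pos, power_neg) < 0:
        raise ValueError("budget too small for moment matching")
    return math.sqrt(power_pos / dim), math.sqrt(power_neg / dim)


def perturb(grads, labels, sigma_pos, sigma_neg, seed=1):
    """Add zero-mean Gaussian noise whose scale depends on the example's class."""
    rng = random.Random(seed)
    noisy = []
    for g, y in zip(grads, labels):
        sigma = sigma_pos if y == 1 else sigma_neg
        noisy.append([x + rng.gauss(0.0, sigma) for x in g])
    return noisy


if __name__ == "__main__":
    grads, labels = toy_gradients()
    s_pos, s_neg = class_noise_scales(grads, labels, budget=8.0)
    noisy = perturb(grads, labels, s_pos, s_neg)
    print("norm leak AUC before:", auc([norm(g) for g in grads], labels))
    print("norm leak AUC after: ", auc([norm(g) for g in noisy], labels))
```

Because the noise is zero-mean, the perturbed gradient remains an unbiased estimate of the original, which is why this kind of defense can preserve utility at moderate budgets. The heuristic only targets the norm-based score; it does nothing principled about the direction-based attack, which is the gap an optimized noise structure is meant to close.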

This review was generated by AI and reviewed by a human editor.