QUICK REVIEW

[Paper Review] A generic framework for privacy preserving deep learning

Théo Ryffel, Andrew Trask | arXiv (Cornell University) | Nov 9, 2018

Cryptography and Data Security | 2 references | 341 citations

TL;DR

The paper presents PySyft-based abstractions that bring privacy-preserving deep learning (Federated Learning, Secure Multiparty Computation, and Differential Privacy) to a PyTorch-like API. It introduces a tensor chain model, SPDZ-based MPC, and a DP integration, and reports early experimental results on the Boston Housing and Pima Indian Diabetes datasets.

ABSTRACT

We detail a new framework for privacy preserving deep learning and discuss its assets. The framework puts a premium on ownership and secure processing of data and introduces a valuable representation based on chains of commands and tensors. This abstraction allows one to implement complex privacy preserving constructs such as Federated Learning, Secure Multiparty Computation, and Differential Privacy while still exposing a familiar deep learning API to the end-user. We report early results on the Boston Housing and Pima Indian Diabetes datasets. While the privacy features apart from Differential Privacy do not impact the prediction accuracy, the current implementation of the framework introduces a significant overhead in performance, which will be addressed at a later stage of the development. We believe this work is an important milestone introducing the first reliable, general framework for privacy preserving deep learning.

Motivation & Objective

Introduce a standardized protocol to enable Federated Learning between workers.
Propose a chain abstraction (SyftTensor) for tensors to support privacy-preserving operations.
Implement MPC (SPDZ) and Differential Privacy within the framework.
Provide an extensible architecture where new FL, MPC, or DP methods can be plugged in.
Demonstrate feasibility with preliminary experiments on standard datasets.
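
The worker protocol and chain abstraction listed above can be pictured with a small sketch. This is not PySyft's code: `VirtualWorker`, `Pointer`, `send`, `register`, `execute`, and `fetch` are hypothetical names chosen for illustration, and plain Python lists stand in for tensors. It only shows the general idea of a local handle that forwards commands to the worker holding the data.

```python
class VirtualWorker:
    """Hypothetical in-process worker that stores data by id."""

    def __init__(self, name):
        self.name = name
        self._store = {}
        self._next_id = 0

    def register(self, data):
        # Store the data and hand back an id that pointers can refer to.
        obj_id = self._next_id
        self._next_id += 1
        self._store[obj_id] = data
        return obj_id

    def execute(self, op, left_id, right_id):
        # Run a command on data the worker owns; the result stays on the worker.
        left, right = self._store[left_id], self._store[right_id]
        if op == "add":
            result = [x + y for x, y in zip(left, right)]
        else:
            raise ValueError(f"unsupported op: {op}")
        return self.register(result)

    def fetch(self, obj_id):
        # Release the data to the caller and forget it locally.
        return self._store.pop(obj_id)


class Pointer:
    """Local handle to data that lives on a worker."""

    def __init__(self, worker, obj_id):
        self.worker = worker
        self.obj_id = obj_id

    def __add__(self, other):
        if other.worker is not self.worker:
            raise ValueError("operands live on different workers")
        new_id = self.worker.execute("add", self.obj_id, other.obj_id)
        return Pointer(self.worker, new_id)

    def get(self):
        return self.worker.fetch(self.obj_id)


def send(data, worker):
    # Copy the data to the worker and keep only a pointer locally.
    return Pointer(worker, worker.register(list(data)))


bob = VirtualWorker("bob")
x = send([1, 2, 3], bob)
y = send([10, 20, 30], bob)
z = x + y          # computed on bob; z is still a pointer
print(z.get())     # [11, 22, 33]
```

The caller writes ordinary-looking arithmetic, while the data only leaves the worker on an explicit `get()`.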

Proposed method

Define a tensor chain abstraction (SyftTensor) with LocalTensor and PointerTensor to enable chained operations and remote data sharing.
Implement virtual and networked workers (Virtual, Socket, WebSocket) to simulate and deploy FL scenarios.
Develop an MPCTensor that uses the SPDZ protocol, with fixed-precision encoding for MPC compatibility.
Incorporate Differential Privacy with gradient clipping and Gaussian noise using a federated learning-aware SGD procedure.
Use a FixedPrecisionTensor to adapt floating point data for integer-based MPC computations.
Provide DP accounting and sanitization compatible with existing privacy analyses (per referenced work).
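
The gradient clipping and Gaussian noise step can be sketched as follows. This is a minimal illustration in the style of DP-SGD, not the framework's implementation: the function names, the `noise_multiplier` parameter, and the use of plain lists as gradients are assumptions, and no privacy accounting is performed here.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def sanitize(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is tied to the clipping bound (assumed convention).
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(clipped) for n in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]   # two per-example gradients
print(clip_gradient(grads[0], 1.0))
print(sanitize(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the update, which is what lets the added noise translate into a privacy guarantee.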

Experimental results

Research questions

RQ1: Can a unified tensor-chain framework support privacy-preserving techniques (FL, MPC, DP) with a PyTorch-like API?
RQ2: What are the performance and accuracy trade-offs when applying MPC and DP in federated learning within this framework?
RQ3: How can virtual workers and different network backends (Socket, WebSocket) facilitate debugging and browser-based experimentation for FL?
RQ4: How effective are SPDZ-based MPC and Gaussian-noise DP in preserving privacy while maintaining model quality on standard datasets?

Key findings

The framework enables privacy-preserving deep learning techniques (FL, MPC, DP) within PyTorch workflows.
Training with DP incurs overhead (DP: 15.3 s vs. 10.1 s for non-DP in Boston Housing), and accuracy/mean-squared error are affected by privacy parameters.
Experiments on Boston Housing and Pima Indian Diabetes show that DP training can reach an (ε, δ) = (0.5, 1e-5) privacy guarantee, with MSE around 29-30 on Boston Housing and accuracy around 60-70% on Pima across settings.
WebSocket-based and Virtual Workers offer acceptable overheads, validating notebook-based experimentation for FL.
SPDZ-based MPC supports secure shares and fixed-precision encoding, enabling MPC-enabled neural computation within the framework.
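
To make the combination of secret shares and fixed-precision encoding concrete, here is a minimal sketch of additive secret sharing over integers. It is not SPDZ itself: the modulus `Q`, the precision settings, and all function names are illustrative assumptions, only addition is shown, and the multiplication triples and integrity checks that SPDZ relies on are omitted.

```python
import random

Q = 2**61 - 1            # illustrative modulus (assumption)
BASE, PRECISION = 10, 3  # keep 3 decimal digits (assumption)
SCALE = BASE ** PRECISION


def encode(x):
    """Map a float to a non-negative integer modulo Q."""
    return round(x * SCALE) % Q


def decode(v):
    """Map an integer modulo Q back to a float; the upper half encodes negatives."""
    if v > Q // 2:
        v -= Q
    return v / SCALE


def share(secret, n_parties, rng):
    """Split secret into n_parties additive shares that sum to it modulo Q."""
    shares = [rng.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares


def reconstruct(shares):
    return sum(shares) % Q


def add_shares(a, b):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % Q for x, y in zip(a, b)]


rng = random.Random(42)
a = share(encode(1.5), 3, rng)
b = share(encode(-0.25), 3, rng)
total = decode(reconstruct(add_shares(a, b)))
print(total)  # 1.25
```

Each individual share is a uniformly random value and reveals nothing about the input on its own; the fixed-precision step is what lets floating-point model values live in this integer arithmetic.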

This review was created by AI and reviewed by human editors.